

Charmaine McClarie, McClarie Group | Women Transforming Technology


 

>> From around the globe, it's theCUBE, with digital coverage of Women Transforming Technology, brought to you by VMware. >> Hi, this is Lisa Martin, covering the fifth annual Women Transforming Technology, WT2, from my home in San Jose, California, because this is the first year that WT2 has gone digital. Very excited to welcome next one of the speakers from the executive track: Charmaine McClarie, president of the McClarie Group, who is also an author, C-suite advisor and keynote speaker. Charmaine, nice to start with you. >> An absolute pleasure. Thank you for having me. >> So you have an incredible background. You have been working with leaders for two decades, and I read 27 industries, five continents, and some pretty big, well-known brands: Coca-Cola, Johnson & Johnson, my particular favorite, Starbucks. Tell me a little bit about your background and your career, and how you came to be working with potential leaders. >> Early on in my career, I was working in politics, actually helping politicians understand their constituency and then how to communicate effectively with them, and then I went on into marketing. And really, what I say is that what I do is a conglomeration of all of my life experience working with leaders, either in politics or in marketing and sales, across a variety of industries, including gas and oil, coming together and helping them understand how to communicate their message effectively, how to have executive presence, and ensuring that they're seen, heard and remembered. >> One of the things, talking about being remembered, especially now, during a crisis that nobody has ever experienced before, when there is so much concern and so much uncertainty: I read that you said effective communication is more than just words and phrases, especially in today's climate. What is effective communication? >> Effective communication is making sure that people hear your value, your value proposition, and that is really essential today. One of the things you want to do is elevate your visibility, and elevate the value that you bring to your organization. There are a number of competing priorities, and what you want your organization to understand is: what is it that you see that others don't see? That is a part of your value proposition. How are you going to help the organization innovate through this time? And what you want to do is really speak about what the value is, what is going to make the difference for the organization today, with this crisis, and what will also take it further into the future. >> Tell me a little bit about the session that you did at Women Transforming Technology the other day, a 35-minute interactive session, since everything for this year's event was digital. I love the name of your session: Speak Up, Stand Out. Talk a little bit about when you first learned, maybe last month, that this event was going digital. Did you change anything? Were there certain elements of your expertise and your recommendations that are now even more important, with respect to visibility and value? >> Yes. What I changed was, I really wanted to make it as conversational as possible, because in this isolation it's easy to not feel seen or heard, and I want people to be able to elevate, again, their visibility and their ability to add value.
So a couple of things that people can do is, they can actually rewrite their narrative if they need to. Meaning, I believe that if you do not define yourself, others will, and their definition will inevitably be inadequate. So if you know that you are seen as a very quiet person, a person that is in the background, and you want to have greater visibility, this is a great opportunity for you to rewrite that narrative and make yourself more visible. Meaning, the expertise that you have, the insight that you have, making sure that you bring that to the table. You can do it in a number of formats: you can do it not only on a Zoom call with your colleagues, but also in your email, which is heightened if you're using the language of leadership, language that really hits people's ear and creates a visual. So what you want to do is really make sure you're using language that is very vivid and allows a person to touch, taste and feel what it is that you're saying. That's one of the things you can do. The other is, I want to make sure that my clients are not well-kept secrets. I want to make sure that in this time of isolation they're finding opportunities to reach out. Most everyone is at home sheltering in place, so people have more time on their hands in terms of reading your emails. Researchers have found that there is a 26% increase in, say, your newsletters being read, your emails being read, so now is the time that you can actually heighten that kind of communication. >> That's fascinating, what you said about making what you're communicating in an email, maybe it's even a text or something like Slack, vivid. Say somebody has a great idea, thinking, all right, terms have changed, my job function is different, or it's challenging to complete certain projects. So now you're saying people are actually focusing more on reading what you're saying. What are some vivid words that I could use if I had an interesting finance project or a marketing project that I wanted to raise the visibility of, and get them to really feel what I'm talking about? >> So when you speak about a finance project, one of the things you want to do is think about what story could articulate those numbers, what can tell the story with those numbers. So let's just make it as simple as possible: two plus two equals four. What you want to think about is, what is going to be different when you've finished this project, or what is going to shift in the marketplace? And so you want to create that visual: what does the future look like? And use examples of things that are very basic to our life today, as opposed to using really complicated language. Now is the time to have your language simple, very clear and very vivid. So you-- >> Go ahead. Sorry. >> No, please, go right ahead. >> I'm glad that you brought up simplicity, because so often, I think, people think, maybe I'm managing a project or I'm creating a methodology, and really, it's just that simple. But we often second-guess ourselves, because, and I include myself in this, a lot of folks think it can't be that simple, it's got to be more complex, I need to show, you know, like an episode of, I'm picturing The Big Bang Theory, and Sheldon's talking about string theory: you need to make it complex to show your value.
But sometimes it's the simplest methodology, the simplest way of communicating, that is the most effective. Do you find that sometimes folks, regardless of their level of executive nous, are challenged to really step back and look at the simple way to communicate, the simple answer? >> Absolutely. And simplicity is best, whether it's during this time period or even beyond this pandemic, but particularly now. I don't know if anybody's ever seen the show The Marvelous Mrs. Maisel; I think it's amazing. In the first episode, one of the things she asks her husband is, well, honey, what do you do? And he says, you know, I sign papers, I do this, I do that, and then he says, I really don't know what the hell I do. And I remember an incident with one of my clients. I asked her, what do you do? She gave me her job title, and I said, okay, how many people work in your company? And she said, 49,000 people work here. I said, how many people do you think have the same title as you? And she goes, well, you know, I'm sure at least a couple thousand. I said, yes. So what distinguishes you? And so she wanted to talk about the title, which is like talking about acronyms at a company. And I said, so, really, what do you do? What we realized is that she was responsible for the fastest-growing market segment in her company. That articulates her value proposition; that made it very vivid and brought it to life, so people are able to understand. When someone asks me what I do, I don't say that I'm an executive coach, because you may have read an article last week about executive coaches, and then that defines me. I want to define myself. My value is: I help smart people get promoted, and when they get promoted, they communicate the big picture. So, I help smart people get promoted and communicate the big picture. I provide executive coaching to senior-level executives. I've articulated my value. You know who I work with: smart people. If they're not smart, they're not working with me. When they work with me, they get promoted. Why? Because they communicate the big picture. Really simple, one sentence. So what is the value? That is what really heightens your visibility and levels up your ability to be seen and heard in organizations. >> And, you know, I was looking at your website: you've had a 98% success rate of folks that have worked with you being promoted within the following 18 months. What are some of the hard and soft skills that you're looking for when you select clients to work with, that demonstrate they are ready to be in the C-suite? >> Well, there's a couple of things. One is that the person has to be open and willing, and not been volunteered by the organization, meaning someone saying, you need to do this, you have to do this. If it is mandatory that someone work with an executive coach, that's not a winning proposition. The winning proposition is that the person is open, open to change, and ready to make change. As I say to my clients, if you want everything to remain the same, I am not the coach for you, because you're going to see change, and you're going to see significant change. So that's one. The other is preparing your organization for the kind of change that's going to take place, so that your organization begins to see and hear what you're doing differently.
So, for example, I would say to a client, if you're prepared to really step up and make the commitment to making the shift, you want to let people know what kind of shift you're making, so that they can begin to look for it. People like to look for success. They like to be able to reward you when you're successful, but you've got to let them know that you're there for that shift. So that's one of the things that's really important: that people be open to it, and that they be ready to take their spotlight. If you want to remain behind the curtain, that's wonderful, but this is not the work for you. >> It requires a little bit of vulnerability, or maybe a lot of vulnerability, to be able to do that. It's not easy, unless you're a born ham like I am. Talk to me about, especially in this time with COVID-19, the uncertainty in every aspect of our lives, every single aspect, it's intense and it's an emotional challenge. Do you find that it's harder for some folks, whether they're men or women, to do what your title says, you know, speak up, let them know I'm coming, I'm on my way? How are you advising folks, from a psychological perspective, to be able to do this? >> Well, I think there's a couple of things. There are three questions I ask every client, and those three questions are: one, how do you see yourself? Two, how do other people see you? And the third is, how do you want to be seen? So when you're able to become introspective and answer those questions from your heart, then you can get really clear about what you want the world to know about you and how you want to show up. And it does require vulnerability. It requires you to look inward first, for you to make that decision on how you want the world to see you. And once you're able to get that clarity, and it's a process getting that clarity, then you can speak about that to the world. My thing is, again: if you don't define yourself, others will, and their definition is inadequate. So when you define yourself, you know who you are and what you stand for. You can then shout that at the top of your lungs, but you don't really have to, because your actions will speak very clearly about what it is and who you say you are and how you want the world to see you. And you're always asking: am I congruent? >> I love that, about defining yourself so that others don't do it incorrectly. Talk to me about how somebody can develop their own communication style. What are some of the steps they need to take? For example, if someone is too bold or too brusque, or maybe needs to dial it back a bit, especially because messages are getting read more now, what is that process, internally, that I would need to go through to develop an effective communication style? >> What you need to do to develop that effective communication style, one, as I said, is being able to define what that looks like for you, and what that is may not be appropriate for every organization and every corporate culture. So you need to either evaluate whether or not you're in the right corporate culture, so that you can be successful, or find a new one so that you can be successful. Once you have that, it's really about helping the people in your organization, making it easy for them to come to you.
So, by extending yourself first: that is one of the things that I would say is really important in terms of stepping up during this time frame. Let's say someone has felt really shaken by this, really shaken, but is determined to leverage this as an opportunity to really show up as their best self and show their greatest humanity. I think that when we let people know where we're going and where we're headed, it's far easier for people to support you and provide you with the venues in which to exhibit who you are. This is a great time for you to volunteer as much as possible, to have that visibility. Because, I think, one of the questions you asked me earlier is, how do you become comfortable with this? You get comfortable with it by practicing. Lady Gaga says we're born that way, but we're not: the only way it happens with people who are really successful is because they practice. >> Something that is so interesting during this time in particular is accountability, right? It's so easy right now, more than ever, to lose accountability. And I like what you said, that's what I'm hearing when you say, let people know the direction you're going in. I think for that person, you've set that: okay, I publicly said this, I need to hold myself accountable so that I deliver. I think there's a lot of power in that. >> There is. And when you step up and articulate to the world what you're about, what it is that you're going to deliver, your level of excellence, you hold yourself accountable, because the person who is most important for you to be accountable to is yourself. Others come second, actually, sort of like putting on your own mask first on the airplane. You've got to do it for you first, because if you let yourself down, that is the most horrific thing. And so there's so much power in stepping up to that. And I believe that people give you a lot of grace when you do that, and people know they can count on you. >> And that's so important, demonstrating your dependability in any situation. Charmaine, I wish we had more time. It's been such a pleasure talking to you. Thank you for sharing your insight. I'm going to be visible, show value, be vivid in my communication, and be accountable. Thank you so much for joining me. >> Have a wonderful day. >> You as well. And for Charmaine McClarie, I'm Lisa Martin. You're watching theCUBE's coverage of the digital version of Women Transforming Technology 2020. For now...

Published Date : May 14 2020

SUMMARY :

Lisa Martin covers the fifth annual Women Transforming Technology (WT2), the event's first digital edition, in conversation with Charmaine McClarie, president of the McClarie Group and a C-suite advisor with two decades of experience across 27 industries and five continents. McClarie argues that effective communication means making sure people hear your value proposition, and that during the COVID-19 crisis leaders should elevate their visibility, rewrite their own narrative before others define them, use simple and vivid language, prepare their organizations for the changes they intend to make, and hold themselves accountable for what they publicly commit to deliver.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Charmaine | PERSON | 0.99+
Charmaine McClarie | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
Starbucks | ORGANIZATION | 0.99+
Coca-Cola | ORGANIZATION | 0.99+
98% | QUANTITY | 0.99+
three questions | QUANTITY | 0.99+
35 minutes | QUANTITY | 0.99+
26% | QUANTITY | 0.99+
Two | QUANTITY | 0.99+
Lady Gaga | PERSON | 0.99+
last week | DATE | 0.99+
One | QUANTITY | 0.99+
one | QUANTITY | 0.99+
49,000 people | QUANTITY | 0.99+
Johnson & Johnson | ORGANIZATION | 0.99+
McClarie Group | ORGANIZATION | 0.99+
last month | DATE | 0.99+
San Jose, California | LOCATION | 0.99+
third | QUANTITY | 0.99+
2020 | DATE | 0.99+
Sheldon | PERSON | 0.99+
two decades | QUANTITY | 0.99+
two | QUANTITY | 0.99+
five continents | QUANTITY | 0.99+
first episode | QUANTITY | 0.99+
27 industries | QUANTITY | 0.99+
both | QUANTITY | 0.99+
four | QUANTITY | 0.99+
first year | QUANTITY | 0.99+
single | QUANTITY | 0.98+
May | DATE | 0.98+
first | QUANTITY | 0.98+
second | QUANTITY | 0.98+
today | DATE | 0.97+
VMware | ORGANIZATION | 0.93+
this year | DATE | 0.92+
fifth annual Women Transforming Technology (WT2) | EVENT | 0.85+
18 months | QUANTITY | 0.8+
single aspect | QUANTITY | 0.78+
theCUBE | PERSON | 0.74+
couple | QUANTITY | 0.67+
WT2 | EVENT | 0.66+
pandemic | EVENT | 0.61+
president | PERSON | 0.6+
The Big Bang Theory | TITLE | 0.58+
couple thousand | QUANTITY | 0.54+

Vimal Endiran, Global Data Business Group Ecosystem Lead, Accenture @AccentureTech


 

>> Live from San Jose, in the heart of Silicon Valley, it's theCUBE. Covering DataWorks Summit 2018. Brought to you by Hortonworks. >> Welcome back to theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, Rebecca Knight, along with my cohost James Kobielus. We have with us Vimal Endiran. He is the Global Data Business Group Ecosystem Lead at Accenture, and he's coming to us straight from the Motor City. So, welcome, Vimal. >> Thank you, thank you, Rebecca. Thank you, Jim. Looking forward to talking to you for the next ten minutes. >> So, before the cameras were rolling, we were talking about data veracity, and how managers can actually know that the data that they're getting, that they're seeing, is trustworthy. What's your take on that right now? >> So, in today's age, data is coming at you at a velocity that you never thought about, right? Today, organizations are gathering data probably in the magnitude of petabytes. This is the new normal. We used to talk about gigs, and now it's petabytes. And the data is coming in the form of images, video files, from the edge, you know, edge devices, sensors, social media and everything. So this data is becoming the fuel for the new economy, and the companies who can find a way to take advantage of it and figure out a way to use it are going to have a competitive advantage over their competitors. Now, just because it's coming at that volume and velocity doesn't mean it's useful. The thing is, they have to find a way to make the data trustworthy for the organization, and at the same time governed and secured. It used to be called data quality: when the structure is okay and everything is maintained in SAP or some such system, the data coming to you is good. But now you need to take advantage of tools like machine learning and artificial intelligence, combining these algorithms and toolsets with the abilities of people's minds, putting that in there so that things can happen somewhat by themselves while the data stays trustworthy. We have offerings around that that Accenture is developing and putting in place. It differs from industry to industry, given the fact that some incoming data is only worth something for 15 seconds; after that it has no use other than understanding how to prevent something, from sensor data. So we have our offerings put into place to make the data trustworthy, governed and secured for an organization to use, and to help the organization get there. That's what we are doing. >> The standard user of your tools, is it a data steward in the traditional sense, or is it a data scientist or data engineer who's trying to, for example, compile a body of training data for use in building and training machine learning models? Do you see those kinds of customers for your data veracity offerings, that customer segment, growing? >> Yes. We see both sides, pretty much all walks of customers in our life. So, you hit the nail on the head, yes. We do see those types of users, and also, beyond the data scientists, you're getting another set of people: the citizen data scientists. The people-- >> What is that? That's a controversial term. I've used that term on a number of occasions, and a lot of my colleagues and peers, in terms of other analysts, bat me down and say, "No, that demeans the profession of data science by calling it..." But you tell me how Accenture's defining that. >> The thing is, it's not demeaning.
The fact is, to become a citizen data scientist, you need the help of data scientists. Basically, every time, someone needs to build a model, then feed some data in to learn, and then have an outcome to put out. So you have a data scientist creating the algorithms. What a citizen data scientist means is: say I'm not a data scientist. I should be able to take advantage of a model built for my business scenario, feed in whatever data I need to feed in, get an output, and that program, that tool, is going to tell me, go do this or don't do this, those kinds of things. So I become a data scientist by using a predefined model developed by an expert, the minds of many experts together. Rather than me going and hiring a hundred experts, I go and buy a model, and I'm able to have one person maintain or tweak this model continuously. So, how can I enable that large volume of people by using more models? That's what-- >> If a predictive analytics tool that you license from whatever vendor includes prebuilt machine learning models for particular tasks, do you, as a user of that tool, automatically become a citizen data scientist? Or do you need to do some actual active work with that model or data to live up to the notion of being a citizen data scientist? >> It's a good question. In my mind, I don't want to do it; my job is something else, to make something for the company. So my job is not creating a model. My job is: I know my sets of data, I want to feed them in, and I want to get the outcome that lets me go and, say, increase my profit, increase my sales. That's what I want to do. So I may become a citizen data scientist without even knowing it. I won't even be told that I'm using a model: take this set of data, feed it in here, it's going to tell you something. From our data veracity point of view, we have these models built into some of our platforms. That can be a tool from Hortonworks, taking advantage of the data storage tool, or any other, with our own algorithms put in, that helps you to create and maintain data veracity at a scale of, say, one to five, one being low, five being best, and to maintain it at the five level. That's the objective of that. >> So you're democratizing the tools of data science for the rest of us to solve real business problems. >> Right. >> So, the data veracity aside, you're saying the user of these tools is doing something to manage, to correct or enhance or augment the data that's used to feed into these prebuilt models to achieve these outcomes? >> Yes. The augmented data, the feed data and the training data: it comes out with an outcome that says, go do something. It tells you to perform something or not perform something. It's still an action; it comes out with an action to achieve a target. That's what it's going to be. >> You mentioned Hortonworks, and since we are here at DataWorks, the Hortonworks show, tell us a little bit about your relationship with that company. >> Definitely. So Hortonworks is one of our premier strategic partners. We've been their number one implementation partner for the last two years in a row, implementing their technology across many of our clients. From a partnership point of view, we have jointly developed offerings. What Accenture is best at is industry knowledge, so with our industry knowledge and their technology together, what we're doing is creating offerings that we can take to market.
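As a hedged aside, the citizen-data-scientist pattern Vimal describes above can be sketched in a few lines: a data scientist fits and ships a model once, and a business user only feeds in their own numbers and gets a plain action back. This is a toy illustration, not Accenture's actual tooling; the features (spend, reach), the training data, and the 0.5 threshold are all invented for the example.

```python
# Toy sketch of the citizen-data-scientist workflow: expert builds the model,
# business user calls recommend() and receives a go/no-go action.
from sklearn.linear_model import LogisticRegression

# --- expert side: build the model once (in practice, persisted and shipped) ---
history = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]  # spend, reach
outcome = [0, 0, 1, 1]                                       # did the campaign win?
model = LogisticRegression().fit(history, outcome)

# --- business-user side: feed data in, get an action out, never touch the model ---
def recommend(spend, reach, threshold=0.5):
    """Translate the model's score into a plain action the user can act on."""
    prob = model.predict_proba([[spend, reach]])[0][1]
    return "go: fund this campaign" if prob >= threshold else "hold: rework it"

print(recommend(3.5, 3.0))
```

The point of the wrapper is that the user sees only an action, never a probability or a model object, which is exactly the "I won't even be told that I'm using a model" experience described above.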
For example, we used to have data warehouses using Teradata and older-technology data warehouses. They're still good, but at the same time, people also want to take structured and unstructured data, images and files, incorporate them into the existing data warehouses, and get value out of the whole thing together. That's where Hortonworks' type of tools come into play. So we have developed an offering called Modern Data Warehouse, taking advantage of the legacy systems you have plus this new data coming together, so that immediately you can create an analytics use case to do something. We have prebuilt programs and different scripts that take in different types of data, move them into a data lake, a Hortonworks data lake, and then use your existing legacy data and all of those together to help you create analytics use cases. So we have that data modernization offering. Then we have-- >> So that's a prebuilt model for specific vertical industry requirements or a specific business function: predictive analytics, anomaly detection and natural language processing, am I understanding correctly? >> Yes. We have industry-based solutions as well, but also, to begin with, the data supply chain itself, to bring the data into the lake to use it. That's one of the offerings we play-- >> ...pipeline and prepackaged models and rules and so forth. >> Right, prepackaged data ingestion and transformation, prepackaged to take advantage of the new data sets along with your legacy data. That's one offering, called the data modernization offering, and that extends to cloud, so we can take it to cloud: Hortonworks in a cloud, it can be Azure, AWS, GCP, any cloud, plus moving the data. So that's one type of offering. Today, actually, we announced another offering jointly with Hortonworks, built on the Atlas and Ranger tools, to help with GDPR compliance. >> Will you explain what that tool does specifically to help customers with GDPR? Does it work out of the box with Hortonworks' Data Steward Studio? >> Well, I can get you answers from my colleagues who are much more technical on that, but I can tell you functionally what the tool does. >> Okay, please. >> So, today, GDPR is basically a set of regulations: you need to know about your personal data, and you have your own destiny over your personal data. You can call the company and say, "Forget about me," if you are an EU resident, or say, "Modify my data." They have to do it within a certain time frame; if not, they get fined. The fine can be up to 4% of the company's... So it's going to be a very large fine. >> Total revenue, yeah. >> So what we do is, basically, take this tool, working with Hortonworks, this Atlas and Ranger tooling: we can go in and scan your data lake at the metadata level, and then showcase it. Then you know where the personal information about a consumer lies, and now I know everything. Because what used to happen in a legacy situation is that the data originated someplace, somebody takes it and puts it in a system, then somebody else downloads it to an Excel file, somebody puts it in an Access database, those kinds of things. So now your data is spread all across, and you don't know where it lies. In this case, in the lake, we can scan it and put this information, the metadata and the lineage information, together. Now you immediately know where the data lies when somebody calls. Rebecca calls and says, "No longer use my information."
I know exactly that it's stored in this place, in this table, in this column, so let me go and take it out from here, so that Rebecca doesn't exist anymore, or whoever doesn't exist anymore. That's the idea behind it. Also, we can catalog the entire data lake, so we know not just personal information but other information, everything about other dimensions as well, and we can use that for business advantage. So that's what we announced today. >> We're almost out of time, but I want to finally ask you about talent, because this is a pressing issue in Silicon Valley and beyond, really across the tech industry: finding the right people, putting them in the right jobs, and then keeping them happy there. So, recruiting, retaining: what's Accenture's approach? >> This area, talent, is the hardest one. >> Yes! >> Thanks to Hortonworks-- >> Send them to Detroit, where the housing is far less expensive. >> Not a bad idea. >> Exactly! But the fact is-- >> We're both Detroiters. >> What we did was, Accenture has access to Hortonworks University, all their educational assets. So we decided we're going to take advantage of that and enhance our talent by retraining our people, taking the people who know the legacy data aspects and bringing them to the new world. So we have a plan to use Hortonworks University, the materials and the people's help, to train about 500 people in different geos, 500 apiece, including our development centers in India, the Philippines, these places. So we have a larger plan to retrain the legacy into the new, and also to go and get people right out of college and build them up from there, from an analyst to a consultant to a technical level. That's the approach we are taking, and actually, in the group I work with, our group technology officer, Sanjiv Vohra, is basically in charge of training about 90,000 people on different technologies in and around that space. So the demand is high, but that's our approach: to go and train people and take them to that level. >> Are you training them to be well-rounded professionals in all things data, or are you training them for specific specialties? >> Very, very good question. We have this master data architect program, with different levels: after these trainings, people have to do so many projects, come back, have an interview with a panel of people, and get certified, within the company, at a certain level. At the master architect level, you go and help a customer with their data transformation, their architecture vision, where they want to go, at that level. So we have the program with the university, and that's the way we've taken people, step by step, to that level. >> Great. Vimal, thank you so much for coming on theCUBE. >> Thank you. >> It was really fun talking to you. >> Thank you so much, thank you for having me. Thank you. >> I'm Rebecca Knight, for James Kobielus. We will have more, well, we actually will not be having any more coming up from DataWorks; this has been the DataWorks show. Thank you for tuning in. >> Signing off for now. >> And we'll see you next time.
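The metadata scan Vimal describes maps naturally onto Apache Atlas's search API: tag the columns that hold personal data, then query by that tag to locate every table and column a "forget me" request touches. A minimal sketch follows, assuming such a tagging scheme; the Atlas host, the credentials, and the "PII" classification name are placeholders for illustration, not the announced offering's actual interface.

```python
# Sketch: ask Apache Atlas for every Hive column carrying a personal-data tag,
# so a GDPR "forget me" request can be traced to concrete tables and columns.
import requests

ATLAS = "http://atlas.example.com:21000"  # hypothetical Atlas endpoint
AUTH = ("admin", "admin")                  # placeholder credentials

def find_pii_columns(classification="PII"):
    """Return entity headers for Hive columns tagged with the classification."""
    resp = requests.post(
        f"{ATLAS}/api/atlas/v2/search/basic",
        json={"typeName": "hive_column", "classification": classification},
        auth=AUTH,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("entities", [])

for entity in find_pii_columns():
    # qualifiedName is typically db.table.column@cluster -- exactly the
    # "this table, this column" location described in the interview.
    print(entity.get("attributes", {}).get("qualifiedName"))
```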

Published Date : Jun 21 2018

SUMMARY :

Rebecca Knight and James Kobielus interview Vimal Endiran, Global Data Business Group Ecosystem Lead at Accenture, at DataWorks Summit 2018. Endiran discusses keeping petabyte-scale data trustworthy, governed and secured; the rise of the citizen data scientist who gets actions out of models built by experts; Accenture's Hortonworks partnership, including its data modernization offering and a jointly announced GDPR compliance offering built on Atlas and Ranger that scans a data lake's metadata and lineage to locate personal data; and Accenture's plan to retrain legacy staff through Hortonworks University and its master data architect program.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Rebecca | PERSON | 0.99+
James Kobielus | PERSON | 0.99+
Vimal | PERSON | 0.99+
Rebecca Knight | PERSON | 0.99+
Jim | PERSON | 0.99+
Sanjiv Vohra | PERSON | 0.99+
Hortonworks | ORGANIZATION | 0.99+
India | LOCATION | 0.99+
Vimal Endiran | PERSON | 0.99+
15 seconds | QUANTITY | 0.99+
Silicon Valley | LOCATION | 0.99+
Today | DATE | 0.99+
San Jose | LOCATION | 0.99+
Hortonworks University | ORGANIZATION | 0.99+
Accenture | ORGANIZATION | 0.99+
five | QUANTITY | 0.99+
hundred experts | QUANTITY | 0.99+
San Jose, California | LOCATION | 0.99+
Detroit | LOCATION | 0.99+
GCP | ORGANIZATION | 0.99+
one | QUANTITY | 0.99+
today | DATE | 0.99+
both sides | QUANTITY | 0.99+
both | QUANTITY | 0.98+
AWS | ORGANIZATION | 0.98+
about 90,000 people | QUANTITY | 0.98+
500 apiece | QUANTITY | 0.97+
Teradata | ORGANIZATION | 0.97+
one person | QUANTITY | 0.97+
GDPR | TITLE | 0.97+
about 500 people | QUANTITY | 0.96+
Global Data Business Group Ecosystem | ORGANIZATION | 0.95+
five level | QUANTITY | 0.93+
up to 4% | QUANTITY | 0.93+
EU | LOCATION | 0.93+
DataWorks Summit 2018 | EVENT | 0.93+
DataWorks | ORGANIZATION | 0.93+
Detroiters | PERSON | 0.92+
@AccentureTech | ORGANIZATION | 0.91+
Atlas and Ranger | ORGANIZATION | 0.88+
Global Data Business Group Ecosystem Lead | ORGANIZATION | 0.86+
theCUBE | ORGANIZATION | 0.83+
Philippines | LOCATION | 0.8+
master | TITLE | 0.77+
one type | QUANTITY | 0.74+
petabytes | QUANTITY | 0.73+
SAP | ORGANIZATION | 0.61+
last two years | DATE | 0.58+
ten minutes | QUANTITY | 0.58+
Atlas | ORGANIZATION | 0.52+
data architect program | OTHER | 0.48+
Ranger | ORGANIZATION | 0.46+

Steve Wooledge, Arcadia Data & Satya Ramachandran, Neustar | DataWorks Summit 2018


 

(upbeat electronic music) >> Live from San Jose, in the heart of Silicon Valley, it's theCUBE. Covering DataWorks Summit 2018, brought to you by Hortonworks. (electronic whooshing) >> Welcome back to theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, Rebecca Knight, along with my co-host, James Kobielus. We have two guests in this segment: Steve Wooledge, he is the VP of Product Marketing at Arcadia Data, and Satya Ramachandran, who is the VP of Engineering at Neustar. Thanks so much for coming on theCUBE. >> Our pleasure, and thank you. >> So let's start out by setting the scene for our viewers. Tell us a little bit about what Arcadia Data does. >> Arcadia Data is focused on getting business value from these modern scale-out architectures, like Hadoop and the cloud. We started in 2012 to solve the problem of how we get value into the hands of the business analysts that understand a little bit more about the business, in addition to empowering the data scientists to deploy their models and value to a much broader audience. So I think that's been, in some ways, the last mile of value that people need to get out of Hadoop and data lakes: getting it into the hands of the business. That's what we're focused on. >> And start seeing the value, as you said. >> Yeah, seeing is believing, a picture is worth a thousand words, all those good things. And what's really emerging, I think, is that companies are realizing traditional BI technology won't solve the scale and user-concurrency issues, because architecturally, big data's different, right? We're on the scale-out, MPP architectures now, like Hadoop; the data complexity and variety has changed, but the BI tools are still the same, and you pull the data out of the system to put it into some little micro-cube to do some analysis. Companies want to go after all the data, and view the analysis across a much broader set, and that's really what we enable. >> I want to hear about the relationship between your two companies, but Satya, tell us a little about Neustar, what you do. >> Neustar is an information services company built around identity. We are the premier identity provider, the most authoritative identity provider, for the US, and we've built a whole bunch of services around that identity platform. I am part of the marketing solutions group, and I head analytics engineering for marketing solutions. The product that I work on helps marketers do their annual planning, as well as their campaign or tactical planning, so that they can fine-tune their campaigns on an ongoing basis. >> So how do you use Arcadia Data's primary product? >> We are a predictive analytics platform, and for the reporting part of it, we use Arcadia. We have multiple terabytes of advertising data in our data lake, and we use Arcadia to provide fast access to our customers, and also very granular and explorative analysis of this data: high-concurrency, explorative analysis of this data. >> So you say you help your customers with their marketing campaigns. Are you doing predictive analytics? Are you doing churn analysis and so forth? And how does Arcadia fit into all of that? >> So we get data, and then we build an activation model, which tells how the marketing spend corresponds to the revenue. We not only do historical analysis, we also do predictive, in the sense that marketers frequently run what-if analyses, saying, what if I moved my budget from paid search to TV?
And how does that affect the revenue? So all of this modeling is built by Neustar, the modeling platform is built by Neustar, but the last mile of taking these reports and providing this explorative analysis of the results, that is provided by the reporting solution, which is Arcadia. >> Well, I mean, the thing about data analytics is that it really is going to revolutionize marketing. That famous marketing adage of, I know my advertising works, I just don't know which half: now we're really going to be able to figure out which half. Can you talk a little bit about return on investment and what your clients see? >> Sure. We've got some major Fortune 500 companies that have said publicly that they've realized over a billion dollars of incremental value. And that could be across marketing analytics, how we better target our messaging, our brand, to reach our intended audience; there are things like supply chain, being able to run more real-time what-if analysis for different routes; it's things like cybersecurity, stopping fraud and waste and things like that at a much grander scale than what was really possible in the past. >> So we're here at DataWorks, and it's the Hortonworks show. Give us a sense of the degree of your engagement or partnership with Hortonworks, and your participation in their partner ecosystem. >> Yeah, absolutely. Hortonworks is one of our key partners, and what we did that's different, architecturally, is we built our BI server directly into the data platforms. What I mean by that is, we take the concept of a BI server, and we install it and run it on the data nodes of Hortonworks Data Platform. We inherit the security directly out of systems like Apache Ranger, so that all that administration and scale is done at Hadoop economics, if you will, and it leverages the things that are already in place. That has huge advantages both in terms of scale and simplicity, and then you get the performance, the concurrency that companies need to deploy out to, like, 5,000 users directly on that Hadoop cluster. So Hortonworks is a fantastic partner for us, and a large number of our customers run on Hortonworks, as well as other platforms, such as Amazon Web Services, where Satya's got his system deployed. >> At the show they announced Hortonworks Data Platform 3.0. There's containerization there, there are updates to Hive to enable it to be more of a real-time analytics and also a data warehousing engine. At Arcadia Data, do you follow their product enhancements, in terms of your own product roadmap, on any specific fixed cycle? Are you going to be leveraging the new features in HDP 3.0 going forward to add value to your customers' ability to do interactive analysis of this data in close to real time? >> Sure, yeah, because we're a native-- >> 'Cause marketing campaigns are increasingly in real time, especially when you've got a completely digital business. >> Yeah, absolutely. So we benefit from the innovations happening within the Hortonworks Data Platform. Because we're a native BI tool that runs directly within that system, you know, with changes in Hive, or different things within HDFS in terms of performance or compression and things like that, our customers generally benefit from that directly, so yeah. >> Satya, going forward, what are some of the problems that you want to solve for your clients? What are their biggest pain points, and where do you see Neustar? >> So, data is the new oil, right?
So for marketers too, now, data is the biggest thing; it's what they're going after. They want faster analysis, they want to be able to get to insights as fast as they can, and they obviously want to work on as large an amount of data as possible. The variety of sources is becoming higher and higher in terms of marketing. There used to be a few channels in the '70s and '80s, and the '90s kind of increased that; now you have hundreds of channels, if not thousands of channels, and they want visibility across all of that. It's the ability to work across this variety of data, at increasing volume, at a very high speed: those are the high-level challenges that we have at Neustar. >> Great. >> So, marketing attribution analysis, you say, is one of the core applications of your solution portfolio. How is that more challenging now than it had been in the past? We have far more marketing channels, digital and so forth; so how is the state of the art of marketing attribution analysis changing to address this multiplicity of channels and media for advertising and for influencing the customer on social media and so forth? And then, can you give us a sense for the necessary analytical tools needed for that? We often hear about social graph analysis, or semantic analysis, or behavioral analytics and so forth; all of this makes it very challenging. How can you determine exactly what influences a customer now, in this day and age where, you know, Twitter is an influencer over the conversation? How can you nail that down to specific KPIs or specific things to track? >> So I think, like you pointed out, the variety is increasing, right? And marketers now have a lot more options than what they had, and that's a blessing, and it's also a curse, because then I don't know where to move my marketing spending. So attribution right now is still sitting at headquarters; it's sitting at a very high level, and it is answering questions, like we said, for the Fortune 100 companies; it's still answering questions for the CMOs, right? Where attribution will take us next is to go lower down, where it's able to answer to the regional headquarters on what needs to happen, and, more importantly, at every store. I'm able to then answer and tailor my attribution model to a particular store. Let's take Ford as an example, right? Now, instead of the CMO suite, if I'm able to go to every dealer, and I'm able to personalize my attribution to that particular dealer, then it becomes a lot more useful. The challenge there is that it all needs to be connected: whatever model we are running for the dealer needs to be connected up to the headquarters. >> Yes, and that personalization very much leverages the kind of things that Steve was talking about at Arcadia: being able to analyze all the data to find those micro, micro, micro segments that can be influenced to varying degrees. So yeah, I like where you're going with this, 'cause it very much relates to the power of distributed, federated big data fabrics like Hortonworks offers. >> And so streaming analytics is coming to the forefront. It's been talked about for the longest period of time, but we have real use cases for streaming analytics right now. Similarly, the data volumes are indeed becoming a lot larger. So both of them are doing a lot more right now. >> Yes.
>> Well, Satya and Steve, thank you so much for coming on theCUBE, this was really, really fun talking to you. >> Excellent. >> Thanks, it was great to meet you. Thanks for having us. >> I love marketing talk. >> (laughs) It's fun. I'm Rebecca Knight, for James Kobielus, stay tuned to theCUBE, we will have more coming up from our live coverage of Dataworks, just after this. (upbeat electronic music)

Published Date : Jun 20 2018

SUMMARY :

Rebecca Knight and James Kobielus talk with Steve Wooledge, VP of Product Marketing at Arcadia Data, and Satya Ramachandran, VP of Engineering at Neustar, at DataWorks Summit 2018. Arcadia Data runs its BI server natively on the data nodes of platforms like Hortonworks Data Platform, inheriting security from Apache Ranger, so thousands of business users can analyze data in place instead of extracting it into micro-cubes. Neustar uses Arcadia as the reporting layer on multiple terabytes of advertising data, letting marketers run what-if analyses on budget shifts across hundreds of channels, with attribution moving from the CMO suite down toward individual regions, stores and dealers, and streaming analytics coming to the forefront.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
James Kobielus | PERSON | 0.99+
Steve Wooledge | PERSON | 0.99+
Rebecca Knight | PERSON | 0.99+
Satya Ramachandran | PERSON | 0.99+
Steve | PERSON | 0.99+
Hortonworks | ORGANIZATION | 0.99+
Neustar | ORGANIZATION | 0.99+
Arcadia Data | ORGANIZATION | 0.99+
Ford | ORGANIZATION | 0.99+
Satya | PERSON | 0.99+
2012 | DATE | 0.99+
San Jose | LOCATION | 0.99+
two companies | QUANTITY | 0.99+
Silicon Valley | LOCATION | 0.99+
two guests | QUANTITY | 0.99+
Arcadia | ORGANIZATION | 0.99+
San Jose, California | LOCATION | 0.99+
Amazon Web Services | ORGANIZATION | 0.99+
US | LOCATION | 0.99+
both | QUANTITY | 0.99+
5,000 users | QUANTITY | 0.99+
DataWorks | ORGANIZATION | 0.98+
theCUBE | ORGANIZATION | 0.98+
one | QUANTITY | 0.97+
Twitter | ORGANIZATION | 0.96+
hundreds of channels | QUANTITY | 0.96+
DataWorks Summit 2018 | EVENT | 0.96+
thousands of channels | QUANTITY | 0.93+
over a billion dollars | QUANTITY | 0.93+
Data Platform 3.0 | TITLE | 0.9+
'70s | DATE | 0.86+
Arcadia | TITLE | 0.84+
Hadoop | TITLE | 0.84+
HDP 3.0 | TITLE | 0.83+
'90s | DATE | 0.82+
Apache Ranger | ORGANIZATION | 0.82+
thousand words | QUANTITY | 0.76+
HDFS | TITLE | 0.76+
multi terabytes | QUANTITY | 0.75+
Hive | TITLE | 0.69+
Neustar | TITLE | 0.67+
Fortune | ORGANIZATION | 0.62+
'80s | DATE | 0.55+
500 | QUANTITY | 0.45+
100 | QUANTITY | 0.4+
theCUBE | TITLE | 0.39+

Partha Seetala, Robin Systems | DataWorks Summit 2018


 

>> Live from San Jose, in the heart of Silicon Valley, it's theCUBE. Covering DataWorks Summit 2018. Brought to you by Hortonworks. >> Welcome back, everyone. You are watching day two of theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, Rebecca Knight, and I'm coming at you with my cohost, James Kobielus. We're joined by Partha Seetala, he is the Chief Technology Officer at Robin Systems. Thanks so much for coming on theCUBE. >> Pleasure to be here. >> You're a first-timer, so we promise we don't bite. >> Actually, I'm not, I was on theCUBE-- >> Oh! >> At DockerCon in 2016. >> Oh, well, excellent, okay, so now you're a veteran, right? >> Yes, ma'am. >> So Robin Systems, as we were discussing before the cameras were rolling, is about four years old, based here in San Jose, a venture-backed company. Tell us a little bit more about the company and what you do. >> Absolutely. First of all, thanks for hosting me here. Like you said, Robin is a Silicon Valley based company. Our focus is on allowing applications such as big data, databases, NoSQL and AI/ML to run within the Kubernetes platform. What we have built is a product that converges compute, storage, networking and application workflow management, along with Kubernetes, to create a one-click experience where users get a managed-services kind of feel when they're deploying these applications. They can also do one-click lifecycle management on these apps. Our thesis has initially been, instead of looking at this problem from the infrastructure up into the application, to actually look at it from the application down, and then say: let the applications drive the underlying infrastructure to meet the user's requirements. >> Is that your differentiating factor, would you say? >> Yeah, I think it is, because most of the folks out there today are looking at it as if it's a component-based play: they want to bring storage to Kubernetes or networking to Kubernetes. But the challenges are not really around storage and networking. If you talk to the operations folks, they say, "You know what? Those are underlying problems, but my challenge is more along the lines of, okay, my CIO says the initiative is to make my applications mobile; they want to go across different clouds. That's my challenge." The line-of-business user says, "I want to get a managed-services experience." Yes, storage is the thing that you want to manage underneath, but I want to go and click and create my, let's say, Oracle database or Hortonworks distribution. >> In terms of the developer experience here, from the application down, give us a sense for how Robin Systems' tooling, your product, enables that degree of specification of the application logic that will then get containerized within? >> Absolutely. Like I said, we want applications to drive the infrastructure. What that means is: Robin is a software platform. We layer ourselves on top of the machines that we sit on, whether they are bare-metal machines on premises, or VMs, or even in Azure or Google Cloud, as well as AWS. Then we make the underlying compute, storage and network resources almost invisible: we treat them as a pool of resources. Now, once you have this pool of resources, they can be attached to the applications that are being deployed as containers. I mean, it's a software play: install on machines. Once it's installed, the experience now moves away from infrastructure into applications.
You log in, you see a portal, and you have a lot of applications in that portal. We ship support for about 25 applications or some such. >> So these are templates? >> Yes. >> That the developer can then customize to their specific requirements? Or no? >> Absolutely, we ship reference templates for a wide variety of the most popular big data, NoSQL, database, and AI/ML applications today. But again, as I said, it's a reference implementation. Typically, customers take the reference implementation and enhance it, or they use it to onboard their custom apps, for example, or the apps that we don't ship out of the box. So it's a very open, extensible platform, the goal being that whatever the application might be, in fact we keep saying that if it runs somewhere else, it runs on Robin, right? So the idea here is that you can bring anything, and with just the flip of a switch you can make it one-click deploy, one-click manage, one-click mobile across clouds. >> You keep mentioning this one click, and this idea of it being so easy, so convenient, so seamless. Is that what you'd say is the biggest concern of your customers, this ease and speed? Or what are some other things on their minds that you want to deliver? >> Right, so one click, of course, is the user experience part, but what is the real challenge? The real challenge is that there's a wide variety of tools being used by enterprises today. Even in the data analytics pipeline, there's a lot across the data store and processing pipeline. Users don't want to deal with setting it up and keeping it up and running. They don't want that, they want to get the job done, right? Now, when you want to get the job done, you really want to hide the underlying details of those platforms, and the best way to convey that, the best way to give that experience, is to make it a single-click experience from the UI. So I keep calling it all one click, because that is the experience that hides the underlying complexity of these apps. >> Does your environment actually compile executable code based on that one-click experience? Or where does the compilation and containerization actually happen in your distributed architecture? >> All right, so I think the simplest-- >> You're a prem-based offering, right? You're not in the cloud yourself? >> No, we are. We work on all three big public clouds, >> Oh, okay. >> whether it is Azure, AWS or Google. >> So your entire application is containerized itself for deployment into these clouds? >> Yes, it is. >> Okay. >> So the idea here is, let's simplify it significantly, right? You have Kubernetes today; it can run anywhere, on premises, in the public cloud and so on. Kubernetes is a great platform for orchestrating containers, but it is largely inaccessible to a certain class of data-centric applications. >> Yeah. >> We make that possible. But our take is, just onboarding those applications onto Kubernetes does not solve your CXO's or your line-of-business user's problems. You have to make the management, from an application point of view, not from a container management point of view, a lot easier, and that is where we create this experience that I'm talking about, the one-click experience. >> Give us a sense for how, we're here at DataWorks and it's the Hortonworks show.
Discuss with us your partnership with Hortonworks, and, you know, we've heard the announcement of HDP 3.0 and containerization support. Just give us a rough sense for how you align or partner with Hortonworks in this area. >> Absolutely. It's kind of interesting, because Hortonworks is a data management platform, if you think about it from that point of view, and when we engaged with them first... So some of our customers have been using the product, Hortonworks, on top of Robin, so orchestrating Hortonworks, making it a lot easier to use. >> Right. >> One of the requirements was, "Are you certified with Hortonworks?" And the challenge Hortonworks also had is that they had never certified a container-based deployment of Hortonworks before. They actually were very skeptical, you know: "You guys are saying all these things. Can you actually containerize and run Hortonworks?" So we worked with Hortonworks, and, I mean, if you go to the Hortonworks website, you'll see that we are the first in the entire industry who have been certified as a container-based play that can actually deploy and manage Hortonworks. They certified us by running a wide variety of tests, which they call the QATS test suite, and when we got certified, the only other players in the market that got that stamp of approval were Microsoft with Azure and EMC with Isilon. >> So you're in good company? >> I think we are in great company. >> You're certified to work with HDP 3.0, or the prior version, or both? >> When we got certified, we were still on the 2.x version of Hortonworks; HDP 3.0 is a relatively newer version. But our plan is to continue working with Hortonworks to get certified as they release the program, and also to help them, because HDP 3.0 also has some container-based orchestration and deployment, so we want to help them provide the underlying infrastructure so that it becomes easier for YARN to spin up more containers. >> The higher-level security and governance and all these things you're describing, they have to be over the Kubernetes layer. Hortonworks supports it in their DataPlane Services portfolio. Does Robin Systems' solution portfolio tap into any of that, or do you provide your own layer of, sort of, security and metadata management and so forth? >> Yeah, so we don't want-- >> In the context of what you offer? >> Right, so we don't want to take away the security model that the application itself provides, because users might have set it up so that they are doing governance; it's not just logging in and access control and things like this. Some governance is built in. We don't want to change that. We want to keep the same experience and the same workflow that customers have, so we just integrate with whatever security the application has. We, of course, provide security in terms of isolating the different apps that are running on the Robin platform, but the security of, or the access into, the application itself is left to the apps themselves. When I say apps, I'm talking about Hortonworks, >> Yeah, sure. >> or any other databases. >> Moving forward, as you think about ways you're going to augment and enhance and alter the Robin platform, what are some of the biggest trends driving your decision-making around that? In the sense that, as we know, companies are living with this deluge of data, how are you helping them manage it better? >> Sure. I think there are a few trends that we are closely watching. One is around cloud mobility.
CIOs want their applications along with their data to be available where their end users are. It's almost like a follow-the-sun model, where you might have generated the data in one cloud, and at a different time, in a different time zone, you'll basically want to keep the app as well as the data moving. So we are following that very closely, how we can make the mobility of data and apps a lot easier in that world. The other one is around the general AI ML workflow. One of the challenges there, of course: you have great apps like TensorFlow or Theano or Caffe, these are very good AI ML toolkits, but one of the challenges that people face is they are buying this very expensive, let's say an NVIDIA DGX box; these boxes cost about $150,000 each, so how do you keep these boxes busy so that you're getting a good return on investment? It will require you to better manage the resources offered with these boxes. We are also monitoring that space, and we're seeing how we can take the Robin platform and enable the better utilization of GPUs, or the sharing of GPUs, for running your AI ML kind of workload. >> Great. >> Those are, I think, two key trends that we are closely watching. >> We'll be discussing those at the next DataWorks Summit, I'm sure, at some other time in the future. >> Absolutely. >> Thank you so much for coming on theCUBE, Partha. >> Thank you. >> Thank you, my pleasure. Thanks. >> I'm Rebecca Knight for James Kobielus, we will have more from DataWorks coming up in just a little bit. (techno beat music)
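Referring back to the GPU utilization problem Partha describes: the bookkeeping below is a naive illustration, not Robin's scheduler. Jobs queue for a fixed pool of GPUs, and any job too big for the remaining free devices waits, stranding capacity on a very expensive box; finer-grained sharing is what raises the return on investment.

    # Naive illustration of GPU packing on one 8-GPU box; all numbers invented.
    TOTAL_GPUS = 8

    def schedule(jobs):
        """Greedily pack queued (name, gpus_needed) jobs onto free GPUs."""
        free, running, waiting = TOTAL_GPUS, [], []
        for name, gpus_needed in jobs:
            if gpus_needed <= free:
                free -= gpus_needed
                running.append(name)
            else:
                waiting.append(name)
        print(f"running={running} waiting={waiting} "
              f"utilization={(TOTAL_GPUS - free) / TOTAL_GPUS:.0%}")

    schedule([("train-resnet", 4), ("tune-model", 2), ("notebook", 1), ("embed-etl", 4)])
    # -> the second 4-GPU job waits at 88% utilization; sharing or slicing GPUs
    #    between jobs is what keeps a $150,000 box busy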

Published Date : Jun 20 2018


Rob Bearden, Hortonworks | DataWorks Summit 2018


 

>> Live from San Jose in the heart of Silicon Valley, it's theCUBE covering DataWorks Summit 2018, brought to you by Hortonworks. >> Welcome back to theCUBE's live coverage of DataWorks Summit here in San Jose, California. I'm your host, Rebecca Knight, along with my co-host, James Kobielus. We're joined by Rob Bearden. He is the CEO of Hortonworks. So thanks so much for coming on theCUBE again, Rob. >> Thank you for having us. >> So you just got off the keynote on the main stage. The big theme is really about modern data architecture. So we're going to have this modern data architecture. What is it all about? How do you think about it? What's your approach? And how do you walk customers through this process? >> Well, there's a lot of moving parts in enabling a modern data architecture. One of the first steps is to unlock the siloed transactional applications, and to get that data into a central architecture so you can get real time insights around the inclusive dataset. But what we're really trying to accomplish then within that modern data architecture is to bring all types of data, whether it be real time streaming data, whether it be sensor data, IoT data, whether it be data that's coming from a connected core across the network, and to be able to bring all that data together in real time, and give the enterprise the ability to take best in class action so that you get a very prescriptive outcome of what you want. So if we bring that data under management from point of origination out on the edge, and then have the platforms that move that through its entire lifecycle, and that's our HDF platform, it gives the customer the ability to, after they capture it at the edge, move it, and then have the ability to process it as an event happens, a condition changes, various conditions come together, have the ability to process and take the exact action that you want to see performed against that, and then bring it to rest, and that's where our HDP platform comes into play, where then all that data can be aggregated so you can have a holistic insight, and have real time interactions on that data. But then it becomes about deploying those datasets and workloads on the tier that's most economically and architecturally pragmatic. So if that's on-prem, we make sure that we are architected for that on-prem deployment, or private cloud, or even across multiple public clouds simultaneously, and give the enterprise the ability to support each of those native environments. And so we think hybrid cloud architecture is really where the vast majority of our customers, today and in the future, are going to want to be able to run and deploy their applications and workloads. And that's where our DataPlane Service offering gives them the ability to have that hybrid architecture and the architectural latitude to move workloads and datasets across each tier transparently to whatever storage file format they need or wherever that application is, and we provide all the tooling to mask the complexity of doing that, and then we ensure that it has one common security framework, one common governance through its entire lifecycle, and one management platform to handle that entire data lifecycle.
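The edge pattern Bearden describes, capture each reading where it is generated, act immediately when a condition trips, and still forward the full record toward the core, can be sketched in a few lines. The threshold, field names, and transport below are hypothetical stand-ins; in the Hortonworks stack this role is played by MiNiFi/NiFi flows in HDF rather than hand-written code.

    # Minimal sketch of an edge-side trigger; all names are illustrative.
    VIBRATION_LIMIT = 0.8

    def on_reading(reading: dict, forward_to_core) -> None:
        if reading["vibration"] > VIBRATION_LIMIT:
            # local, real-time decision: no round trip to the data center
            print(f"ALERT component={reading['component']} schedule_service=True")
        forward_to_core(reading)  # full-fidelity record still reaches the core

    on_reading({"component": "engine-2", "vibration": 0.93}, forward_to_core=print)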
And that's the modern data architecture: being able to bring all data under management, all types of data under management, and manage that in real time through its lifecycle until it comes to rest, and deploy that across whatever architecture tier is most appropriate financially and from a performance standpoint, on-cloud or on-prem. >> Rob, this morning at the keynote here on day one at DataWorks San Jose, you presented this whole architecture that you described in the context of what you call hybrid clouds to enable connected communities, and with HDP, Hortonworks Data Platform 3.0, one of the prime announcements, you brought containerization into the story. Could you connect those dots, containerization, connected communities, and HDP 3.0? >> Well, HDP 3.0 is really the foundation for enabling that hybrid architecture natively, and what it's done is separate the storage from the compute, and so now we have the ability to deploy those workloads via a container strategy across whichever tier makes the most sense, and to move those applications and datasets around, and to be able to leverage each tier in the deployment architectures that are most pragmatic. And then what that lets us do is be able to bring all of the different data types together, whether it be customer data, supply chain data, product data. So imagine as an industrial piece of equipment, an airplane, is flying from Atlanta, Georgia to London, and you want to be able to make sure you really understand how well each component is performing, so that if that plane is going to need service when it gets there, it doesn't miss the turnaround and leave 300 passengers stranded or delayed, right? Now with our Connected platform, we have the ability to take every piece of data from every component that's generated and see that in real time, and let the airlines act on that in real time. >> Delineate essentially. >> And ensure that we know every person that touched it and looked at that data through its entire lifecycle, from the ground crew, to the pilots, to the operations team, to the service folks on the ground, to the reservation agents, and we can prove that if somehow that data has been breached, we know exactly at what point it was breached and who did or didn't get to see it, and can prevent that because of the security models that we put in place. >> And that relates to compliance and mandates such as the General Data Protection Regulation, GDPR, in the EU. At DataWorks Berlin a few months ago, Hortonworks laid out, announced a new product called the Data Steward Studio to enable GDPR compliance. Can you give our listeners who may not have been following the Berlin event a bit of an update on Data Steward Studio, how it relates to the whole data lineage set of requirements that you're describing, and then, going forward, what is Hortonworks's roadmap for supporting the full governance lifecycle for the connected community, from data lineage through, like, model governance and so forth? Can you just connect a few dots that will be helpful? >> Absolutely. What's important, certainly driven by GDPR, is the requirement to be able to prove that you understand who's touched that data and who has not had access to it, and that you ensure that you're in compliance with the GDPR regulations, which are significant, but essentially what they say is you have to protect the personal data and attributes of that data of the individual.
And so what's very important is that you've got to have the systems that not just secure the data, but understand who has had access at any point in time that you've ever maintained that individual's data. And so it's not just about when you've had a transaction with that individual, but it's the rest of the history that you've kept, or the multiple datasets that you may try to correlate to try to expand the relationship with that customer, and you need to make sure that you can ensure not only that you've secured their data, but that you're protecting and governing who has access to it and when. And as importantly, that you can prove in the event of a breach that you had control of that, and who did or did not access it, because if you can't prove, in the event of a breach, that it was secure and that no one who wasn't supposed to have access got to it, you can be opened up to hundreds of thousands of dollars or even multiple millions of dollars of fines just because you can't prove that it was not accessed, and that's what the variety of our platforms, you mentioned Data Studio, is part of. DataPlane is one of the capabilities that gives us that ability. The core engine that does that is Atlas, and that's the open source governance platform that we developed through the community that really drives all the capabilities for governance that move through each of our products, HDP and HDF, and then of course DataPlane and Data Studio take advantage of that in how they move and replicate data and manage that process for us. >> One of the things that we were talking about before the cameras were rolling was this idea of data driven business models, how they are disrupting current contenders, new rivals coming on the scene all the time. Can you talk a little bit about what you're seeing and what are some of the most exciting and maybe also some of the most threatening things that you're seeing? >> Sure. In the traditional legacy enterprise, it's very procedurally driven. You think about classic Encore ERP. It's worked very hard to have a very rigid, very structured, procedural order-to-cash cycle that has not a great deal of flexibility. And it takes you through a design process, it builds product, then you sell product to a customer, and then you service that customer, and then you learn from that transaction different ways to automate or improve efficiencies in the supply chain. But it's very procedural, very linear. And in the new world of connected data models, you want to bring transparency and real time understanding and connectivity between the enterprise, the customer, the product, and the supply chain, so that you can take real time best in practice action. So for example, you understand how well your product is performing. Is your customer using it correctly? Are they frustrated with it? Are they using it in the patterns and the frequency that they should be if they are going to expand their use and buy more, and if they're not, how do we engage in that cycle? How do we understand if they're going through a re-review and buying something similar that may not be with you, for a different reason? And when we have real time visibility into our customer's interaction, and understand our product's performance through its entire lifecycle, then we can bring real time efficiency by linking those together with our supply chain into the various relationships we have with our customers.
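Stepping back to the governance engine named above: Apache Atlas exposes exactly the audit question Bearden raises, what touched this dataset, upstream and downstream, through its REST API. A hedged sketch follows, assuming the documented Atlas v2 lineage endpoint; the host, credentials, and entity GUID are placeholders, and response fields may vary by version.

    # Sketch: pull lineage for one dataset entity from Apache Atlas.
    import requests

    ATLAS = "http://atlas.example.com:21000"   # placeholder host
    GUID = "replace-with-entity-guid"          # placeholder entity GUID

    resp = requests.get(
        f"{ATLAS}/api/atlas/v2/lineage/{GUID}",
        params={"direction": "BOTH", "depth": 3},
        auth=("admin", "admin"),               # demo credentials only
        timeout=30,
    )
    resp.raise_for_status()
    for guid, entity in resp.json().get("guidEntityMap", {}).items():
        print(guid, entity.get("typeName"), entity.get("displayText"))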
To get that kind of real time visibility, it requires the modern data architecture, bringing data under management from the point it originates, whether it's from the product or the customer interacting with the company, or the customer interacting potentially with our ecosystem partners, mutual partners, and then letting the best in practice supply chain techniques make sure that we're bringing the highest level of service and support to that entire lifecycle. And when we bring data under management, manage it through its lifecycle and have the historical view at rest, and leverage that across every tier, that's when we get this high velocity, deep transparency, and connectivity between each of the constituents in the value chain, and that's what our platforms give them the ability to do. >> Not only your platform, you guys have been in business now for I think seven years or so, and you shifted, in the minds of many and including your own strategy, from being the premier data at rest company in terms of the Hadoop platform to being one of the premier data in motion companies. Is that really where you're going? To be more of a completely streaming-focused solution provider in a multi-cloud environment? And I hear a lot of Kafka in your story now, and it's like, oh yeah, that's right, Hortonworks is big on Kafka. Can you give us just a quick sense of how you're making that shift towards low latency real time streaming, big data, or small data for that matter, with embedded analytics and machine learning? >> So, we have evolved from certainly being the leader in global data platforms, with all the work that we do collaboratively in and through the community, to make Hadoop an enterprise viable data platform that has the ability to run mission critical workloads and apps at scale, ensuring that it has all the enterprise facilities from security and governance and management. But you're right, we have expanded our footprint aggressively. And we saw the opportunity to actually create more value for our customers by giving them the ability to not wait until they bring data under management to gain an insight, because in that case they'd have to be reactive, post-event, post-transaction. We want to give them the ability to shift their business model to being interactive, pre-event, pre-condition. The way to do that, we learned, was to be able to bring the data under management from the point of origination, and that's what we use MiNiFi and NiFi for, and then HDF, to move it through its lifecycle, and to your point, we have the intellect, we have the insight, and then we have the ability to process the best in class outcome based on what we know the variables are that we're trying to solve for as that's happening. >> And there's the word, the phrase ACID, which of course is a transactional data paradigm, and I hear that all over your story now in streaming. So, what you're saying is it's a completely enterprise-grade streaming environment from end to end for the new era of edge computing. Would that be a fair way of-- >> It's very much so. And our model and strategy has always been to bring the other best in class engines for what they do well for their particular dataset. A couple of examples of that: one, you brought up Kafka, another is Spark. And they do what they do really well.
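For a sense of how Kafka and Spark slot into one pipeline as producer and consumer of the same stream, here is a minimal Spark Structured Streaming sketch in Python. The broker address and topic are placeholders; the Kafka source itself is standard Spark API, independent of any Hortonworks-specific tooling.

    # Minimal sketch: Spark consuming a Kafka topic in one streaming pipeline.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("edge-events").getOrCreate()

    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
        .option("subscribe", "sensor-events")              # placeholder topic
        .load()
    )

    # Kafka delivers binary key/value columns; cast before analytics downstream
    decoded = events.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

    query = decoded.writeStream.format("console").outputMode("append").start()
    query.awaitTermination()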
But what we do is make sure that those engines fit inside an overall data architecture that then embodies their access to a much broader central dataset that goes from point of origination to point of rest on a whole central architecture, and then benefit from our security, governance, and operations model, being able to manage those engines. So what we're trying to do is eliminate the silos for our customers, and the siloed datasets that just do particular functions. We give them the ability to have an enterprise modern data architecture; we manage the things that bring that forward for the enterprise to have the modern data driven business models, by bringing the governance, the security, the operations management, and ensuring that those workflows go from beginning to end seamlessly. >> Do you, go ahead. >> So I was just going to ask about the customer concerns. So here you are, you've now given them this ability to make these real time changes, what's sort of next? What's on their mind now and what do you see as the future of what you want to deliver next? >> First and foremost, we've got to make sure we get this right, and we really bring this modern data architecture forward, and make sure that we truly have the governance correct, the security models correct, one pane of glass to manage this. And really enable that hybrid data architecture, and let them leverage the cloud tier where it's architecturally and financially pragmatic to do it, and give them the ability to leg into a cloud architecture without risk of either being locked in or misunderstanding where the lines of demarcation of workloads or datasets are, and not getting the economies or efficiencies they should. And we solved that with DataPlane. So we're working very hard with the community, with our ecosystem and strategic partners, to make sure that we're enabling the ability to bring each type of data from any source and deploy it across any tier with a common security, governance, and management framework. So then what's next is, now that we have this high velocity of data through its entire lifecycle on one common set of platforms, then we can start enabling the modern applications to function. And we can go look back into some of the legacy technologies that are very procedural based and are dependent on a transaction or an event happening before they can run their logic to get an outcome, because that grinds the customer into post-event activity. We want to make sure that we're bringing that kind of, for example, supply chain functionality, to the modern data architecture, so that we can put real time inventory allocation in place based on the patterns that our customers are in, either how they're using the product, or frustrations they've had, or successes they've had. And we know through artificial intelligence and machine learning that there's a high probability that not only will they buy or use or expand their consumption of whatever they have of our product or service, but they will probably do these other things as well if we do those things. >> Predictive logic as opposed to procedural, yes, AI. >> And very much so. And so what's next will be bringing the modern applications on top of this that become very predictive and enabling, versus very procedural and post-transaction. We're a little ways downstream. That's looking out. >> That's next year's conference. >> That's probably next year's conference. >> Well, Rob, thank you so much for coming on theCUBE, it's always a pleasure to have you.
>> Thank you both for having us, and thank you for being here, and enjoy the summit. >> We're excited. >> Thank you. >> We'll do. >> I'm Rebecca Knight for Jim Kobielus. We will have more from DataWorks Summit just after this. (upbeat music)

Published Date : Jun 20 2018


Tim Vincent & Steve Roberts, IBM | DataWorks Summit 2018


 

>> Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018. Brought to you by Hortonworks. >> Welcome back everyone to day two of theCUBE's live coverage of DataWorks, here in San Jose, California. I'm your host, Rebecca Knight, along with my co-host James Kobielus. We have two guests on this panel today, we have Tim Vincent, he is the VP of Cognitive Systems Software at IBM, and Steve Roberts, who is the Offering Manager for Big Data on IBM Power Systems. Thanks so much for coming on theCUBE. >> Oh thank you very much. >> Thanks for having us. >> So we're now in this new era, this Cognitive Systems era. Can you set the scene for our viewers, and tell our viewers a little bit about what you do and why it's so important? >> Okay, I'll give a bit of a background first, because James knows me from my previous role, and you know I spent a lot of time in the data and analytics space. I was the CTO for Bob running the analytics group up 'til about a year and a half ago, and we spent a lot of time looking at what we needed to do from a data perspective and an AI perspective. And Bob, when he moved over to Cognitive Systems, Bob Picciano who's my current boss, Bob asked me to move over and really start helping build out more of a software focus, and more of an AI focus, and a workload focus on how we think of the Power brand. So we spent a lot of time on that. So when you talk about cognitive systems or AI, what we're really trying to do is think about how you actually couple a combination of software and hardware, so co-optimize the software space and the hardware space, specific to what's needed for AI systems. Because the act of processing, the data processing, the algorithmic processing for AI is very, very different than what you would have for a traditional data workload. So we're spending a lot of time thinking about how you actually co-optimize those systems, so you can actually build a system that's really optimized for the demands of AI. >> And is this driven by customers, is this driven by just a trend that IBM is seeing? I mean how are you, >> It's a combination of both. So a lot of this is, you know, there's a lot of thought put into this before I joined the team. So there was a lot of good thinking from the Power brand, but it was really foresight on things like Moore's Law coming to the end of its lifecycle, right, and the ramifications of that. And at the same time, as you start getting into things like neural nets and the floating point operations that you need to drive a neural net, it was clear that we were hitting the boundaries. And then there's new technologies, such as what Nvidia produces with their GPUs, that are clearly advantageous. So there were a lot of trends coming together that the technical team saw, and at the same time we were seeing customers struggling with specific things. You know, how do you actually build a model if the training time is going to be weeks and months, let alone hours? And one of the scenarios I like to think about, I'm probably showing my age a bit, but I went to a school called University of Waterloo, and when I went to school, in my early years, they had a batch based system for compilation and systems runs. You'd sit in the lab at night and you'd submit a compile job, and the compile job would say, okay it's going to take three hours to compile the application, and you think of the productivity hit that has on you.
And now you start thinking about, okay you've got this new skill in data scientists, which is really, really hard to find, they're very, very valuable. And you're giving them systems that take hours and weeks to do what they need to do. And you know, so they're trying to drive these models and get a high degree of accuracy in their predictions, and they just can't do it. So there's foresight on the technology side and there's clear demand on the customer side as well. >> Before the cameras were rolling you were talking about how the terms data scientist and app developer are used interchangeably, and that's just wrong. >> And actually, let's hear it, 'cause I hold this whole position myself; I agree with it. I think it's the right framework. Data science is a team sport, but application development is an even larger team sport in which data scientists and data engineers play a role. So, yeah, we want to hear your ideas on the broader application development ecosystem, and where data scientists, and data engineers, and so on, fall into that broader spectrum. And then how IBM is supporting that entire new paradigm of application development, with your solution portfolio including, you know, Power, AI on Power? >> So I think you used the words collaboration and team sport, and data science is a collaborative team sport. But you're 100% correct, there's also a piece, and I think it's missing to a great degree today, and it's probably limiting the actual value of AI in the industry, and that's how the data scientists and the application developers interact with each other. Because if you think about it, one of the models I like to think about is a consumer-producer model. Who consumes things and who produces things? And basically the data scientists are producing a specific thing, which is, you know, simply an AI model, >> Machine models, deep-learning models. >> Machine learning and deep learning, and the application developers are consuming those things and then producing something else, which is the application logic which is driving your business processes, in this view. So they've got to work together. But there's a lot of confusion about who does what. You know, you see people who say data scientists should build application logic, and you know, the number of data scientists who can do that, it exists, but it's not where the value they bring to the equation is. And application developers developing AI models, you know they exist, but it's not the most prevalent form, in fact. >> But you know it's kind of unbalanced, Tim, in the industry discussion of these role definitions. Quite often the traditional, you know, definition or sculpting of data scientist is that they know statistical modeling, plus data management, plus coding, right? But you never hear the opposite, that coders somehow need to understand how to build statistical models and so forth. Do you think that the coders of the future will at least on some level need to be conversant with the practices of building, and tuning, or training the machine learning models, or no? >> I think it'll absolutely happen. And I will actually take it a step further, because again the data scientist skill is hard for a lot of people to find. >> Yeah. >> And as such it's a very valuable skill.
And what we're seeing, and one of the offerings that we're putting out, is something called PowerAI Vision, and it takes it up another level above the application developer, which is how do you actually really unlock the capabilities of AI for the business persona, the subject matter expert. So in the case of vision, how do you actually allow somebody to build a model without really knowing what a deep learning algorithm is, what kind of neural nets to use, how to do data preparation. So we built a tool set which is, you know, effectively a SME tool set, which allows you to automatically label, it actually allows you to tag and label images, and then as you're tagging and labeling images it learns from that and actually helps automate the labeling of the images. >> Is this distinct from Data Science Experience on the one hand, which is geared towards the data scientists, and I think Watson Analytics among your tools, which is geared towards the SME? Is this a third tool, or an overlap? >> Yeah, this is a third tool, which is really again one of the co-optimized capabilities that I talked about; it's a tool that we built out that really is leveraging the combination of what we do in Power, the interconnect which we have with the GPUs, which is the NVLink interconnect, which gives us basically a 10X improvement in bandwidth between the CPU and GPU. That allows you to actually train your models much more quickly, so we're seeing about a 4X improvement over competitive technologies that are also using GPUs. And if we're looking at machine learning algorithms, we've recently come out with some technology we call Snap ML, which allows you to push machine learning, >> Snap ML, >> Yeah, it allows you to push machine learning algorithms down into the GPUs, and this is, we're seeing about a 40 to 50X improvement over traditional processing. So it's coupling all these capabilities, but really allowing a business persona to do something specific, which is to build out AI models to do recognition on either images or videos. >> Is there a pre-existing library of models in the solution that they can tap into? >> Basically it allows, it has a- >> Are they pre-trained? >> No, they're not pre-trained models, that's one of the differences in it. It actually has a set of models that it picks for you, and actually so- >> Oh yes, okay. >> So this is why it helps the business persona, because it's helping them with labeling the data. It's also helping select the best model. It's doing things under the covers to optimize things like hyper-parameter tuning, but you know the end-user doesn't have to know about all these things, right? So you're trying to lift, and it comes back to your point on application developers, it allows you to lift the barrier for people to do these tasks. >> Even for professional data scientists, there may be a vast library of models, and they don't necessarily know what is the best fit for the particular task. Ideally you should have, the infrastructure should recommend and choose, under various circumstances, the models, and the algorithms, the libraries, whatever, for you, for the task, great. >> One extra feature of PowerAI Enterprise is that it does include a way to do a quick visual inspection of a model's accuracy with a small data sample before you invest in scaling over a cluster or large data set. So you can get a visual indicator as to whether the model's moving towards accuracy, or you need to go and test an alternate model.
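The assisted-labeling loop Vincent describes for PowerAI Vision can be sketched abstractly: a subject matter expert labels a small seed set, a model trains on it, and the model's confident predictions pre-label the remainder, with the human resolving only the uncertain cases. Every function below is a hypothetical placeholder, not the product's API.

    # Conceptual sketch only; label_by_hand/train/predict are stand-ins.
    def assisted_labeling(images, label_by_hand, train, predict, threshold=0.9):
        labeled = [(img, label_by_hand(img)) for img in images[:20]]  # SME seed set
        model = train(labeled)
        for img in images[20:]:
            label, confidence = predict(model, img)
            if confidence >= threshold:
                labeled.append((img, label))               # auto-accepted pre-label
            else:
                labeled.append((img, label_by_hand(img)))  # human handles hard cases
        return train(labeled), labeled                     # retrain on the full set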
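On the Snap ML point, the pitch is that its estimators follow a scikit-learn-style fit/predict interface, so GPU-accelerated training is close to a drop-in swap. The sketch below assumes the snapml Python package and its use_gpu flag as I understand them; verify the class names and parameters against the version you have installed.

    # Hedged sketch of GPU-side training with Snap ML's sklearn-like API.
    import numpy as np
    from snapml import LogisticRegression  # assumed import path

    X = np.random.rand(100_000, 50).astype(np.float32)
    y = (np.random.rand(100_000) > 0.5).astype(np.float32)

    clf = LogisticRegression(use_gpu=True)  # push the solver onto the GPU
    clf.fit(X, y)
    print(clf.predict(X[:5]))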
>> So it's like a dashboard, of like Gini coefficients and all that stuff, okay. >> Exactly, it gives you a snapshot view. And the other thing I was going to mention, you guys talked about application development, data scientists, and of course a big message here at the conference is, you know, data science meets big data, and the work that Hortonworks is doing involving the notion of container support in YARN, GPU awareness in YARN, bringing Data Science Experience, which can include the PowerAI capability that Tim was talking about, as a workload tightly coupled with Hadoop. And this is where our Power servers are really built, not for just a monolithic building block that always has the same ratio of compute and storage, but fit for purpose servers that can address either GPU optimized workloads, providing the bandwidth enhancements that Tim talked about with the GPU, but also storage-dense servers that can now support two terabytes of memory, double the overall memory bandwidth on the box, 44 cores that can support up to 176 threads for parallelization of Spark workloads, SQL workloads, distributed data science workloads. So it's really about choosing the combination of servers that can really mix this evolving workload need, 'cause Hadoop isn't now just MapReduce, it's a multitude of workloads that you need to be able to mix and match, and bring various capabilities to the table for compute, and that's where Power8, now Power9, has really been built for this kind of combination of workloads, where you can add acceleration where it makes sense, add big data, smaller core, smaller memory, where it makes sense, pick and choose. >> So Steve, at this show, at DataWorks 2018 here in San Jose, the prime announcement, the partnership announced between IBM and Hortonworks, was IHAH, which I believe is IBM Hosted Analytics with Hortonworks. What I want to know is, that solution, I mean it runs on top of HDP 3.0 and so forth, is there any tie-in from an offering management standpoint between that and PowerAI, so you can build models in the PowerAI environment and then deploy them out in conjunction with IHAH? Going forward, I mean, I just wanted to get a sense of whether those kinds of integrations are coming. >> Well, the same data science capability, Data Science Experience, whether you choose to run it in the public cloud, or run it in a private cloud, or on-prem, it's the same data science package. You know, PowerAI has a set of optimized deep-learning libraries that can provide advantage on Power when you choose to run those deployments on our Power systems, alright, so we can provide additional value in terms of these optimized libraries, these memory bandwidth improvements. So really it depends upon the customer requirements and whether a Power foundation would make sense in some of those deployment models. I mean, for us here with Power9 we've recently announced a whole series of Linux Power9 servers. That's our latest family, including, as I mentioned, storage dense servers. The one we're showcasing on the floor here today, along with GPU rich servers. We're releasing fresh reference architectures. It's really to support combinations of clustered models that can, as I mentioned, be fit for purpose for the workload, to bring data science and big data together in the right combination. And we're working towards cloud models as well that can support mixing Power in ICP with big data solutions as well. >> And before we wrap, we just wanted to-
I think in the reference architecture you describe, I'm excited about the fact that you've commercialized distributed deep-learning for the growing number of instances where you're going to build containerized AI and distribute pieces of it across this multi-cloud; you need the underlying middleware fabric to allow all those pieces to play together into some larger applications. So I've been following DDL because your research lab has been posting information about that, you know, for quite a while. So I'm excited that you guys have finally commercialized it. I think IBM does a really good job of commercializing what comes out of the lab, like with Watson. >> Great, well, a good note to end on. Thanks so much for joining us. >> Oh thank you. Thank you for the, >> Thank you. >> We will have more from theCUBE's live coverage of DataWorks coming up just after this. (bright electronic music)

Published Date : Jun 20 2018


Mike McNamara, NetApp | DataWorks Summit 2018


 

>> Live, from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018. Brought to you by Hortonworks. >> Welcome back everyone to theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, Rebecca Knight, along with my cohost James Kobielus. We are joined by Mike McNamara, he is Senior Product and Solutions Marketing at NetApp. Thanks so much for coming on theCUBE. >> Thanks for having me. >> You're a first timer, >> Yes, >> So this is very exciting! >> Happy to be here. >> Welcome. >> Thanks. >> So, before the cameras were rolling, we were talking about how NetApp has been in this space for a while, but is really just starting to be recognized as a player. So, talk a little bit about your company's evolution. >> Sure. So, the whole analytics space is something NetApp was in a long time ago, and then sort of got out of, and then over the last several years we've gotten back in, and we recognize it's a huge opportunity for data storage and data management; if you look at IDC data, it's a massive, massive market. But the opportunity for us is, like, you know what, they're mainly using a direct attached storage model where compute and storage are tied together. And now, with data just exploding and growing like crazy, it's always been growing, but now it seems like it's just growing like crazy, and customers wanting to have data on-prem but also being able to move it off to the cloud, we're like, hey, this is a great opportunity for us to come in with an external storage solution that can show them the benefits of having something more reliable, and an opportunity to move their data off to the cloud; we've got great solutions for that. So it's gone well, but it's been a little bit different, like at this show, a lot of the people, the data scientists, data engineers, some who know us, some still don't, like, so, NetApp, what do you guys do? And so it's a little bit of an education, 'cause it's not a traditional buyer, if you will. We look at them as influencers, but it's a different influencer than we traditionally have sold to, say a Vice President of Infrastructure, as an example, or maybe a Director of Storage Admin, but most of those folks are not here, so this is just kind of a new market for us that we're making inroads into. >> How do data scientists, or do they, influence the purchase of storage solutions, or data management solutions? >> Sure, so they want to have access to the data, they want to be able to analyze it quickly and effectively, they want to make sure it's always available, you know, at their fingertips so to speak. We can help them by giving them very fast, very reliable solutions, and especially with our software, they want to, for example, do some virtual clone of that data, and just do some testing on that without impacting their production data; we can do that in a snap, so we can make their lives a lot easier, so we can show them how, hey, mister data scientist, we can make your life a little easier-- >> Or miss data scientist. >> Or miss, we were talking about that, >> There are a lot of women in this field. >> Yeah, yeah. >> More than we realize, and they're great. >> So we can help you do your job better, and then he or she can then influence who's making the purchase decisions. >> Yeah, training sets, test sets, validation sets of data for the machine learning and analytics development pipeline, yes, you need a solid storage infrastructure to do it right. >> Absolutely.
So, when you're getting inside the head of your potential buyer here, the VP of Infrastructure, or data admin, what is it that you're hearing from those people most, what are their concerns, what keeps them up at night, and where do you come in? >> Yeah, so one of the concerns is, often times it's, hey, do you have cloud storage, are you connected to the cloud, you know, I'm doing things on-prem now, but is there a path? So that's a big one. And we, NetApp, pride ourselves on being the most cloud-connected all-flash storage in the industry. So, that's a big focus, a big push for us. If you saw our marketing, it shows data authority for the hybrid cloud, and we really honestly do that; whether it's with Google, or Azure, or AWS, we know our software runs in those environments, it also runs on-premises, and because it's the same ONTAP software, we can move data between those environments. So we've got a real good story there: we can, you know, boom, check the box, we've got you covered if you want to utilize the cloud. And I think the next piece of that is just protecting the data; you know, again, as I said, data is just growing so much, I want to make sure it's always available, and we can back it up and all that, and that's been a core, core strength, versus a lot of these traditional solutions they've been using, these direct attached models, they just don't have anywhere near the enterprise-grade data protection that NetApp has always prided itself on, over many decades now. And so we can help them do that. And quite honestly, a lot of people think, well you know, you guys are external storage, how do you compare versus direct attached storage on total cost? That's another one. I can tell you definitively, and we've got data to back it up, from a total cost of ownership point of view, because of the advantages we bring from up-time, and you know, from RAID. But you know, in a Hadoop environment, often times there's three copies of data. With our solution, thanks to a good piece of software, there's only one copy of your data, so three versus one is a big saving, but even what we do with the data, compressing it and compacting it, brings a lot of benefits. So we do have, honest to goodness, upwards of 50% better total cost of ownership versus a DAS model. >> Do you use machine learning within your portfolio? I'm hearing of more stories, >> Great question, yeah. >> Incorporating machine learning to automate or facilitate more of the functions in the data protection or data management life-cycle. >> Yeah, that's a great question, and we do use, so we've got a piece of software which we call Active IQ, it was referred to as Ace Update, you may have, it may ring a bell, but to answer your question, so we've got thousands and thousands of NetApp systems out there, and for those customers that allow us, we have, think of it as kind of a call home feature, where we're getting data back from all our installed customers, and then we will go and do predictive analytics and do some machine learning on that data, so then we can go back to those customers and say, hey you know what, you've got this volume that's unprotected, you should protect this, or we can show them, if you were to move that data off into our cloud environment, here's maybe the performance you would see, so we do do a lot of that predictive-- >> Predictive performance assessment, it sounds like there's anomaly detection in there as well.
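The copies argument above is easy to make concrete. With Hadoop-style triple replication on DAS, 100 TB of data consumes 300 TB of raw disk; a single protected copy on external storage, with some compression, needs a fraction of that. The 1.5:1 efficiency ratio below is an assumed illustration, not a NetApp figure, and RAID parity overhead is ignored for simplicity.

    # Worked arithmetic behind the three-copies-versus-one claim.
    dataset_tb = 100

    das_raw_tb = dataset_tb * 3          # 3x replication on direct attached storage
    ext_raw_tb = dataset_tb / 1.5        # one copy, assumed 1.5:1 compression

    print(f"DAS raw capacity:      {das_raw_tb:.0f} TB")
    print(f"External raw capacity: {ext_raw_tb:.0f} TB")
    print(f"Capacity reduction:    {1 - ext_raw_tb / das_raw_tb:.0%}")
    # -> 300 TB vs ~67 TB, roughly a 78% capacity reduction before $/TB math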
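And as a toy version of the call-home analytics just described, the sketch below flags drives whose error telemetry drifts far from the fleet norm. The field names and the simple z-score rule are hypothetical illustrations; Active IQ's actual models are not disclosed in the interview.

    # Illustrative sketch: flag outlier drives in fleet-wide telemetry.
    from statistics import mean, stdev

    def flag_failing_drives(fleet, z_limit=3.0):
        errs = [d["read_errors"] for d in fleet]
        mu, sigma = mean(errs), stdev(errs)
        return [
            d["drive_id"] for d in fleet
            if sigma > 0 and (d["read_errors"] - mu) / sigma > z_limit
        ]

    fleet = [{"drive_id": f"d{i}", "read_errors": 2} for i in range(99)]
    fleet.append({"drive_id": "d99", "read_errors": 250})
    print(flag_failing_drives(fleet))  # -> ['d99']: ship a replacement proactively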
Anomaly detection as well, letting them know, hey, you know, it's time for this drive, it may fail on you, let's ship you out a new drive now before it happens, so yeah, a lot of, from an analytics standpoint, predictive analysis going on. And you know, it's a huge benefit to our customers. Huge benefit. >> I know you're also doing a push toward artificial intelligence, so I'd like to hear more about that, and then also, if there's any best practices that have emerged. >> Sure, sure, so yes. That is another big area, so it's kind of a logical progression from where we were, if you will, in the analytics space, data lakes, but now moving into artificial intelligence, which has always been around, but it's really taking on a more prominent role. I mean, just a quick fun fact, I read that, you know, at the royal wedding that recently happened, did you know that Amazon used artificial intelligence to help us, the TV viewer, identify who the guests were? >> Ooh. >> So, you know it's like, it's everywhere, right? And so for us, we see that trend, a ton of data that needs to be managed, and so we kind of look at it from the edge, to the core, to the cloud, those three, not pillars, but directional ways: taking data from IoT sensors at the edge, bringing it into the core, doing training, and then, if the customer so chooses, out to the cloud. So, yeah, it is a big push for us now, and we're doing a lot with Nvidia, which is a key partner of ours. >> Really? This is a bit futuristic, but I can see a role going forward for AI to look into large data volumes, like video objects, to find things like faces, and poses and gestures and so forth, and to use that intelligence to be able to reduce the data sets down, to de-duplicate, so that you can use less storage and then you can re-construct the original video objects or whatever going forward, I mean as a potential use of AI within storage efficiency. >> Yep, yeah you're right, and that again, like in the analytics space, how we roll out our in-line efficiency capabilities and data protection is, you know, very important, and then being able to move the data off into the cloud, if the customer so chooses, or just wants to use the cloud. So yeah, some of the same benefits from cloud connectivity, performance and efficiency that apply to analytics certainly apply to AI. You know, another fun fact too about AI, which might help us, you and I living in the Boston area, is that I've read IBM has a patent out to use AI in traffic signaling, in conjunction with cameras, so hopefully, you know, if that works well it could alleviate-- >> Lead them out of the Tip O'Neill tunnel easy. (laughing) >> You got it, maybe worse in D.C. (laughing) >> I'd like to hear though, if you have any best practices with this move into AI, how are you experimenting with it, and how are you finding it used most efficiently and effectively. >> Yeah, so I think one way, we are eating our own dog food, so to speak, in that we're using it internally, we're using it on our customers' data, as I was explaining, to help look at trends and do analysis. So that's one, and then it's other things, just you know, partnering with companies like Nvidia as well and coming out with a joint solution, so we're doing work with them on different solution areas. >> Great, great. Well, Mike, thanks so much for coming on theCUBE, >> Thanks for having me! >> It was fun having you. >> You survived! >> Yes! (laughs) >> We'll look forward to many more CUBE conversations.
>> Great to hear from NetApp, you're very much in the game. >> Indeed, indeed. >> Alright, thank you very much. >> I'm Rebecca Knight for James Kobielus, we will have more from theCUBE's coverage of DataWorks coming up in just a little bit. (electronic music)

Published Date : Jun 20 2018


Scott Gnau, Hortonworks | DataWorks Summit 2018


 

>> Live from San Jose, in the heart of Silicon Valley, it's theCUBE. Covering DataWorks Summit 2018. Brought to you by Hortonworks. >> Welcome back to theCUBE's live coverage of DataWorks Summit here in San Jose, California. I'm your host, Rebecca Knight, along with my cohost James Kobielus. We're joined by Scott Gnau, he is the chief technology officer at Hortonworks. Welcome back to theCUBE, Scott. >> Great to be here. >> It's always fun to have you on the show. So, you have really spent your entire career in the data industry. I want to start off at 10,000 feet, and just have you talk about where we are now, in terms of customer attitudes, in terms of the industry, in terms of where customers feel, how they're dealing with their data and how they're thinking about their approach in their business strategy. >> Well, I have to say, 30 plus years ago starting in the data field, it wasn't as exciting as it is today. Of course, I always found it very exciting. >> Exciting means nerve-wracking. Keep going. >> Or nerve-wracking. But you know, we've been predicting it. I remember even, you know, 10, 15 years ago, before big data was a thing, it's like, oh, all this data's going to come, and it's going to be, you know, 10x what it is. And we were wrong. It was like 5000x, you know, what it is. And I think the really exciting part is that data really used to be relegated, frankly, to big companies as a derivative work of ERP systems, and so on and so forth. And while that's very interesting, and certainly enabled a whole level of productivity for industry, when you compare that to all of the data flying around everywhere today, whether it be Twitter feeds, and even doing live polls, like we did in the opening session today, data is just being created everywhere. And the same thing applies to that data that applied to the ERP data of old, and that is: being able to harness, manage and understand that data is a new business-creating opportunity. And you know, we were with some analysts the other day, and I think one of the more quoted things that came out of that, when I was speaking with them, was really: like railroads and shipping in the 1800s and oil in the 1900s, data really is the wealth creator of this century. And so that creates a very nerve-wracking environment. It also creates an environment of very agile and very important technological breakthroughs that enable those things to be turned into wealth. >> So thinking about that, in terms of where we are at this point in time, on the main stage this morning someone likened it to the interstate highway system, which really revolutionized transportation, but also commerce. >> I love that actually. I may steal it in some of my future presentations. >> That's good, but we'll know where you pilfered it. >> Well, perhaps if data is oil, the edge, in containerized applications and piping data, you know, microbursts of data across the internet of things, is sort of like the new fracking. You know, being able to extract more of this precious resource from the territory. >> Hopefully not quite as damaging to the environment. >> Maybe not. I'm sorry, environmentalists, if I just offended you, I apologize. >> But I think, you know, all of those analogies are very true, and I particularly like the interstate one this morning. Because when I think about what we've done in our core HDP platform, and I know Arun was here talking about all the great advances that we built into this, the kind of the core Hadoop platform. Very traditional.
Store data, analyze data, but also bring in new kinds of algorithms, rapid innovation, and so on. That's really great, but that's kind of half of the story. In a device-connected world, in a consumer-centric world, capturing data at the edge, moving and processing data at the edge, is the new normal, right? And so just like the interstate highway system actually created new ways of commerce because we could move people and things more efficiently, moving data and processing data more efficiently is kind of the second part of the opportunity that we have in this new deluge of data. And that's really where we've been with our Hortonworks DataFlow. And really saying that the complete package of managing data from origination at the edge all the way through analytics to a decision that's triggered back at the edge is like the holy grail, right? And building a technology for that footprint is why I'm certainly excited today. It's not the caffeine, it's just the opportunity of making all of that work. >> You know, I think the key announcement for me at this show, that you guys made on HDP 3.0, was the containerization of more of the capabilities of your distributed environment, so that these capabilities, in terms of processing, and first of all capturing, analyzing, and moving that data, can be pushed closer to the endpoints. Can you speak a bit, Scott, about this new capability, this containerization support? Within HDP 3.0, but really in your broader portfolio, and where you're going with that in terms of addressing edge applications, perhaps autonomous vehicles, or you know, whatever you might put into a new smartphone or whatever you put at the edge. Describe the potential of containerization to sort of break this ecosystem wide open. >> Yeah, I think there are a couple of aspects to containerization, and by the way, we're like so excited about kind of the cloud-first, containerized HDP 3.0 that we launched here today. There's a lot of great tech that our customers have been clamoring for that they can take advantage of. And it's really just the beginning, which again is part of the excitement of being in the technology space and certainly being part of Hortonworks. So containerization affords a couple of things. Certainly, agility. Agility in deploying applications. So, you know, for 30 years we've built these enterprise software stacks that were very integrated, hugely complicated systems that could bring together multiple different applications, different workloads, and manage all that in a multi-tenancy kind of environment. And that was because we had to do that, right? Servers were getting bigger, they were more powerful, but not particularly well distributed. Obviously in a containerized world, you now turn that whole paradigm on its head and you say, you know what? I'm just going to collect these three microservices that I need to do this job. I can isolate them. I can have them run on a serverless technology. I can actually allocate servers in the cloud to go run them, and when they're done, they go away. And I don't pay for them anymore. So thinking about that from a software development, deployment, and implementation perspective, there are huge implications, but the real value for customers is agility, right? I don't have to wait until next year to upgrade my enterprise software stack to take advantage of this new algorithm. I can simply isolate it inside of a container, have it run, and have it go away. And get the answer, right?
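To make that "spin up, run, tear down" pattern concrete, here is a minimal sketch of submitting a containerized Spark job against a Kubernetes cluster from Python. The API server URL, image name, job class, and jar path are hypothetical placeholders; this illustrates the general pattern rather than Hortonworks' actual tooling.

```python
# A sketch of an ephemeral, containerized analytics job: executors exist only
# for the life of this run, then their pods are torn down and billing stops.
import subprocess

K8S_MASTER = "k8s://https://k8s-apiserver.example.com:443"  # hypothetical cluster
SPARK_IMAGE = "example/spark:2.3.0"                          # hypothetical image

subprocess.run([
    "spark-submit",
    "--master", K8S_MASTER,
    "--deploy-mode", "cluster",
    "--name", "ephemeral-analytics-job",
    "--class", "com.example.EphemeralJob",             # hypothetical job class
    "--conf", "spark.executor.instances=200",          # scale out just for this run
    "--conf", f"spark.kubernetes.container.image={SPARK_IMAGE}",
    "local:///opt/jobs/ephemeral-job.jar",             # job code baked into the image
], check=True)
# When the job completes, the driver and executor pods go away: "have it run,
# and have it go away", with no standing cluster to upgrade or pay for.
```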
And so when I think about, and a number of our keynotes this morning were talking about, just kind of the exponential rate of change, this is really the net new norm. Because the only way we can do things faster is in fact to be able to provide this. >> And it's not just microservices. Also orchestrating them through Kubernetes, and so forth, so they can be. >> Sure. That's the how versus, yeah. >> Quickly deployed as an ensemble and then quickly de-provisioned when you don't need them anymore. >> Yeah, so then there's obviously the cost aspect, right? >> Yeah. >> So if you're going to run a whole bunch of stuff, or even if you have something as mundane as a really big merge join inside of Hive. Let me spin up a thousand extra containers to go do that big thing, and then have them go away when it's done. >> And oh, by the way, you'll be deployed on. >> And only pay for it while I'm using it. >> And then you can possibly distribute those containers across different public clouds depending on what's most cost-effective at any point in time, Azure or AWS or whatever it might be. >> And as I teased with Arun, you know, the only thing that we haven't solved for is the speed of light, but we're working on it. >> Talking about this warp-speed change being the new norm, can you talk about some of the most exciting use cases you've seen, in terms of the customers and clients that are using Hortonworks in the coolest ways? >> Well, I mean, obviously autonomous vehicles is one that has captured all of our imaginations, 'cause we understand how that works. But it's a perfect use case for this kind of technology. But the technology also applies in fraud detection and prevention. It applies in healthcare management, in proactive personalized medicine delivery, and in generating better outcomes for treatment. So, you know, all across. >> It will bind us in every aspect of our lives, including the consumer realm increasingly, yeah. >> Yeah, all across the board. And you know, one of the things that really changed, right, is, well, a couple things. A lot of bandwidth, so you can start to connect these things. The devices themselves are particularly smart, so you no longer have to transfer all the data to a mainframe and then wait three weeks for your answer and then come back. You can have analytic models running on an edge device. And think about, you know, that is really real time. And that actually kind of solves for the speed of light, 'cause you're not waiting for those things to go back and forth. So there are a lot of new opportunities, and those architectures really depend on some of the core tenets of, ultimately, containerization, stateless application deployment and delivery. And they also depend on the ability to create feedback loops to do point-to-point and peer kinds of communication between devices. This is a whole new world of how data gets moved and how the decisions around data movement get made. And certainly that's what we're excited about, building with the core components. The other implication of all of this, and we've known each other for a long time. Data has gravity. Data movement's expensive. It takes time, frankly, you have to pay for the bandwidth and all that kind of stuff. So being able to play the data where it lies becomes a lot more interesting from an application portability perspective, and with all of these new sensors, devices, and applications out there, a lot more data is living its entire lifecycle in the cloud.
And so being able to create that connective tissue. >> Or being analytical out at the edge. >> And even on the edge. >> And with machine learning, let me just butt in a second. One of the areas that we're focusing on increasingly at Wikibon, in terms of our focus on machine learning at the edge, is that more and more machine learning frameworks are coming into the browser world. JavaScript for the most part, like TensorFlow.js. You know, more of this inferencing and training is going to happen inside your browser. That blows a lot of people's minds. It may not be heavy-hitting machine learning, but it'll be good enough for a lot of things that people do in their normal life, where you don't want to round-trip back to the cloud. It's all happening right there, in, you know, Chrome or whatever you happen to be using. >> Yeah, and so the point being now, you know, when I think about the early days, talking about scalability, I remember shipping my first one-terabyte database. And then the first 10-terabyte database. Yeah, it doesn't sound very exciting. When I think about scalability of the future, it's really going to, scalability is not going to be defined as petabytes or exabytes under management. It's really going to be defined as petabytes or exabytes affected across a grid of storage and processing devices. And that's a whole new technology paradigm, and really that's kind of the driving force behind what we've been building and what we've been talking about at this conference. >> Excellent. >> So when you're talking about these things, I mean, how much are the companies themselves prepared, and do they have the right kind of talent to use the kinds of insights that you're able to extract, and then act on them in real time? 'Cause you're talking about how this is saving a lot of the waiting-around time. So is this really changing the way business gets done, and do companies have the talent to execute? >> Sure. I mean, it's changing the way business gets done. We showed a quote on stage this morning from the CEO of Marriott, right? So, I think there are a couple of pieces. One is businesses are increasingly data-driven, and business strategy is increasingly the data strategy. And so it starts from the top, kind of setting that strategy and understanding the value of that asset and how that needs to be leveraged to drive new business. So that's kind of one piece. And you know, obviously there are more and more folks kind of coming to the realization that that is important. The other thing that's been helpful is, you know, as with any new technology, there's always kind of the startup shortage of resources, and people start to spool up and learn. You know, the really good news, and for the past 10 years I've been working with a number of different university groups, parents are actually going to universities and demanding that the curriculum include data, and processing, and big data, and all of these technologies. Because they know that with their children educated in that kind of a world, number one, they're going to have a fun job to go to every day, 'cause it's going to be something different every day. But number two, they're going to be employed for life. (laughing) >> Yeah. >> They will be solvent. >> Frankly, the demand has actually created a catch-up in supply that we're seeing. And of course, you know, as tools start to get more mature and more integrated, they also become a little bit easier to use. You know, there's a little bit easier deployment and so on.
So a combination of, I'm seeing a really good supply there, really. Obviously we invest in education through the community, and then frankly, the education system itself, and folks saying this is really the hot job of the next century. You know, I can be the new oil baron, or I can be the new railroad captain. It's actually creating more supply, which is also very helpful. >> Data's the heart of what I call the new STEM cell. It's science, technology, engineering, mathematics that you want to implant in the brains of the young as soon as possible. I hear ya. >> Yeah, absolutely. >> Well, Scott, thanks so much for coming on. But I want to first also, we can't let you go without the fashion statement. You arrived on set wearing it. >> The elephants. >> I mean, it was quite a look. >> Well, I did it because then you couldn't see I was sweating on my brow. >> Oh please, no, no, no. >> 'Cause I was worried about this tough interview. >> You know, one of the things I love about your logo, and I'll just, you know, sound like I'm fawning. The elephant is a very intelligent animal. >> It is indeed. >> My wife's from Indonesia. I remember going back one time, they had Asian elephants at one of these safari parks. And watching it perform, and my son was very little then. The elephant is a very sensitive, intelligent animal. You don't realize 'till you're up close. They pick up all manner of social cues. I think it's an awesome symbol for a company that's all about data-driven intelligence. >> The elephant never forgets. >> Yeah. >> That's what we know. >> That's right, we never forget. >> He won't forget, 'cause he's got a brain. Or she, I'm sorry. He or she has a brain. >> And it's data-driven. >> Yeah. >> Thanks very much. >> Great. Well, thanks for coming on theCUBE. I'm Rebecca Knight for James Kobielus. We will have more coming up from DataWorks just after this. (upbeat music)
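As a footnote to the edge-analytics thread above: running a model on the device itself, instead of round-tripping to a data center, can be sketched with TensorFlow Lite's Python interpreter. The model file, input shape, and sensor values here are assumptions for illustration; the in-browser TensorFlow.js case Kobielus mentions follows the same pattern in JavaScript.

```python
# A sketch of on-device inference: the analytic model runs where the data is
# generated, so no network hop is needed to get an answer.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="sensor_model.tflite")  # hypothetical model
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

reading = np.array([[0.42, 0.13, 0.88]], dtype=np.float32)  # assumed 1x3 input shape
interpreter.set_tensor(inp["index"], reading)
interpreter.invoke()                                  # inference happens locally
score = interpreter.get_tensor(out["index"])
print("local decision score:", score)                 # act at the edge, immediately
```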

Published Date : Jun 20 2018


Ram Venkatesh, Hortonworks & Sudhir Hasbe, Google | DataWorks Summit 2018


 

>> Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018. Brought to you by Hortonworks. >> We are wrapping up Day One of coverage of DataWorks here in San Jose, California on theCUBE. I'm your host, Rebecca Knight, along with my co-host, James Kobielus. We have two guests for this last segment of the day. We have Sudhir Hasbe, who is the director of product management at Google, and Ram Venkatesh, who is VP of Engineering at Hortonworks. Ram, Sudhir, thanks so much for coming on the show. >> Thank you very much. >> Thank you. >> So, I want to start out by asking you about a joint announcement that was made earlier this morning about using some Hortonworks technology deployed onto Google Cloud. Tell our viewers more. >> Sure, so basically what we announced was support for the Hortonworks Data Platform and Hortonworks DataFlow, HDP and HDF, running on top of the Google Cloud Platform. So this includes deep integration with Google's cloud storage connector layer, as well as a certified distribution of HDP to run on the Google Cloud Platform. >> I think the key thing is a lot of our customers have been telling us they like the familiar environment of the Hortonworks distribution that they've been using on-premises, and as they look at moving to cloud, like in GCP, Google Cloud, they want the similar, familiar environment. So, they want the choice to deploy on-premises or on Google Cloud, but they want the familiarity of what they've already been using with Hortonworks products. So this announcement actually helps customers pick and choose, like whether they want to run the Hortonworks distribution on-premises, they want to do it in cloud, or they want to build this hybrid solution where the data can reside on-premises, can move to cloud, and build this common, hybrid architecture. So, that's what this does. >> So, HDP customers can store data in the Google Cloud. They can execute ephemeral workloads, analytic workloads, machine learning in the Google Cloud. And there's some tie-in between Hortonworks's real-time or low-latency or streaming capabilities from HDF in the Google Cloud. So, could you describe, at a full sort of detail level, the degrees of technical integration between your two offerings here? >> You want to take that? >> Sure, I'll handle that. So, essentially, deep in the heart of HDP, there's the HDFS layer that includes a Hadoop-compatible file system, which is a pluggable file system layer. So, what Google has done is they have provided an implementation of this API for the Google Cloud Storage Connector. So this is the GCS Connector. We've taken the connector and we've actually continued to refine it to work with our workloads, and now Hortonworks is actually bundling, packaging, and making this connector available as part of HDP. >> So bilateral data movement between them? Bilateral workload movement? >> No, think of this as being very efficient when our workloads are running on top of GCP. When they need to get at data, they can get at data that is in the Google Cloud Storage buckets in a very, very efficient manner. So, since we have fairly deep expertise on workloads like Apache Hive and Apache Spark, we've actually done work in these workloads to make sure that they can run efficiently, not just on HDFS, but also on the cloud storage connector. This is a critical part of making sure that the architecture is actually optimized for the cloud.
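To illustrate what that pluggable file system layer means from a job's point of view, here is a minimal PySpark sketch, assuming the GCS connector JAR is already on the classpath. The bucket, paths, and keyfile location are hypothetical; the point is that only the URI scheme changes relative to HDFS.

```python
# A sketch of Spark work reading directly from Google Cloud Storage through
# the GCS connector's Hadoop-compatible file system implementation.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hdp-on-gcp-sketch")
         .config("spark.hadoop.fs.gs.impl",
                 "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
         .config("spark.hadoop.google.cloud.auth.service.account.enable", "true")
         .config("spark.hadoop.google.cloud.auth.service.account.json.keyfile",
                 "/etc/security/gcp-key.json")        # hypothetical credentials path
         .getOrCreate())

# The same DataFrame code that runs against hdfs:// paths on-prem:
events = spark.read.parquet("gs://example-bucket/warehouse/events/")
events.groupBy("event_type").count().show()
```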
So, at our scale, and as our customers are moving their workloads from on-premise to the cloud, it's not just functional parity, but they also need sort of the operational and the cost efficiency that they're looking for as they move to the cloud. So, to do that, we need to enable this fundamental disaggregated storage pattern. See, on-prem, the big win with Hadoop was we could bring the processing to where the data was. In the cloud, we need to make sure that we work well when storage and compute are disaggregated and they're scaled elastically, independent of each other. So this is a fairly fundamental architectural change. We want to make sure that we enable this in a first-class manner. >> I think that's a key point, right. I think what cloud allows you to do is scale the storage and compute independently. And so, with storing data in Google Cloud Storage, you can, like, scale that horizontally and then just leverage that as your storage layer, and the compute can independently scale by itself. And what this is allowing customers of HDP and HDF to do is store the data on GCP, on the cloud storage, and then just use the scale, the compute side of it, with HDP and HDF. >> So, if you'll indulge me to name another Hortonworks partner, for just a hypothetical. Let's say one of your customers is using IBM Data Science Experience to do TensorFlow modeling and training. Can they then, inside of HDP on GCP, use the compute infrastructure inside of GCP to do the actual modeling, which is more compute-intensive, and then the separate, decoupled storage infrastructure to do the training, which is more storage-intensive? Is that a capability that would be available to your customers, with this integration with Google? >> Yeah, so where we are going with this is we are saying, IBM DSX and other solutions that are built on top of HDP, they can transparently take advantage of the fact that they have HDP compute infrastructure to run against. So, you can run your machine learning training jobs, you can run your scoring jobs, and you can have the same unmodified DSX experience whether you're running against an on-premise HDP environment or an in-cloud HDP environment. Further, that's sort of the benefit for partners and partner solutions. From a customer standpoint, the big value prop here is that customers are used to securing and governing their data on-prem in their particular way with HDP, with Apache Ranger, Atlas, and so forth. So, when they move to the cloud, we want this experience to be seamless from a management standpoint. So, from a data management standpoint, we want all of their learning from a security and governance perspective to apply when they are running in Google Cloud as well. So, we've had this capability on Azure and on AWS, so with this partnership, we are announcing the same type of deep integration with GCP as well. >> So Hortonworks is that one pane of glass across all your product partners for all manner of jobs. Go ahead, Rebecca. >> Well, I just wanted to ask about, we've talked about the reason, the impetus for this. With the customer, it's more familiar for customers, it offers the seamless experience. But can you delve a little bit into the business problems that you're solving for customers here? >> A lot of times, our customers are at various points on their cloud journey. For some of them, it's very simple. They're like, there's a broom coming by and the datacenter is going away in 12 months, and I need to be in the cloud.
So, this is where there is a wholesale movement of infrastructure from on-premise to the cloud. Others are exploring individual business use cases. So, for example, one of our large customers, a travel partner, they are exploring a new pricing model, and they want to roll out this pricing model in the cloud. They have on-premise infrastructure, and they know they'll have that for a while. They are spinning up new use cases in the cloud, typically for reasons of agility. So, typically, many of our customers operate large, multi-tenant clusters on-prem. That's nice, since it gives you very scalable compute for running large jobs. But if you want to run, for example, a new version of Spark, you have to upgrade the entire cluster before you can do that. Whereas in this sort of model, what they can say is, they can bring up a new workload and just have the specific versions and dependencies that it needs, independent of all of their other infrastructure. So this gives them agility, where they can move as fast as... >> Through the containerization of the Spark jobs or whatever. >> Correct, and so containerization, as well as even spinning up an entire new environment. Because, in the cloud, given that you have access to elastic compute resources, they can come and go. So, your workloads are much more independent of the underlying cluster than they are on-premise. And this is where sort of the core business benefits around agility, speed of deployment, things like that come into play. >> And also, if you look at the total cost of ownership, really take an example where customers are collecting all this information through the month. And, at month end, you want to do the closing of the books. And so that's a great example where you want ephemeral workloads. So this is like, do it once a month, finish the books, and close the books. That's a great scenario for cloud, where you don't have to create an infrastructure on-premises and keep it ready. So that's one example where now, in the new partnership, you can collect all the data on-premises if you want throughout the month, but then move that and leverage cloud to go ahead and scale, do this workload, and finish the books and all. That's one. The second example I can give is, a lot of customers run their e-commerce platforms and all on-premises, let's say they're running it. They can still connect all these events through HDP that may be running on-premises with Kafka, and then, what you can do is, in-cloud, in GCP, you can deploy HDP, HDF, and you can use the HDF from there for real-time stream processing. So, collect all these clickstream events, use them, make decisions like, hey, which products are selling better? Should we go ahead and give? How many people are looking at that product? Or how many people have bought it? That kind of aggregation and real-time at scale, you can now do in-cloud and build these hybrid architectures that are there. And enable scenarios where, in the past, to do that kind of stuff, you would have to procure hardware, deploy hardware, all of that, which all goes away. In-cloud, you can do that much more flexibly and just use whatever capacity you have.
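A minimal sketch of that clickstream scenario, assuming events already land in a Kafka topic: the topic name, broker address, and message shape are hypothetical, and in the architecture described above HDF would be doing the stream processing rather than a hand-rolled consumer.

```python
# A sketch of real-time clickstream aggregation: keep a running count of
# product views as events arrive, answering "how many people are looking
# at that product?" continuously instead of at batch time.
import json
from collections import Counter
from kafka import KafkaConsumer  # kafka-python package

consumer = KafkaConsumer(
    "clickstream",                                    # hypothetical topic
    bootstrap_servers=["broker-1.example.com:9092"],  # hypothetical broker
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

views = Counter()
for msg in consumer:
    event = msg.value                # assumed shape: {"product": "p-123", "action": "view"}
    if event.get("action") == "view":
        views[event["product"]] += 1
        print(views.most_common(5))  # the current top products, updated per event
```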
You build a TensorFlow model, or maybe a model in Caffe or whatever, and you deploy it out to a cluster, and so the life of a data scientist is often nothing but a stream of new tasks that are all ephemeral in their own right, but are part of an ongoing experimentation program, where, you know, they're building and testing assets that may or may not be deployed in the production applications. So I can see a clear need for that capability of this announcement in lots of working data science shops in the business world. >> Absolutely. >> And I think, coming down to it, if you really look at the partnership, right, there are two or three key areas where it's going to have a huge advantage for our customers. One is analytics at scale at a lower cost, like total cost of ownership, reducing that, running at-scale analytics. That's one of the big things. Again, as I said, the hybrid scenarios. Most enterprise customers have huge deployments of infrastructure on-premises, and that's not going to go away. Over a period of time, leveraging cloud is a priority for a lot of customers, but they will be in these hybrid scenarios. And what this partnership allows them to do is have these scenarios that can span across cloud and the on-premises infrastructure that they are building, and get business value out of all of these. And then, finally, we at Google believe that the world will be more and more real-time over a period of time. Like, we already are seeing a lot of these real-time scenarios with IoT events coming in and people making real-time decisions. And this is only going to grow. And this partnership also provides the whole streaming analytics capability in-cloud, at scale, for customers to build these hybrid plus also real-time streaming scenarios with this package. >> Well, it's clear from Google what the Hortonworks partnership gives you in this competitive space, in the multi-cloud space. It gives you that ability to support hybrid cloud scenarios. You're one of the premier public cloud providers, as we all know. And clearly, now that you've had the Hortonworks partnership, you have that ability to support those kinds of highly hybridized deployments for your customers, many of whom I'm sure have those requirements. >> That's perfect, exactly right. >> Well, a great note to end on. Thank you so much for coming on theCUBE. Sudhir, Ram, thank you so much. >> Thank you, thanks a lot. >> Thank you. >> I'm Rebecca Knight for James Kobielus. We will have more tomorrow from DataWorks. We will see you tomorrow. This is theCUBE signing off. >> From sunny San Jose. >> That's right.

Published Date : Jun 20 2018


Stephanie McReynolds, Alation | DataWorks Summit 2018


 

>> Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018, brought to you by Hortonworks. >> Welcome back to theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, Rebecca Knight, along with my co-host, James Kobielus. We're joined by Stephanie McReynolds. She is the Vice President of Marketing at Alation. Thanks so much for returning to theCUBE, Stephanie. >> Thank you for having me again. >> So, before the cameras were rolling, we were talking about Kevin Slavin's talk on the main stage this morning, and talking about, well, really, the background to sort of this concern about AI and automation coming to take people's jobs. But really, his overarching point was that we shouldn't let the algorithms take over, and that humans actually are an integral piece of this loop. So, riff on that a little bit. >> Yeah, what I found fascinating about what he presented were actual examples where having a human in the loop of AI decision-making had a more positive impact than just letting the algorithms decide for you, and turning it into kind of a black, a black box. And the issue is not so much that, you know, there are very few cases where the algorithms make the wrong decision. What happens the majority of the time is that the algorithms actually can't be understood by humans. So if you have to roll back >> They're opaque, yeah. >> in your decision-making, or uncover it, >> I mean, who can crack what a convolutional neural network does, layer by layer? Nobody can. >> Right, right. And so, his point was, if we want to avoid not just poor outcomes, but also make sure that the robots don't take over the world, right, which is where every, like, media person goes first, right? (Rebecca and James laugh) That you really need a human in the loop of this process. And a really interesting example he gave was what happened with the 2015 storm. He talked about 16 different algorithms that do weather predictions, and only one algorithm mis-predicted that there would be a huge weather storm on the east coast. So if there had been a human in the loop, we wouldn't have, you know, caused all this crisis, right? The human could've >> And this is the storm >> Easily seen. >> That shut down the subway system, >> That's right. That's right. >> And really canceled New York City for a few days there, yeah. >> That's right. So I find this pretty meaningful, because Alation is in the data cataloging space, and we have a lot of opportunity to take technical metadata and automate the collection of technical and business metadata and do all this stuff behind the scenes. >> And you make the discovery of it, and the analysis of it. >> We do the discovery of this, leading to actual recommendations to users of data that you could turn into automated analyses or automated recommendations. >> Algorithmic, algorithmically augmented human judgment is what it's all about, the way I see it. What do you think? >> Yeah, but I think there's a deeper insight that he was sharing, which is that it's not just human judgment that is required, but for humans to actually be in the loop of the analysis as it moves from stage to stage, so that we can try to influence or at least understand what's happening with that algorithm. And I think that's a really interesting point.
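In code, the simplest version of that human-in-the-loop idea is a confidence gate: the algorithm acts on its own only when it is sure, and everything doubtful is routed to a person. The sketch below is a toy illustration with a stub model and an assumed review queue, not anything Alation ships.

```python
# A toy human-in-the-loop gate: confident predictions flow through
# automatically; low-confidence ones queue up for human review, so a single
# outlier model can't, say, shut down a city on its own.
import queue

REVIEW_THRESHOLD = 0.90
human_review = queue.Queue()

def stub_model(record):
    """Stand-in for a real classifier; returns (label, confidence)."""
    return ("storm", 0.55 if record["source"] == "model-9" else 0.97)

def classify(record):
    label, confidence = stub_model(record)
    if confidence >= REVIEW_THRESHOLD:
        return label                  # automated decision path
    human_review.put(record)          # doubtful case goes to a person
    return None

print(classify({"source": "model-1"}))  # confident -> automated label
print(classify({"source": "model-9"}))  # doubtful  -> None, queued for review
```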
You know, there are a number of data cataloging vendors. Some analysts will say there's anywhere from 10 to 30 different vendors in the data cataloging space, and as vendors, we kind of have this debate. Some vendors have more advanced AI and machine learning capabilities, and other vendors haven't automated at all. And I think that the answer, if you really want humans to adopt analytics and to be comfortable with the decision-making of those algorithms, is that you need to have a human in the loop, in the middle of that process, not only making the decision, but actually managing the data that flows through these systems. >> Well, algorithmic transparency and accountability is an increasing requirement. It's a requirement for GDPR compliance, for example. >> That's right. >> That I don't see yet. At Wikibon, we don't see a lot of solution providers offering solutions to enable more of an automated roll-up of a narrative of an algorithmic decision path. But that clearly is a capability that's coming along, and it will absolutely depend on a big data catalog managing the data, the metadata, but also helping to manage the tracking of what models were used to drive what decision, >> That's right. >> and in what scenario. So that, that plays into what Alation >> So we talk, >> and others in your space do. >> We call that a data catalog, almost as if the data's the only thing that we're tracking, but in addition to that metadata or the data itself, you also need to track the business semantics, how the business is using or applying that data, and that algorithmic logic. So that might be logic that's just being used to transform that data, or it might be logic to actually make and automate a decision, like what they're talking about with GDPR. >> It's a data artifact catalog. These are all artifacts that are derived in many ways, or that supplement and complement the data. >> That's right. >> They're all, it's all the logic, like you said. >> And what we talk about is, how do you create transparency into all those artifacts, right? So, a catalog starts with this inventory that creates a foundation for transparency, but if you don't make those artifacts accessible to a business person, who might not understand what is metadata, what is a transformation script... If you can't make those artifacts accessible to, what I consider a real, or normal, human being, right, (James laughs) I love to geek out, but, (all laugh) at some point, not everyone is going to understand. >> She's the normal human being on this team. >> I'm normal. I'm normal. >> I'm the abnormal human being among the questioners here. >> So, yeah, most people in the business are just getting their arms around how do we trust the output of analytics, how do we understand enough statistics to know what to apply to solve a business problem or not, and then we give them this, like, hairball of technical artifacts and say, oh, go at it. You know, here's your transparency. >> Well, I want to ask about that, that human that we're talking about, that needs to be in the loop at every stage. Surely, we can make the data more accessible, but it also requires a specialized skill set, and I want to ask you about the talent, because I noticed on your LinkedIn, you said, hey, we're hiring, so let me know. >> That's right, we're always hiring. We're a startup, growing well. >> So I want to know from you, I mean, are you having difficulty with filling roles? I mean, what is the pipeline here?
Are people getting the skills that they need? >> Yeah, I mean, what I think is a misnomer is that this is one skill; there's actually a wide variety of skills, and I think we're adding new positions to this pool of skills. So I think what we're starting to see is an expectation that true business people, if you are in a finance organization, or you're in a marketing organization, or you're in a sales organization, you're going to see a higher level of data literacy be expected of that business person. And that doesn't mean that they have to go take a Python course and learn how to be a data scientist. It means that they have to understand statistics enough to realize what the output of an algorithm is, and how they should be able to apply that. So, we have some great customers who have formally kicked off internal training programs that are data literacy programs. Munich Re Insurance is a good example. They spoke with James a couple of months ago in Berlin. >> Yeah, this conference in Berlin, yeah. >> That's right, that's right. And their chief data officer has kicked off a formal data literacy training program for their employees, so that they can get business people comfortable enough and trusting the data, and-- >> It's a business culture transformation initiative. That's very impressive, how serious they are, and how comprehensive they are. >> But I think we're going to see that become much more common. Pfizer, who's another customer of ours, has taken on a similar initiative: how do they make all of their employees be able to have access to data, but then also know when to apply it to particular decision-making use cases. And so, we're seeing this need for business people to get a little bit of training, and then for new roles, like information stewards, or data stewards, to come online, folks who can curate the data and the data assets, and help be kind of translators in the organization. >> Stephanie, will there be a need for an algorithm curator, or a model curator, you know, like a model whisperer, to explain what these AI, convolutional, recurrent, whatever, all these neural nets actually do, you know? Would there be a need for that going forward? Another sort of normal human being, who can somehow be bilingual in neural net and in standard language? >> I think, I think so. I mean, I think we've put this pressure on data scientists to be that person. >> Oh my gosh, they're so busy doing their job. How can we expect them to explain, and I mean, >> Right. >> and to spend 100% of their time explaining it to the rest of us? >> And this is the challenge with some of the regulations like GDPR. We aren't set up yet, as organizations, to accommodate this complexity of understanding, and I think that this part of the market is going to move very quickly. So as vendors, one of the things that we can do is continue to help by building out applications that make it easy for information stewardship. How do you lower the barrier for these specialist roles and make it easy for them to do their job by using AI and machine learning, where appropriate, to help scale the manual work, but keeping a human in the loop to certify that data asset, or to add additional explanation, and then taking their work and using AI, machine learning, and automation to propagate that work out throughout the organization, so that everyone then has access to those explanations.
So you're no longer requiring the data scientists to hold, like, I know other organizations that hold office hours, and the data scientist, like, sits at a desk, like you did in college, and people can come in and ask them questions about neural nets. That's just not going to scale at today's pace of business. >> Right, right. >> You know, the term that I used just now, the algorithm or model whisperer, you know, the recommender function that is built into your environment, and similar data catalogs, is a key piece of infrastructure to relevance-rank, you know, the outputs of the catalog or responses to queries that human beings might make. You know, the recommendation ranking is critically important to help human beings assess, you know, what's going on in the system, and give them some advice about what avenues to explore, I think, so. >> Yeah, yeah. And that's part of our definition of a data catalog. It's not just this inventory of technical metadata. >> That would be boring, and dry, and useless. >> But that's where, >> For most human beings. >> That's where a lot of vendor solutions start, right? >> Yeah. >> And that's an important foundation. >> Yeah, for people who don't live 100% of their work day inside the big data catalog. I hear what you're saying, you know. >> Yeah, so for people who want a data catalog, how you make that relevant to the business is you connect those technical assets, that technical metadata, with how the business is actually using this in practice, and how you can have proactive recommendations, or the recommendation engines, and certifications, and this information steward then communicating through this platform to others in the organization about how do you interpret this data and how do you use it to actually make business decisions. And I think that's how we're going to close the gap between technology adoption and actual data-driven decision-making, which we're not quite seeing yet. When they survey, only about 36% of companies are actually confident they're making data-driven decisions, even though there have been, you know, millions, if not billions, of dollars that have gone into the data analytics market and investments. And it's because, as a manager, I don't quite have the data literacy yet, and I don't quite have the transparency across the rest of the organization to close that trust gap on analytics. >> Here's my feeling, in terms of cultural transformations across businesses in general. I think the legal staff of every company is going to need to get real savvy on using those kinds of tools, like your catalog, with recommendation engines, to support e-discovery, or discovery of the algorithmic decision paths that were taken by their company's products, 'cause they're going to be called by judges and juries, under a subpoena and so forth, to explain all this. And they're human beings who've got law degrees, but who don't know data, and they need the data environment to help them frame up a case for what we did, you know, we being the company that's involved. >> Yeah, and our politicians. I mean, anyone who's read Cathy's book, Weapons of Math Destruction, there are some great use cases of where, >> Math, M-A-T-H, yeah. >> Yes, M-A-T-H. But there are some great examples of where algorithms can go wrong, and many of our politicians and our representatives in government aren't quite ready to have that conversation.
I think anyone who watched the Zuckerberg hearings, you know, in Congress, saw the gap of knowledge that exists between >> Oh my gosh. >> the legal community and, you know, the tech community today. So there's a lot of work to be done to get ready for this new future. >> But just getting back to the cultural transformation needed to make data-driven decisions, one of the things you were talking about is getting the managers to trust the data, and we're hearing about what the best practices are to have that happen, in the sense of starting small, being willing to experiment, getting out of the lab, trying to get to insight right away. What would your best advice be, to gain trust in the data? >> Yeah, I think the biggest gap is this issue of transparency. How do you make sure that everyone understands each step of the process and has access to be able to dig into that? If you have a foundation of transparency, it's a lot easier to trust. Rather than, you know, right now, we have kind of like the high priesthood of analytics going on, right? (Rebecca laughs) And some believers will believe, but a lot of folks won't. And, you know, the origin story of Alation is really about taking these concepts of the scientific revolution and the scientific process, and how can we support, for data analysis, those same steps of scientific evaluation of a finding. That means that you need to publish your data set, you need to allow others to rework that data and come up with their own findings, and you have to be open and foster conversations around data in your organization. One other customer of ours, Meijer, who's a grocery store in the Midwest, and if you're west coast or east coast-based, you might not have heard of them-- >> Oh, Meijer's, thrifty acres. I'm from Michigan, and I know them, yeah. >> Gigantic. >> Yeah, there you go. Gigantic grocery chain in the Midwest. And Joe Oppenheimer there actually introduced a program that he calls the social contract for analytics, and before anyone gets their license to use Tableau, or MicroStrategy, or SAS, or any of the tools internally, he asks those individuals to sign a social contract, which basically says that I'll make my work transparent, I will document what I'm doing so that it's shareable, I'll use certain standards on how I format the data, so that if I come up with a really insightful finding, it can be easily put into production throughout the rest of the organization. So this is a really simple example. His inspiration for that social contract was his son, a high school freshman. He was entering high school and had to sign a social contract that he wouldn't make fun of the teachers, or the students, you know, >> I love it. >> Very simple basics. >> Yeah, right, right, right. >> I wouldn't make fun of the teacher. >> We all need a social contract. >> Oh my gosh, you have to make fun of the teacher. >> I think it was a little more formal than that, in the language, but that was the concept. >> That's violating your civil rights as a student. I'm sorry. (Stephanie laughs) >> Stephanie, always so much fun to have you here. Thank you so much for coming on. >> Thank you. It's a pleasure to be here. >> I'm Rebecca Knight, for James Kobielus. We'll have more of theCUBE's live coverage of DataWorks just after this.

Published Date : Jun 20 2018


Day Two Kickoff | DataWorks Summit 2018


 

>> Live from San Jose, in the heart of Silicon Valley, it's theCUBE. Covering DataWorks Summit 2018. Brought to you by Hortonworks. >> Welcome back to day two of theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, Rebecca Knight, along with my co-host James Kobielus. James, it's great to be here with you in the hosting seat again. >> Day two, yes. >> Exactly. So here we are, this conference, 2,100 attendees from 32 countries, 23 industries. It's a relatively big show. They do three of them during the year. One of the things that I really-- >> It's a well-established show too. I think this is like the 11th year since Yahoo started up the first Hadoop summit in 2008. >> Right, right. >> So it's an established event, yeah, go. >> Exactly, exactly. But I really want to talk about Hortonworks the company. This is something that you had brought up in an analyst report before the show started, and that was talking about Hortonworks' cash-flow positivity for the first time. >> Which is good. >> Which is good, which is a positive sign, and yet, what are the prospects for this company's financial health? We're still not seeing really clear signs of robust financial growth. >> I think the signs are good, for the simple reason that they're making significant investments now to prepare for the future that's almost inevitable. And the future that's almost inevitable, and when I say the future, I mean the 2020s, the decade that's coming, is that most of their customers will shift more of their workloads, maybe not entirely yet, to public cloud environments for everything they're doing: AI, machine learning, deep learning. And clearly the beneficiaries of that trend will be the public cloud providers, all of whom are Hortonworks' established partners: AWS, Microsoft with Azure, Google with, you know, Google Cloud Platform, IBM with IBM Cloud. Hortonworks, and this is... You know, their partnerships with these cloud providers go back several years, so it's not a new initiative for them. They've seen the writing on the wall practically from Hortonworks' founding in 2011, and they now need to go deeper towards making their solution portfolio capable of being deployable on-prem, in public clouds, and in various and sundry funky combinations called hybrid multi-clouds. Okay, so, they've been making those investments in those partnerships and in public cloud, enabling the Hortonworks Data Platform. Here at this show, DataWorks 2018 in San Jose, they've released the latest major version of their core platform, HDP 3.0, with a lot of significant enhancements related to things that their customers are increasingly doing-- >> Well, I want to ask you about those enhancements. >> But also they have partnership announcements, the deep ones of integration and, you know, lift and shift of the Hortonworks portfolio of HDP, with Hortonworks DataFlow and DataPlane Services, so that those solutions can operate transparently on those public cloud environments as and when the customers choose to shift their workloads. 'Cause Hortonworks really... You know, Scott Gnau yesterday, I mean, just laid it on the line. They know that more of the public cloud workloads will predominate now in this space. They're just making these speculative investments that they absolutely have to now, to prepare the way.
So I think this cost that they're incurring now, to prepare their entire portfolio for that inevitable future, is the right thing to do, and that's probably why they still have not attained massive, rock-and-rollin' positive cash flow yet. But I think that they're preparing the way to do so in the coming decade. >> So their financial future is looking brighter, and they're doing the right things. >> Yeah, yes. >> So now let's talk tech. And this is really where you want to be, Jim, I know you. >> Oh, I get sleep now, and I don't think about tech constantly. >> So as you've said, they're really putting a lot of emphasis now on their public cloud partnerships. >> Yes. >> But they've also launched several new products and upgrades to existing products. What are you seeing that excites you and that you think really will be potential game changers? >> You know, this is geeky, but this is important, 'cause it's at the very heart of Hortonworks Data Platform 3.0: containerization. When you're a data scientist, and you're building a machine learning model using data that's maintained, persisted, and processed within Hortonworks Data Platform or any other big data platform, you increasingly want the ability, for developing machine learning, deep learning, AI in general, to take that application you might build, say using TensorFlow models, that you build on HDP, containerize it in Docker and, you know, orchestrate it all through Kubernetes and all that wonderful stuff, and deploy it out, those AI applications, to increasingly edge computing, mobile computing, embedded computing environments where, you know, the real venture capital mania's happening, things like autonomous vehicles, and you know, drones, and you name it. So the fact is that Hortonworks has made that, in many ways, the premier new feature of HDP 3.0, announced here this week at the show. That very much harmonizes with where their partners are going with containerization of AI. IBM, one of their premier partners, very recently, like last month, I think it was, announced the latest version of, what do they call it, IBM Cloud Private, which has embedded as a core feature containerization, within that environment, which is a prem-based environment, of AI and so forth. The fact that Hortonworks continues to maintain close alignment with the capabilities that its public cloud partners are building into their respective portfolios is important. But also, Hortonworks with its, they call it, you know, single pane of glass, the DataPlane Services for metadata and monitoring and governance and compliance across these sprawling hybrid multi-cloud scenarios. The fact that they're continuing to make, in fact, really focusing on, deep investments in that portfolio, so that when an IBM, or AWS, or whoever introduces some new feature in their respective platforms, Hortonworks has the ability to, as it were, abstract above and beyond all of that, so that the customer, the developer, and the data administrator, all they need to do, if they're a Hortonworks customer, is stay within the DataPlane Services environment to be able to deploy with harmonized metadata, and harmonized policies, and harmonized schemas, and so forth and so on, and query optimization across these sprawling environments.
So Hortonworks, I think, knows where their bread is buttered and it needs to stay on the DPS, DataPlane Services, side, which is why a couple months ago in Berlin, Hortonworks made, I think, the most significant announcement of the year for them and really for the industry: they announced the Data Steward Studio in Berlin. It really clearly addressed the GDPR mandate that was coming up, but it really treats data stewardship as an end-to-end workflow for lots of, you know, core enterprise applications, absolutely essential. Data Steward Studio is a DataPlane Service that can operate across multi-cloud environments. Hortonworks is going to keep on, you know... They didn't have any DPS, DataPlane Services, announcements here in San Jose this week, but you can best believe that next year at this time at this show, and in the interim, they'll probably have a number of significant announcements to deepen that portfolio. Once again it's to grease the wheels towards a more purely public cloud future in which there will be Hortonworks DNA inside most of their customers' environments going forward. >> I want to ask you about themes of this year's conference. The thing is that you were in Berlin at the last big Hortonworks DataWorks Summit. >> (speaks in foreign language) >> And really GDPR dominated the conversations because the new rules and regulations hadn't yet taken effect and companies were sort of bracing for what life was going to be like under GDPR. Now the rules are here, they're here to stay, and companies are really grappling with it, trying to understand the changes and how they can exist in this new regime. What would you say are the biggest themes... We're still talking about GDPR, of course, but what would you say are the bigger themes at this week's conference? Is it scalability, is it... I mean, what do you think has dominated the conversations here? >> Well scalability is not the big theme this week, though there are significant scalability announcements this week in the context of HDP 3.0: the ability to persist, in a scale-out fashion across multi-cloud, billions of files. Storage efficiency is an important piece of the overall announcement, with support for erasure coding and so on. That's not, you know, that's... Already, Hortonworks, like all of their cloud providers and other big data providers, provides very scalable environments for storage and workload management. That was not the hugest, buzzy theme in terms of the announcements this week. The buzz of course was HDP 3.0. Containerization, that's important, but you know, we just came out of the day two keynote. AI is not a huge focus yet for a lot of the Hortonworks customers who are here, the developers. Most of their customers are not yet that far along in their deep learning journeys, but they're definitely going there. There were plenty of really cool keynote discussions, including the one on autonomous vehicles that we just came out of, but that was not the predominant theme this week. I think what it comes down to is that with HDP 3.0... Hive, though you tend to take it for granted, it's been in Hadoop from the very start, practically, Hive is now a full enterprise database, and that's one of the cores of HDP 3.0.
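The Docker-and-Kubernetes flow Jim describes for deploying models built on HDP follows a fairly standard pattern. Below is a minimal sketch of that pattern using the docker and kubernetes Python clients; the registry, image name, and directory layout are hypothetical, and a production pipeline would add health checks, resource limits, and registry authentication.

```python
import docker
from kubernetes import client, config

REPO, TAG = "registry.example.com/churn-model", "1.0"   # hypothetical registry/tag
IMAGE = f"{REPO}:{TAG}"

# 1. Package the trained model and its serving code as a Docker image.
#    Assumes ./churn-model contains a Dockerfile wrapping the TensorFlow model.
docker_client = docker.from_env()
docker_client.images.build(path="./churn-model", tag=IMAGE)
docker_client.images.push(REPO, tag=TAG)

# 2. Hand the image to Kubernetes, which schedules and scales the containers
#    wherever they need to run: cloud, on-prem, or an edge cluster.
config.load_kube_config()
deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="churn-model"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "churn-model"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "churn-model"}),
            spec=client.V1PodSpec(containers=[
                client.V1Container(
                    name="model",
                    image=IMAGE,
                    ports=[client.V1ContainerPort(container_port=8501)],  # TF Serving REST port
                )
            ]),
        ),
    ),
)
client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```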
Hive itself, now at version 3.0, is ACID compliant, and that may be totally geeky to most of the world, but it enables Hive to support transactional applications. So more big data environments are supporting traditional enterprise transactional applications that require things like two-phase commit and all that goodness. The fact is, you know, Hortonworks, from what I can see, is the first of the big data vendors to incorporate those enhancements to Hive 3.0, because they're so completely tuned in to the Hive environment as a committer. I think in many ways that is the predominant theme in terms of the new stuff that will actually resonate with the developers, their customers here at the show. And with, you know, enterprises in general: they can put more of their traditional enterprise application workloads on big data environments and specifically, Hortonworks hopes, on its HDP 3.0. >> Well I'm excited to learn more here on theCube with you today. We've got a lot of great interviews lined up and a lot of interesting content. We got a great crew too, so this is a fun show to do. >> Sure is. >> We will have more from day two of theCube's live coverage of DataWorks just after this.
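To make the ACID point concrete, here is a minimal sketch, assuming the PyHive client and a reachable HiveServer2 endpoint; the hostname and table are hypothetical. On HDP 3.0, managed ORC tables can be transactional, which is what makes the row-level UPDATE and DELETE below possible.

```python
from pyhive import hive

conn = hive.connect(host="hdp-master.example.com", port=10000)  # hypothetical host
cur = conn.cursor()

# A transactional (ACID) table: ORC storage with the transactional property set.
cur.execute("""
    CREATE TABLE IF NOT EXISTS accounts (
        id INT,
        balance DECIMAL(10, 2)
    ) STORED AS ORC
    TBLPROPERTIES ('transactional' = 'true')
""")

cur.execute("INSERT INTO accounts VALUES (1, 100.00), (2, 250.00)")

# Row-level updates and deletes, which pre-ACID Hive tables could not do.
cur.execute("UPDATE accounts SET balance = balance - 25.00 WHERE id = 1")
cur.execute("DELETE FROM accounts WHERE id = 2")
```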

Published Date : Jun 20 2018

SUMMARY :

Rebecca Knight and James Kobielus open day two of theCUBE's coverage of DataWorks Summit 2018 in San Jose, a show drawing 2,100 attendees from 32 countries and 23 industries. They discuss Hortonworks' first cash-flow-positive quarter and the investments it is making in its public cloud partnerships with AWS, Microsoft Azure, Google Cloud Platform, and IBM Cloud, then walk through the headline features of HDP 3.0: containerization for AI and machine learning workloads, deeper DataPlane Services integration, and an ACID-compliant Hive 3.0 that can support transactional enterprise applications.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
James Kobielus | PERSON | 0.99+
Rebecca Knight | PERSON | 0.99+
Hortonworks' | ORGANIZATION | 0.99+
Hortonworks | ORGANIZATION | 0.99+
2011 | DATE | 0.99+
Jim | PERSON | 0.99+
IBM | ORGANIZATION | 0.99+
Berlin | LOCATION | 0.99+
AWS | ORGANIZATION | 0.99+
San Jose | LOCATION | 0.99+
Microsoft | ORGANIZATION | 0.99+
Google | ORGANIZATION | 0.99+
Silicon Valley | LOCATION | 0.99+
James | PERSON | 0.99+
23 industries | QUANTITY | 0.99+
Yahoo | ORGANIZATION | 0.99+
San Jose, California | LOCATION | 0.99+
Hive 3.0 | TITLE | 0.99+
2020s | DATE | 0.99+
next year | DATE | 0.99+
this week | DATE | 0.99+
32 countries | QUANTITY | 0.99+
Hive | TITLE | 0.99+
11th year | QUANTITY | 0.99+
yesterday | DATE | 0.99+
first time | QUANTITY | 0.99+
GDPR | TITLE | 0.98+
last month | DATE | 0.98+
DataPlane Services | ORGANIZATION | 0.98+
One | QUANTITY | 0.98+
Scott Gnau | PERSON | 0.98+
2008 | DATE | 0.98+
three | QUANTITY | 0.98+
2,100 attendees | QUANTITY | 0.98+
HDP 3.0 | TITLE | 0.98+
today | DATE | 0.98+
Data Steward Studio | ORGANIZATION | 0.98+
two-phase | QUANTITY | 0.98+
one | QUANTITY | 0.97+
DataWorks Summit 2018 | EVENT | 0.96+
DataPlane | ORGANIZATION | 0.96+
Day two | QUANTITY | 0.96+
billions of files | QUANTITY | 0.95+
first | QUANTITY | 0.95+
day two | QUANTITY | 0.95+
DPS | ORGANIZATION | 0.95+
Data Platform 3.0 | TITLE | 0.94+
Hortonworks DataWorks Summit | EVENT | 0.94+
DataWorks | EVENT | 0.92+

Pandit Prasad, IBM | DataWorks Summit 2018


 

>> From San Jose, in the heart of Silicon Valley, it's theCube. Covering DataWorks Summit 2018. Brought to you by Hortonworks. (upbeat music) >> Welcome back to theCUBE's live coverage of DataWorks here in sunny San Jose, California. I'm your host Rebecca Knight along with my co-host James Kobielus. We're joined by Pandit Prasad. He leads analytics projects, strategy, and management at IBM Analytics. Thanks so much for coming on the show. >> Thanks Rebecca, glad to be here. >> So, why don't you just start out by telling our viewers a little bit about what you do in terms of the Hortonworks relationship and the other parts of your job. >> Sure, as you said I am in Offering Management, which is also known as Product Management for IBM, managing the big data portfolio from an IBM perspective. I was also working with Hortonworks on developing this relationship, nurturing that relationship, so it's been a year since the Hortonworks partnership. We announced this partnership exactly last year at the same conference. And now it's been a year, so this year has been a journey of aligning the two portfolios together. Right, so Hortonworks had HDP and HDF. IBM also had similar products, so we have for example Big SQL, Hortonworks has Hive, so how do Hive and Big SQL align together? IBM has Data Science Experience; where does that come into the picture on top of HDP? Before this partnership, if you look into the market, it has been: you sell Hadoop, you sell a SQL engine, you sell data science. What this year has given us is more of a solution sell. Now with this partnership we go to the customers and say here is an end-to-end experience for you. You start with Hadoop, you put more analytics on top of it, you then bring Big SQL for complex queries and federation and visualization stories, and then finally you put data science on top of it, so it gives you a complete end-to-end solution, the end-to-end experience for getting the value out of the data. >> Now IBM a few years back released the Watson Data Platform for team data science with DSX, Data Science Experience, as one of the tools for data scientists. Is Watson Data Platform still the core, I call it DevOps for data science and maybe that's the wrong term, that IBM provides to market, or is there sort of a broader DevOps framework within which IBM goes to market with these tools? >> Sure, Watson Data Platform one year ago was more of a cloud platform, and it had many components to it, and now we are getting a lot of those components on to the (mumbles), and Data Science Experience is one part of it, so Data Science Experience... >> So Watson Analytics as well for subject matter experts and so forth. >> Yes. And again Watson has a whole suite of business offerings; Data Science Experience is more of a particular aspect of the focus, specifically on the data science, and that's now available on prem, and now we are building this on-prem stack, so we have HDP, HDF, Big SQL, Data Science Experience, and we are working towards adding more and more to that portfolio. >> Well you have a broader reference architecture and a stack of solutions, AI and Power and so forth, for more of the deep learning development. In your relationship with Hortonworks, are they reselling more of those tools into their customer base to supplement, extend what they already resell, DSX, or is that outside of the scope of the relationship? 
>> No, it is all part of the relationship. These three have been the core of what we announced last year, and then there are other solutions. We have the whole governance solution, right, so again it goes back to the partnership: HDP brings with it Atlas. IBM has a whole suite of governance portfolio including the governance catalog. How do you expand the story from being a Hadoop-centric story to an enterprise data lake story? And then now we are taking that to the cloud; that's what Truata is all about. Rob Thomas came out with a blog yesterday morning talking about Truata. If you look at it, it is nothing but a governed data lake hosted offering, if you want to simplify it. That's one way to look at it; it caters to the GDPR requirements as well. >> For GDPR, for the IBM Hortonworks partnership, is the lead solution for GDPR compliance Hortonworks Data Steward Studio, or is it any number of solutions that IBM already has for data governance and curation, or is it a combination of all of that in terms of what you, as partners, propose to customers for soup-to-nuts GDPR compliance? Give me a sense for... >> It is a combination of all of those, so it has HDP, it has HDF, it has Big SQL, it has Data Science Experience, it has the IBM governance catalog, it has IBM data quality, and it has a bunch of security products like Guardium, and it has some new IBM proprietary components that are very specific towards data (cough drowns out speaker) and how you deal with the personal data and sensitive personal data as classified by GDPR. I'm supposed to query some high-level information, but I'm not allowed to query deep into the personal information, so how do you block those queries, how do you understand those? These are not necessarily part of Data Steward Studio. These are some of the proprietary components that are thrown into the mix by IBM. >> One of the requirements that is not often talked about under GDPR, Ricky of Formworks got into it a little bit in his presentation, was the requirement that if you are using an EU citizen's PII to drive algorithmic outcomes, they have the right to full transparency into the algorithmic decision paths that were taken. I remember IBM had a tool under the Watson brand that wraps up a narrative of that sort. Is that something that IBM still offers? It was called Watson Curator a few years back. Because I'm getting a sense right now that Hortonworks doesn't have a specific solution, not to say that they may not be working on it, that addresses that side of GDPR. Do you know what I'm referring to there? >> I'm not aware of something from the Hortonworks side beyond the Data Steward Studio, which offers basically identification of what some of the... >> Data lineage as opposed to model lineage. It's a subtle distinction. >> It can identify some of the personal information and maybe provide a way to tag it and hence mask it, but the Truata offering is the one that is bringing some new research assets. After the GDPR guidelines became clear, they got deep into how to cater to those requirements. These are relatively new proprietary components; they are not even being productized, which is why I am calling them proprietary components that are going into this hosted service. >> IBM's got a big portfolio so I'll understand if you guys are still working out what position. Rebecca go ahead. >> I just wanted to ask you about this new era of GDPR.
The last Hortonworks conference was sort of before it came into effect, and now we're in this new era. How would you say companies are reacting? Are they in the right space for it, in the sense of really understanding the ripple effects and how it's all going to play out? How would you describe your interactions with companies in terms of how they're dealing with these new requirements? >> They are still trying to understand the requirements, interpret the requirements, and come to terms with what that really means. For example, I met with a customer and they are a multinational company. They have data centers across different geos, and they asked me: I have somebody from Asia trying to query the data, so the query should go to Europe, but the query processing should not happen in Asia; the query processing all should happen in Europe, and only the output of the query should be sent back to Asia. You won't be able to think in these terms before the GDPR guidance era. >> Right, exceedingly complicated. >> Decoupling storage from processing enables those kinds of fairly complex scenarios for compliance purposes. >> It's not just about the access to data; now you are getting into where the processing happens, where the results are getting displayed, so we are getting... >> Severe penalties for not doing that, so your customers need to keep up. There was an announcement at this show, at DataWorks 2018, of an IBM Hortonworks solution: IBM Hosted Analytics with Hortonworks. I wonder if you could speak a little bit about that, Pandit, in terms of what's provided. It's a subscription service? If you could tell us what subset of IBM's analytics portfolio is hosted for Hortonworks' customers? >> Sure, as you said, it is a hosted offering. Initially we are starting off as a base offering with three products: it will have HDP, Big SQL (IBM Db2 Big SQL), and DSX, Data Science Experience. Those are the three solutions. Again as I said, it is hosted on IBM Cloud, so customers have a choice of different configurations, whether it be VMs or bare metal. I should say this is probably the only offering, as of today, that offers a bare metal configuration in the cloud. >> It's geared to data scientist developers, and machine learning models; they'll build the models and train them in IBM Cloud, but in a hosted HDP in IBM Cloud. Is that correct? >> Yeah, I would rephrase that a little bit. There are several different offerings on the cloud today, and we can think about them, as you said, for ad hoc or ephemeral workloads, also geared towards low cost. Think about this offering as taking your on-prem data center experience directly onto the cloud. It is geared towards very high performance. The hardware and the software are all configured and optimized for providing high performance, not necessarily for ad hoc or ephemeral workloads; they are capable of handling massive, sticky workloads. It is not meant for "I turn on this massive computing power for a couple of hours and then switch it off," but rather "I'm going to run these massive workloads as if they were located in my data center." That's number one. It comes with the complete set of HDP. If you think about what is currently in the cloud, you have Hive and HBase, with the SQL engines and the storage separate; security is optional, governance is optional. This comes with the whole enchilada. It has security and governance all baked in.
It provides the option to use Big SQL, because once you get on Hadoop, the next experience is: I want to run complex workloads, I want to run federated queries across Hadoop as well as other data stores. How do I handle those? And then it comes with Data Science Experience, also configured for best performance and integrated together. As a part of this partnership, I mentioned earlier that we have progressed towards providing this story of an end-to-end solution. The next step of that is: yes, I can say that it's an end-to-end solution, but do the products look and feel as if they are one solution? That's what we are getting into, and we have featured some of those integrations. For example Big SQL, an IBM product, we have been working on baking very closely with HDP. It can be deployed through Ambari, and it is integrated with Atlas and Ranger for security. We are improving the integrations with Atlas for governance. >> Say you're building a Spark machine learning model inside DSX on HDP within IH (mumbles), IBM hosting with Hortonworks on HDP 3.0, can you then containerize that machine learning Spark model and then deploy it into an edge scenario? >> Sure, first was Big SQL, the next one was DSX. DSX is integrated with HDP as well. We could run DSX workloads on HDP before, but what we have done now is, if you want to run a DSX workload, say a Python workload, I need to have the Python libraries on all the nodes that I want to deploy to. Suppose you are running a big cluster, a 500-node cluster. I need to have the Python libraries on all 500 nodes and I need to maintain the versioning of them. If I upgrade the versions then I need to go and upgrade and make sure all of them are perfectly aligned. >> In this first version will you be able to build a Spark model and a TensorFlow model and containerize them and deploy them? >> Yes. >> Across a multi-cloud, and orchestrate them with Kubernetes to do all that meshing; is that a capability now or planned for the future within this portfolio? >> Yeah, we have that capability demonstrated in the booth today, so that is a new integration. We can run what we call a virtual Python environment. DSX can containerize it and run against data that's co-located in the HDP cluster. Now we are making use of both the data in the cluster, as well as the infrastructure of the cluster itself, for running the workloads. >> In terms of the layered stack, is it also incorporating the IBM distributed deep learning technology that you've recently announced? Which I think is highly differentiated, because deep learning is increasingly becoming a set of capabilities that run across a distributed mesh, playing together as if they're one unified application. Is that a capability now in this solution, or will it be in the near future, DDL, distributed deep learning? >> No, not yet. >> I know that's on the AI Power platform currently, gotcha. >> It's what we'll be talking about at next year's conference. >> That's definitely on the roadmap. We are starting with the base configuration of bare metal and VM configurations; the next one, depending on how the customers react to it, definitely we're thinking about bare metal with GPUs optimized for TensorFlow workloads. >> Exciting. We'll stay tuned in the coming months and years; I'm sure you guys will have that. >> Pandit, thank you so much for coming on theCUBE. We appreciate it. I'm Rebecca Knight for James Kobielus. We will have more from theCUBE's live coverage of DataWorks just after this.
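The multinational GDPR scenario Pandit recounts, where a query submitted from Asia must be processed entirely in Europe with only the output traveling back, amounts to routing queries by a data-residency policy. A toy sketch of that routing idea, assuming the PyHive client; the endpoints and policy table are hypothetical.

```python
from pyhive import hive

# Region where each dataset's processing must stay (hypothetical policy table).
PROCESSING_REGION = {"eu_customers": "eu-frankfurt"}
CLUSTER_ENDPOINTS = {"eu-frankfurt": "hive.eu-frankfurt.example.com"}

def run_query(table: str, sql: str):
    # Route the query to the cluster in the region where processing is allowed.
    region = PROCESSING_REGION[table]
    conn = hive.connect(host=CLUSTER_ENDPOINTS[region], port=10000)
    cur = conn.cursor()
    cur.execute(sql)       # executes entirely on the EU cluster
    return cur.fetchall()  # only the result set leaves the region

# Called from Asia: processing happens in Frankfurt, results come back.
rows = run_query("eu_customers",
                 "SELECT country, COUNT(*) FROM eu_customers GROUP BY country")
```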
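Pandit's point about maintaining Python libraries on all 500 nodes is the problem the virtual environment approach solves: pack the environment once and ship it with the job instead of installing per node. A rough sketch of that pattern on Spark-on-YARN, with hypothetical paths; env.tar.gz would be produced beforehand with a tool such as conda-pack.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("virtualenv-shipped-job")
    # Distribute the packed environment to every executor; '#pyenv' is the
    # alias it is unpacked under on each node.
    .config("spark.yarn.dist.archives", "hdfs:///envs/env.tar.gz#pyenv")
    # Point the executors and the application master at the Python
    # interpreter inside that shipped environment.
    .config("spark.executorEnv.PYSPARK_PYTHON", "./pyenv/bin/python")
    .config("spark.yarn.appMasterEnv.PYSPARK_PYTHON", "./pyenv/bin/python")
    .getOrCreate()
)

# Executor-side code can now import libraries from the shipped environment
# without anything being installed on the cluster nodes themselves.
rdd = spark.sparkContext.parallelize(range(100))
print(rdd.map(lambda x: x * x).sum())
```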

Published Date : Jun 19 2018

SUMMARY :

Pandit Prasad of IBM Analytics reviews the first year of the IBM-Hortonworks partnership: aligning HDP and HDF with Big SQL and Data Science Experience into an end-to-end solution sell, GDPR-driven governance work including the Truata governed data lake offering, and the newly announced IBM Hosted Analytics with Hortonworks, a hosted offering on IBM Cloud bundling HDP, Db2 Big SQL, and DSX, with bare metal configurations available and GPU-optimized TensorFlow workloads on the roadmap.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Rebecca | PERSON | 0.99+
James Kobielus | PERSON | 0.99+
Rebecca Knight | PERSON | 0.99+
Europe | LOCATION | 0.99+
IBM | ORGANIZATION | 0.99+
Asia | LOCATION | 0.99+
Rob Thomas | PERSON | 0.99+
San Jose | LOCATION | 0.99+
Silicon Valley | LOCATION | 0.99+
Pandit | PERSON | 0.99+
last year | DATE | 0.99+
Python | TITLE | 0.99+
yesterday morning | DATE | 0.99+
Hortonworks | ORGANIZATION | 0.99+
three solutions | QUANTITY | 0.99+
Ricky | PERSON | 0.99+
Northsys | ORGANIZATION | 0.99+
Hadoop | TITLE | 0.99+
Pandit Prasad | PERSON | 0.99+
GDPR | TITLE | 0.99+
IBM Analytics | ORGANIZATION | 0.99+
first version | QUANTITY | 0.99+
both | QUANTITY | 0.99+
one year ago | DATE | 0.98+
Hortonwork | ORGANIZATION | 0.98+
three | QUANTITY | 0.98+
today | DATE | 0.98+
DSX | TITLE | 0.98+
Formworks | ORGANIZATION | 0.98+
this year | DATE | 0.98+
Atlas | ORGANIZATION | 0.98+
first | QUANTITY | 0.98+
Granger | ORGANIZATION | 0.97+
Gaurdium | ORGANIZATION | 0.97+
one | QUANTITY | 0.97+
Data Steward Studio | ORGANIZATION | 0.97+
two portfolios | QUANTITY | 0.97+
Truata | ORGANIZATION | 0.96+
DataWorks Summit 2018 | EVENT | 0.96+
one solution | QUANTITY | 0.96+
one way | QUANTITY | 0.95+
next year | DATE | 0.94+
500 nodes | QUANTITY | 0.94+
NTN | ORGANIZATION | 0.93+
Watson | TITLE | 0.93+
Hortonworks | PERSON | 0.93+

Cindy Maike, Hortonworks | DataWorks Summit 2018


 

>> Live from San Jose in the heart of Silicon Valley, it's theCUBE, covering Data Works Summit 2018, brought to you by Hortonworks. >> Welcome back to theCUBE's live coverage of Dataworks here in San Jose, California. I'm your host, Rebecca Knight, along with my co-host, James Kobielus. We're joined by Cindy Maike. She is the VP Industry Solutions and GM Insurance and Healthcare at Hortonworks. Thanks so much for coming on theCUBE, Cindy. >> Thank you, thank you, look forward to it. >> So, before the cameras were rolling we were talking about the business case for data, for data analytics. Walk our viewers through how you, how you think about the business case and your approach to sort of selling it. >> So, when you think about data and analytics, I mean, as industries we've been very good sometimes at doing kind of like the operational reporting. To me that's looking in the rearview mirror, something's already happened, but when you think about data and analytics, especially big data it's about what questions haven't I been able to answer. And, a lot of companies when they embark on it they're like, let's do it for technology's sake, but from a business perspective when we, as our industry GMs we are out there working with our customers it's like, what questions can't you answer today and how can I look at existing data on new data sources to actually help me answer questions. I mean, we were talking a little bit about the usage of sensors and so forth around telematics and the insurance industry, connected homes, connective lives, connected cars, those are some types of concepts. In other industries we're looking at industrial internet of things, so how do I actually make the operations more efficient? How do I actually deploy time series analysis to actually help us become more profitable? And, that's really where companies are about. You know, I think in our keynote this morning we were talking about new communities and it's what does that mean? How do we actually leverage data to either monetize new data sources or make us more profitable? >> You're a former insurance CFO, so let's delve into that use case a little bit and talk about the questions that I haven't asked yet. What are some of those and how are companies putting this thing to work? >> Yeah so, the insurance industry you know, it's kind of frustrating sometimes where as an insurance company you sit there and you always monitor what your combined ratio is, especially if you're a property casualty company and you go, yeah, but that tells me information like once a month, you know, but I was actually with a chief marketing officer recently and she's like, she came from the retail industry and she goes, I need to understand what's going on in my business on any given day. And so, how can we leverage better real time information to say, what customers are we interacting with? You know, what customers should we not be interacting with? And then you know, the last thing insurance companies want to do is go out and say, we want you as a customer and then you decline their business because they're not risk worthy. So, that's where we're seeing the insurance industry and I'll focus a lot on insurance here, but it's how do we leverage data to change that customer engagement process, look at connected ecosystems and it's a good time to be well fundamentally in the insurance industry, we're seeing a lot of use cases, but also in the retail industry, new data opportunities that are out there. 
We talked a little bit before the interview started about shrinkage, and you know, the retail industry, especially in food and any type of consumer packaged goods, we're starting to see the usage of sensors to actually help companies move fresh food around to reduce their shrinkage. You know, we've got... >> Sorry, just define shrinkage, 'cause I'm not even sure I understand; it's not that your apple is getting smaller. It refers to perishable goods, you explain it. >> Right, so you're actually looking at, how do we make sure that my produce or items that are perishable, you know, I want to minimize the amount of inventory write-offs that I have to do, so that would be the shrinkage. And this one major retail chain, they have a lot of consumer goods, and they're actually saying, you know what, their shrinkage was pretty high, so they're now using sensors to help them monitor: do we need to move certain types of produce? Do we need to look at food before it expires, you know, to make sure that we're not doing an inventory write-off? >> You say sensors and, are you referring to cameras taking photos of the produce, or are you referring to other types of chemical analysis, or whatever it might be, I don't know. >> Yeah, so it's actually a little bit of both. It's how do I actually, you know, look at certain types of products; so we all know when you walk into a grocery store or some type of department store, there's cameras all over the place, so it's not just looking at security, but it's also looking at, you know, are those goods moving? And so, you can't move people around a store, but I can actually use the visualization, and now with deep machine learning you can actually look at that and say, you know what, those bananas are getting a little ripe. We need to move those, or we need to help turn the inventory. And then, there's also things with bar coding, you know, when you think of things that are on the shelves. So, how do I look at those bar codes? Because in the past you would've sent somebody down the aisle and they would've checked that, but now we're actually looking up the bar codes and saying, do we need to move this? Do we need to put these things on sale? >> At this conference we're hearing just so much excitement and talk about data as the new oil, and it is an incredible strategic asset, but you were also saying that it could become a liability. Talk about the point at which it becomes a liability. >> It becomes a liability when, one, we don't know what to do with it, or we make decisions off of dated data. So you think about, you know, I'll give you an example in the healthcare industry. Medical procedures have changed so immensely, the advancement in technology, precision medicine, but what if we're making healthcare decisions based on medical procedures from 10 years ago? So you really need to say, how do I leverage, you know, newer data sets. Over time, if you base your algorithms on data that's 10, 20 years old, it's good for certain things, but you know, you can make some bad business decisions if the data is not recent. So, that's when I talk about the liability aspect. >> Okay, okay, and then, thinking about how you talk with, collaborate with customers, what is your approach in the sense of how you help them think through their concerns, their anxieties? >> So, a lot of times it's really kind of understanding what's their business strategy. What are their financial, what are their operational goals?
And you say, what can we look at from a data perspective, both data that we have today and data that we can acquire from new data sources, to help them actually achieve their business goals. And you know, specifically in the insurance industry we focus on top-line growth, growing your premium, or decreasing your combined ratio. So, what are the types of data sources and the analytical use cases that we can actually, you know, use? See the exact same thing in manufacturing, so. >> And, have customer attitudes evolved over time since you've been in the industry? How would you describe their mindsets right now? >> I think we still have some industries that we struggle with, but it's actually, you know, I mentioned healthcare; the way we're seeing data being used in the healthcare industry, I mean, it's about precision medicine. You look at genomics research. It says that like 58 percent of the world's population would actually take a genomics test if they could actually use that information. So, it's interesting to see. >> So, the struggle is with people's concern about privacy encroachment, is that the primary struggle? >> There's a little bit of that, and companies are saying, you know, I want to make sure that it's not being used against me, but there was actually a recent article in Best's Review, which is an insurance trade magazine, that says, you know, if I actually have a genomic test, can the insurance industry use that against me? So, I mean, there's still a little bit of concern. >> Which is a legitimate concern. >> It is, it is, absolutely. And then also, you know, we see globally, with the General Data Protection Regulation, the GDPR, you know, how are companies using my information and data? So you know, consumers have to be comfortable with the type of data. But outside of the consumer side, there's so much data in the industry, and you made the comment about, you know, data being the new oil. I have a thing with that, though: we don't put crude oil straight into a car, so it's once we do something with it, which is the analytical side, that's where we get the business insight. So, data for data's sake is just data. It's the business insights that are what's really important. >> Looking ahead at Hortonworks five, 10 years from now, I mean, how much will your business account for of the total business of Hortonworks, do you think? In the sense that, as you've said, healthcare and insurance represent such huge potential possibilities and opportunities for the company. Where do you see the trajectory? >> The trajectory I believe is really in those analytical apps, so we're working with a lot of partners that are, like, you know, how do I accelerate that business value? Because like I said, we're not just into data management, we're in the data age, and what does that mean? It's turning those things into business value, and I think from an industry perspective you've got to be working with the right partners, and then also customers, because they lack some of the skillsets. So, who can actually accelerate the time to value of using data for profitability? >> Is your primary focus helping regulated industries with their data analytics challenges and using IoT, or does it also cover unregulated? >> Unregulated as well. >> Are the analytics requirements different between regulated and unregulated, in terms of the underlying capabilities they require, in terms of predictive modeling, governance, and so forth, and how does Hortonworks differentiate its response to those needs? >> Yeah, so it varies a little bit based upon their regulations. I mean, even if you look at life sciences: life sciences is very, very regulated on how long do I have to keep the data, how can I actually use the data. So, if you look at those industries that maybe aren't regulated as much, we'll get away from financial services, highly regulated across all different areas, but I'll also look at, say, business insurance: not as regulated as coverage for you and me as consumers, because insurance companies can use any type of data to actually do the pricing and the underwriting and the actual claims. So, still regulated based upon solvency, but not regulated on how they use data to evaluate risk. Manufacturing, definitely some regulation there from a work safety perspective, but you can use the data to optimize your yields, you know, however you see fit. So, we see a mixture of everything, but I think from a Hortonworks perspective it's being able to share data across multiple industries, 'cause we talk about connected ecosystems, and connected ecosystems are really going to change business of the future. >> So, how so? I mean, especially in bringing it back to this conference, to DataWorks, and the main stage this morning, we heard so much about these connected communities, and really it's all about the ecosystem. What do you see as the biggest change going forward? >> So, you look at, and I'll give you the context of the insurance industry, you look at companies like Arity, which is a division of Allstate. What they're doing, actually working with the car manufacturers... At some point in time, you know, the automotive industry, General Motors tried this 20 years ago; they didn't quite get it with OnStar and GMAC Insurance. Now you actually have the opportunity with, you know, maybe the car manufacturer as the front man for the insurance industry. So, I can now start to collect the data from the vehicle. I'm using that for the driving of the vehicle, but I can also use it to help a driver drive more safely. >> And upsize their experience of actually driving, making it more pleasant as well as safer. There's many layers of what can be done now with the same data. Some of those uses impinge on or relate to regulated or mandatory concerns, and some are purely for competitive differentiation on the whole issue of experience. >> Right, and you think about certain aspects where the insurance industry just has, you know, a negative connotation, and we have an image challenge on what data can and cannot be used. But a lot of people opt in with an automotive manufacturer and share that type of data, so moving forward, who's to say? With the connected ecosystem, I still have the insurance company in the background doing all the underwriting, but my distribution channel is now the car dealer. >> I love it, great. That's a great note to end on. Thanks so much for coming on theCUBE. Thank you Cindy. I'm Rebecca Knight for James Kobielus. We will have more from theCUBE's live coverage of DataWorks in just a little bit. (upbeat music)
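A toy illustration of the sensor-driven shrinkage idea Cindy describes earlier: score perishable items from shelf-sensor readings and flag what should be moved or marked down before it becomes an inventory write-off. The fields and thresholds are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class ShelfReading:
    sku: str
    days_until_expiry: int
    temperature_c: float   # from the shelf or cooler sensor
    units_on_shelf: int

def markdown_candidates(readings, expiry_threshold=2, temp_limit_c=6.0):
    """Flag items likely to be written off unless moved or discounted."""
    flagged = []
    for r in readings:
        if r.days_until_expiry <= expiry_threshold or r.temperature_c > temp_limit_c:
            flagged.append(r.sku)
    return flagged

readings = [
    ShelfReading("bananas-001", days_until_expiry=1, temperature_c=5.1, units_on_shelf=40),
    ShelfReading("yogurt-274", days_until_expiry=9, temperature_c=7.8, units_on_shelf=120),
    ShelfReading("bread-058", days_until_expiry=4, temperature_c=4.0, units_on_shelf=35),
]
print(markdown_candidates(readings))  # ['bananas-001', 'yogurt-274']
```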

Published Date : Jun 19 2018

SUMMARY :

Cindy Maike, VP Industry Solutions and GM Insurance and Healthcare at Hortonworks, explains how she builds the business case for data and analytics: start from the customer's strategy and financial goals, then find the data sources and analytical use cases that serve them, from telematics and connected ecosystems in insurance to sensor-driven shrinkage reduction in retail and precision medicine in healthcare, while treating stale or misused data as a potential liability.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
James Kobielus | PERSON | 0.99+
Rebecca Knight | PERSON | 0.99+
Rebecca Knight | PERSON | 0.99+
Cindy | PERSON | 0.99+
Hortonworks | ORGANIZATION | 0.99+
Cindy Maike | PERSON | 0.99+
General Motors | ORGANIZATION | 0.99+
General Data Protection act | TITLE | 0.99+
San Jose | LOCATION | 0.99+
10 | QUANTITY | 0.99+
Silicon Valley | LOCATION | 0.99+
San Jose, California | LOCATION | 0.99+
58 percent | QUANTITY | 0.99+
Arity | ORGANIZATION | 0.99+
GDPR | TITLE | 0.98+
20 years ago | DATE | 0.98+
On Star | ORGANIZATION | 0.98+
once a month | QUANTITY | 0.98+
GM Insurance | ORGANIZATION | 0.97+
theCUBE | ORGANIZATION | 0.97+
Data Works Summit 2018 | EVENT | 0.96+
one | QUANTITY | 0.96+
today | DATE | 0.96+
DataWorks Summit 2018 | EVENT | 0.95+
both | QUANTITY | 0.95+
10 years ago | DATE | 0.94+
VP Industry Solutions | ORGANIZATION | 0.94+
GMAC Insurance | ORGANIZATION | 0.92+
this morning | DATE | 0.9+
both data | QUANTITY | 0.84+
five | QUANTITY | 0.78+
20 years | QUANTITY | 0.75+
10 years | QUANTITY | 0.72+
Dataworks | ORGANIZATION | 0.59+
Data Works | TITLE | 0.59+
Best Review | TITLE | 0.54+
theCUBE | EVENT | 0.54+
State | ORGANIZATION | 0.49+

Peter Smails, ImanisData | DataWorks Summit 2018


 

>> Live from San Jose in the heart of Silicon Valley, it's the Cube. Covering Dataworks Summit 2018 brought to you by Hortonworks. (upbeat music) >> Welcome back to The Cube's live coverage of Dataworks here in San Jose, California. I'm your host Rebecca Knight along with my co-host James Kobielus. We're joined by Peter Smails. He is the vice president of marketing at Imanis Data. Thanks so much for coming on The Cube. >> Thanks for having me, glad to be here. >> So you've been in the data storage solution industry for a long time, but you're new to Imanis. What made you jump? What was it about Imanis? >> Yep, so very easy to answer that. It's a hot market. So essentially what Imanis is all about is we're an enterprise data management company. The reason I jumped here is because, if I put it in market context, if I take a small step back, here's what's happening. You've got your traditional application world, right? On prem, typically RDBMS-based applications; that's the old world. The new world is everybody's moving to microservices-based applications for IoT, for customer 360, for customer analysis, whatever you want. They're building these new modern applications. They're building those applications not on traditional RDBMSes; they're building them on microservices-based architectures built on top of Hadoop, or built on NoSQL databases. Those applications, as they go mainstream and they go into production environments, they require data management. They require backup. They require backup and recovery. They require disaster recovery. They require archiving, etc. They require the whole plethora of data management capabilities. Nobody's touching that market. It's a blue ocean. So, that's why I'm here. >> Imanis, as you were saying, is one of the greatest little companies no one's ever heard of. You've been around five years. (laughter) >> No, the company is not new. So, the thing that's exciting as a marketeer is that we're not sort of out there just pitching our wares as untested technology. We're getting into blue chip customers that people would die to get into, big, blue chip companies, because we're addressing a problem that's material. They roll out these new applications, they've got to have data management solutions for them. The company's been around five years. I've only been on board about a month, but what that's resulted in is that over the last five years they've had the opportunity, it's an enterprise product, and you don't build an enterprise product overnight. So they've had the last five years to really gestate the platform, gestate the technology, prove it in real-world scenarios. And now the opportunity for us as a company is we're doubling down from a marketing standpoint. We're doubling down from a sales infrastructure standpoint. So the timing's right to essentially put this thing on the map, make sure everybody does know exactly what we do. Because we're solving a real-world problem. >> You're backup and restore but much more. Lay out the broad set of enterprise data management capabilities that Imanis Data currently supports in your product portfolio, where you're going, and how you're evolving what you offer. >> Yeah, that's great. I love that question. So, the platform itself is this highly scalable distributed architecture. Okay, so we scale in multiple ways, and I'll come directly to your question. We scale in a number of different ways.
One is we're infinitely scalable just in terms of computational power, so we're built for big data by definition. Number two is we scale very well from a storage efficiency standpoint, so we can store very large volumes of data, which is a requirement. We also scale very much from the use case standpoint, so we support use cases throughout the life cycle. The one that gets all sort of the attention is obviously backup and recovery, because you have to protect your data. But if I look at it from a life cycle standpoint, our number one use case is test/dev. A lot of these organizations building these new apps want to spin up subsets of their data, 'cause they're supporting things like CI/CD; okay, so they want to be able to do rapid testing and such. >> Develop, DevOps, and stuff like that. >> Yeah, DevOps and so forth. So, they need test/dev. So we help them automate the process and orchestrate the process of test/dev, supporting things like sampling. I may have a one-petabyte dataset; I'm not going to do test/dev against that. I want to take 10 percent of that and spin that up, and I want to do some masking of personal, PII data. So we can do masking and sampling to support test/dev. We do backup and recovery. We do disaster recovery. So some customers, particularly in the big data space, they may for now say, well, I have replicas, so for some of this data it's not permanent data, it's transient data, but I do care about DR. So, DR is a key use case. We also do archiving. So if you just think of data through the life cycle, we support all of those. The piece in terms of where we're going, what's truly unique in addition to everything I just mentioned, is that we're the only data management platform that's machine learning based. Okay, so machine learning gets a lot of attention and all that type of stuff, but we're actually delivering machine learning-enabled capabilities today. >> And we discussed this before this interview. There's a bit of anomaly detection. How exactly are you using machine learning? What value does it provide to an enterprise data administrator? They have ML inside your tool. >> Inside our platform, great question. Very specifically, in the product we're delivering today there's a capability called ThreatSense. Okay, so the number one use case I mentioned is backup and recovery. So within backup and recovery, ThreatSense, with no user intervention whatsoever, will analyze your backups as they go forward. And what it will do is learn what a normal pattern looks like across, like, 50 different metrics. The details of which I couldn't give you right now, but essentially a whole bunch of different metrics that we look at to establish: this is what a normal baseline looks like for you, or for you, kind of thing. Great, that's number one. Number two is then we look and constantly analyze: is anything occurring that is knocking things outside of that, creating an anomaly? Does something fall outside of that? And when it does, we're notifying the administrators: you might want to look at this, something could've happened. So the value very specifically is around ransomware: typically one of the ways you're going to detect ransomware is you will see an anomaly in your backup set, because your data set will change materially. So we will be able to tell you... >> 'Cause somebody's holding it for ransom is what you're saying. >> Correct, so something's going to happen in your data pattern. >> You've lost data that should be there, or whatever it might be. >> Correct, it could be that you lost data, your change rate went way up, or something. >> Yeah, gotcha. >> There's any number of things that could trigger it. And then we let the administrator know it happened here. So today we don't then turn around and just automatically solve that, but to your point about where we're going: we've already broken the ice on delivering machine learning-enabled data management. >> That might indicate you want to checkpoint your backups to, like, a few days before this was detected, so at least you know what data is most likely missing. So yeah, I understand. >> Bingo, and that's exactly where we're going with that. As you can imagine, having a machine learning-powered data management platform at our core, there are many different ways we can go with that. When do I back up? What data do I back up? How do I create the optimal RTO and RPO? From a storage management standpoint, when do I put what data where? There's all kinds of, the whole library science of data management. The future of data management is machine learning based. There's too much data, and there's too much complexity for humans to handle on their own; you need to bring machine learning into the equation to help you harness the power of your data. We've broken the ice, we've got a long way to go, but we've got the platform to start with, and we've already introduced the first use case around this. And you can imagine all the places we can take this going forward. >> Very exciting. >> So you're the company that's using machine learning right now. What in your opinion will separate the winners from the losers? >> In terms of vendors, or in terms of the customers? >> Well, in terms of both. >> Yeah, let me answer that two ways, sort of inward/outward, in terms of how we are unique. We are very unique in that we're infinitely scalable; we are a single pane of glass for all of your distributed systems. We are very unique in terms of our multi-stage data reduction. And from a technology differentiation standpoint, we're the only vendor that's doing machine learning based stuff. >> Multi-stage data reduction, I want to break that down. What does that actually mean in practice? >> Sure, so we get the question frequently: is that compression or deduplication, or is there something else in there? >> There's a couple different things, actually. So why does that matter? A lot of customers will ask the question: well, by definition, NoSQL and Hadoop-based environments are all based on replicas, so why back things up? First of all, replication isn't backup. So that's lesson number one. Point-in-time backup is very different from replication; replication replicates bad data just as quickly as it replicates good. When you back up these very large data sets, you have to be incredibly efficient in how you do that. What we do with multi-stage data reduction is, one, we will do deduplication, we'll do variable-length deduplication, we will do compression, we will do erasure coding, but the other thing that we'll also do in there is what we call a global deduplication pool. So when we're deduping your data, we're actually deduping it against a very large data set. So there's value in, and this is where size matters, the size of the pool you're deduping against.
The larger the size of the data that I'm actually storing, the higher percentage of deduplication I can get, because I've got a bigger pool to reduce against. So the net result is we're incredibly efficient when you're talking about petabyte-scale data management. We're incredibly efficient, to the tune of easily 10X over traditional deduplication, and multiple X over technologies that are more current, if you will. So back to your question: we are confident that we have a very strong head start. Our opportunity now is that we've got to drive why we're here; we've got to drive awareness. We've got to make sure everybody knows who we are, how we're unique, and how we're different. And you guys are great; love being on The Cube. From a customer standpoint, the customers are going to win, and this is sort of a cliche, but it's true: the customers that best harness their data, they're the ones that are going to win. They're going to be more competitive, they're going to be able to find ways to be differentiated, and the only way they're going to do that is to make the appropriate investments in their data infrastructure, in their data lakes, in their data management tools, so that they can harness all that data. >> Where do you see the future of your Hortonworks partnership going? >> So we support a broad ecosystem, and Hortonworks is just as important as any of our other data source partners. Where we see that unfolding is, we play an important part in, we feel our value, let me put it that way. We feel our value in helping Hortonworks is that as more and more organizations go mainstream with these applications, these are not corner cases anymore. This is not sort of in the lab. This is the real deal: mainstream enterprises running business-critical applications. The value we bring is that you're not going to rely on those platforms without an enterprise data management solution that delivers what we deliver. So there's all kinds of ways we can go to market together, but net-net, our value there is that we provide a very important enterprise data management capability for customers that are deploying in these business-critical environments. >> Great. >> Very good. As more of the data gets persisted out at the edge, in devices and the Internet of Things and so forth, what are the challenges in terms of protecting that data, backup and restore, deduplication, and so forth, and to what extent is your company, Imanis Data, maybe addressing those kinds of more distributed data management requirements going forward? Do you see that on the rise? Are you hearing that from customers, that they want to do more of that, more of an edge-cloud environment, or is that way too far in the future? >> I don't think it's way too far in the future, but I do think there's an inside-out progression. So my position on that is that it's not that there isn't edge work going on. What I would contend is that the big problem right now, from an enterprise mainstreaming standpoint, is more getting the house in order, just your core house in order, as you move from sort of a traditional four-wall data center to a hybrid cloud environment. Maybe not quite at the edge yet; it's a combination of how do I leverage on prem and the cloud, so to speak, and how do I get the core data lake, in the case of Hortonworks, how do I get that big data lake sorted out?
You're touching on, I think, a longer discussion, which is: where is the analysis going on? Where is the data going to persist? You know, where do you do some of that computational work? So you get all this information out at the edge. Does all that information end up going into the data lake? Do you move the storage to where the lake is? Do you start pushing some of the lake functionality out to the edge, where you then have to start doing some of that work? So it's a much more complicated discussion. >> I know we had this discussion over lunch. This may be outside your wheelhouse, but let me just ask it anyway. We've seen, at Wikibon, I cover AI and distributed training and distributed inference and things, so the edges are capturing the data, and more and more there's a trend toward performing local training of their models, their embedded models, from the data they capture. But quite often edge devices don't have a ton of storage, and they're not going to retain that data long. Some of that data will need to be archived, will need to be persisted and managed as a core resource. So we see that kind of requirement, maybe not now, but in a few years' time: distributed training, and persistence and protection of that data, becoming a mainstream enterprise requirement, where AI and machine learning, the whole pipeline, is a concern. Like I said, that's probably outside you guys' wheelhouse, probably outside the realm for your customers, but that kind of thing is coming, as the likes of Hortonworks and IBM and everybody else are starting to look at it and implement it: containerization of analytics and data management out to all these micro devices. >> Yes, and I think you're right there. And to your point, we're kind of going where the data is, if you will, in volumes, kind of thing, and it's going in that direction. And frankly, where we see that happening, that's where the cloud plays a big role as well, because there's edge, but how do you get to the edge? You can get to the edge through the cloud. So, again, we run on AWS. We run on GCP, we run on Azure. So, to be clear, in terms of the data we can protect, we've got a broad portfolio, a broad ecosystem of Hadoop-based big data sources that we support, as well as NoSQL. If they're running on AWS or GCP or Azure, we support ADLS, we support Azure's data lake stuff, HDInsight, we support a whole bunch of different things, both from a cloud standpoint and on prem, which is where we're seeing some of that edge work happening. >> Great, well Peter, thank you so much for coming on The Cube. It's always a pleasure to have you on. >> Yes, thanks for having me, and I look forward to being back sometime soon. >> We'll have you. >> Thank you both. >> When the time is right. >> Indeed, we will have more from The Cube's live coverage of Dataworks just after this. (upbeat music)
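The ThreatSense-style capability Peter describes, learning a baseline across per-backup metrics and flagging departures from it, is a classic unsupervised anomaly detection setup. Here is a simplified sketch using scikit-learn's IsolationForest; the metrics and values are invented for illustration, and Imanis Data's actual model is not public.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Historical backups: [changed_files_pct, dedup_ratio, backup_size_gb]
history = np.array([
    [2.1, 8.9, 412], [1.8, 9.1, 415], [2.4, 8.7, 418],
    [2.0, 9.0, 421], [1.9, 8.8, 423], [2.2, 9.2, 426],
])

# Learn the "normal" baseline from past backups, no labels required.
model = IsolationForest(contamination=0.01, random_state=0).fit(history)

# Today's backup: nearly every file changed and dedup collapsed, classic
# symptoms of data that has been encrypted in place by ransomware.
todays_backup = np.array([[96.0, 1.1, 780]])
if model.predict(todays_backup)[0] == -1:   # -1 means outlier
    print("Anomalous backup detected, notify the administrator")
```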
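The global deduplication pool Peter mentions is likewise easy to see in miniature: chunk incoming backup data, fingerprint each chunk, and store a chunk only if the pool has never seen that fingerprint. Real systems use variable-length chunking, as he notes; the fixed-size version below keeps the sketch short.

```python
import hashlib

CHUNK = 4096
pool = {}            # fingerprint -> chunk (the shared, global dedup pool)

def store(data: bytes):
    """Return the list of fingerprints that reconstruct `data`."""
    recipe = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        fp = hashlib.sha256(chunk).hexdigest()
        pool.setdefault(fp, chunk)     # store the chunk only if it is new
        recipe.append(fp)
    return recipe

backup1 = store(b"A" * 8192 + b"B" * 4096)
backup2 = store(b"A" * 8192 + b"C" * 4096)   # shares two chunks with backup1
print(len(pool))  # 3 unique chunks stored instead of 6
```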

Published Date : Jun 19 2018

SUMMARY :

Peter Smails of Imanis Data describes the company's enterprise data management platform for Hadoop and NoSQL environments: use cases spanning test/dev, backup and recovery, disaster recovery, and archiving; the machine learning-based ThreatSense capability that baselines backup metrics and flags ransomware-like anomalies; multi-stage data reduction with a global deduplication pool; and support for AWS, GCP, and Azure as customers move to hybrid cloud.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
James Kobielus | PERSON | 0.99+
Rebecca Knight | PERSON | 0.99+
Peter Smails | PERSON | 0.99+
Hortonworks | ORGANIZATION | 0.99+
IBM | ORGANIZATION | 0.99+
Peter | PERSON | 0.99+
Imanis | ORGANIZATION | 0.99+
10 percent | QUANTITY | 0.99+
Silicon Valley | LOCATION | 0.99+
San Jose | LOCATION | 0.99+
today | DATE | 0.99+
San Jose, California | LOCATION | 0.99+
50 different metrics | QUANTITY | 0.99+
both | QUANTITY | 0.99+
AWS | ORGANIZATION | 0.99+
two ways | QUANTITY | 0.99+
one | QUANTITY | 0.99+
Test Def | TITLE | 0.98+
about a month | QUANTITY | 0.98+
Asher | ORGANIZATION | 0.98+
Imanis Data | ORGANIZATION | 0.97+
Wikibon | ORGANIZATION | 0.97+
around five years | QUANTITY | 0.96+
10 X | QUANTITY | 0.95+
One | QUANTITY | 0.94+
Dataworks Summit 2018 | EVENT | 0.94+
Dev Opps | TITLE | 0.94+
DataWorks Summit 2018 | EVENT | 0.94+
one petabyte | QUANTITY | 0.93+
The Cube | ORGANIZATION | 0.93+
First | QUANTITY | 0.92+
Imanis | PERSON | 0.91+
ImanisData | ORGANIZATION | 0.89+
single pane | QUANTITY | 0.87+
Number two | QUANTITY | 0.86+
FEDOOP | TITLE | 0.84+
first use case | QUANTITY | 0.81+
last five years | DATE | 0.76+
GCP | TITLE | 0.65+
number one | QUANTITY | 0.62+
couple | QUANTITY | 0.6+
Dataworks | ORGANIZATION | 0.59+
CICD | TITLE | 0.55+
HD Inside | ORGANIZATION | 0.55+
days | DATE | 0.55+
ADLS | ORGANIZATION | 0.5+
Test | TITLE | 0.47+
IOT | TITLE | 0.34+
Cube | ORGANIZATION | 0.27+

Dan Potter, Attunity & Ali Bajwa, Hortonworks | DataWorks Summit 2018


 

>> Live from San Jose in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018, brought to you by Hortonworks. >> Welcome back to theCUBE's live coverage of DataWorks here in sunny San Jose, California. I'm your host Rebecca Knight along with my co-host James Kobielus. We're joined by Dan Potter. He is the VP Product Management at Attunity and also Ali Bajwa, who is the principal partner solutions engineer at Hortonworks. Thanks so much for coming on theCUBE. >> Pleasure to be here. >> It's good to be here. >> So I want to start with you, Dan, and have you tell our viewers a little bit about the company based in Boston, Massachusetts, what Attunity does. >> Attunity, we're a data integration vendor. We are best known as a provider of real-time data movement from transactional systems into data lakes, into clouds, into streaming architectures, so it's a modern approach to data integration. So as these core transactional systems are being updated, we're able to take those changes and move those changes where they're needed, when they're needed, for analytics, for new operational applications, for a variety of different tasks. >> Change data capture. >> Change data capture is the heart of our-- >> They are well known in this business. They have change data capture. Go ahead. >> We are. >> So tell us about the announcement today that Attunity has made at the Hortonworks-- >> Yeah, thank you, it's a great announcement because it showcases the collaboration between Attunity and Hortonworks, and it's all about taking the metadata that we capture in that integration process. So we're a piece of a data lake architecture. As we are capturing changes from those source systems, we are also capturing the metadata, so we understand the source systems, we understand how the data gets modified along the way. We use that metadata internally, and now we've built extensions to share that metadata into Atlas and to be able to extend that out through Atlas to higher-level data governance initiatives, so Data Steward Studio, into the DataPlane Services, so it's really important to be able to take the metadata that we have and to add to it the metadata that's from the other sources of information. >> Sure, and more of the transactional semantics that Hortonworks has been describing, they've baked into HDP, in your overall portfolios. Is that true? I mean, that supports those kinds of requirements. >> With HDP, what we're seeing is, you know, the EDW optimization play has become more and more important for a lot of customers as they try to optimize the data that their EDWs are working on, so it really gels well with what we've done here with Attunity, and then on the Atlas side, with the integration on the governance side, with GDPR and other sorts of regulations coming into play now, you know, those sorts of things are becoming more and more important, you know, specifically around the governance initiative. We actually have a talk just on Thursday morning where we're actually showcasing the integration as well. >> So can you talk a little bit more about that for those who aren't going to be there on Thursday. GDPR was really a big theme at the DataWorks Berlin event, and now we're in this new era and it's not talked about too, too much, I mean we-- >> And global businesses who have operations in the EU, but also all over the world, are trying to be systematic and consistent about how they manage PII everywhere.
So GDPR, though an EU regulation, really in many ways is having ripple effects across the world in terms of practices. >> Absolutely, and at the heart of understanding how you protect yourself and comply, I need to understand my data, and that's where metadata comes in. So having a holistic understanding of all of the data that resides in your data lake or in your cloud, metadata becomes a key part of that. And also in terms of enforcing that, if I understand my customer data, where the customer data comes from, the lineage of that, then I'm able to apply the protections of the masking on top of that data. So the GDPR effect has, you know, created a broad-scale need for organizations to really get a handle on metadata, so the timing of our announcement just works real well. >> And one nice thing about this integration is that, you know, it's not just about being able to capture the data in Atlas, but now with the integration of Atlas and Ranger, you can do enforcement of policies based on classifications as well. You can tag data as PCI, PII, personal data, and that can get enforced through Ranger to say, hey, only certain admins can access certain types of data, and now all that becomes possible once we've taken the initial steps of the Atlas integration. >> So with this collaboration, and it's really deepening an existing relationship, how do you go to market? How do you collaborate with each other and then also service clients? >> You want to? >> Yeah, so from an engineering perspective, we've got deep roots in terms of being a first-class provider into the Hortonworks platform, both HDP and HDF. Last year about this time, we announced our support for ACID merge capabilities, so the leading-edge work that Hortonworks has done in bringing ACID compliance capabilities into Hive was a really important one, so our change data capture capabilities are able to feed directly into that and support those extensions. >> Yeah, we have a lot of, you know, really key customers together with Attunity, and, you know, maybe as a result of that they are actually our ISV of the Year as well, which they probably showcase at their booth there. >> We're very proud of that. Yeah, no, it's a nice honor for us to get that distinction from Hortonworks, and it's also a proof point of the collaboration that we have commercially. You know, our sales reps work hand in hand. When we go into a large organization, we both sell to very large organizations. These are big transformative initiatives for these organizations, and they're looking for solutions, not technologies, so the fact that we can come in and show the proof points from other customers that are successfully using our joint solution, that's critical. >> And I think it helps that they're integrating with some of our key technologies because, you know, that's where our sales force and our customers really see it, as well as that's where we're putting in the investment and that's where these guys are also investing, so it really, you know, helps the story together. So with Hive, we're doing a lot of investment in making it closer and closer to a sort of real-time database, where you can combine historical insights as well as your, you know, real-time insights, with the new ACID merge capabilities where you can do the inserts, updates and deletes, and so that's exactly what Attunity's integrating with, along with Atlas.
We're doing a lot of investments there, and that's exactly what these guys are integrating with. So I think our customers and prospects really see that, and that's where all the wins are coming from. >> Yeah, and I think together there were two main barriers that we saw in terms of customers getting the most out of their data lake investment. One of them was, as I'm moving data into my data lake, I need to be able to put some structure around this, I need to be able to handle continuously updating data from multiple sources, and that's what we introduced with Attunity Compose for Hive, building out the structure in an automated fashion so I've got analytics-ready data, and using the ACID merge capabilities just made those updates much easier. The second piece was metadata. Business users need to have confidence in the data that they're using. Where did this come from? How was it modified? And overcoming both of those is really helping organizations make the most of those investments. >> How would you describe customer attitudes right now in terms of their approach to data? Because, I mean, as we've talked about, data is the new oil, so there's a real excitement and there's a buzz around it, and yet there are also so many high-profile cases of breaches and security concerns, so what would you say, are customers more excited or are they more trepidatious? How would you describe the CIO mindset right now? >> So I think security and governance have become top of mind, right? In more and more of the surveys that we've taken with our customers, right, you know, more and more customers are more concerned about security, they're more concerned about governance. The joke is that we talk to some of our customers and they keep talking to us about Atlas, which is sort of one of the newer offerings on governance that we have, but then we ask, "Hey, what about Ranger for enforcement?" And they're like, "Oh, yeah, that's a standard now." So we have Ranger; now it's a question of, you know, how do we get our hooks into Atlas and all that kind of stuff, so yeah, definitely, as you mentioned, because of GDPR, because of all these kinds of issues that have happened, it's definitely become top of mind. >> And I would say the other side of that is there's real excitement as well about the possibilities. Now bringing together all of this data, AI, machine learning, real-time analytics and real-time visualization. There are analytic capabilities now that organizations have never had, so there's great excitement, but there's also trepidation. You know, how do we solve for both of those? And together, we're doing just that. >> But as you mentioned, if you look at Europe, some of the European companies that are more hit by GDPR, they're actually excited that now they can, you know, really get to understand their data more and do better things with it as a result of, you know, the GDPR initiative. >> Absolutely. >> Are you using machine learning inside of Attunity in a Hortonworks context to find patterns in that data in real time? >> So we enable data scientists to build those models.
So we're not only bringing the data together but, again, part of the announcement last year is the way we structure that data in Hive. We provide a complete historic data store, so every single transaction that has happened, and we send those transactions as they happen, as a big append. So if you're a data scientist, I want to understand the complete history of the transactions of a customer to be able to build those models, so building those out in Hive and making those analytics-ready in Hive, that's what we do, so we're a key enabler to machine learning. >> Making it analytics-ready rather than doing the analytics in the stream, yeah. >> Absolutely. >> Yeah, the other side to that is that because they're integrated with Atlas, you know, now we have a new capability called DataPlane and Data Steward Studio, so the idea there is around multi-everything. More and more customers have multiple clusters, whether it's on-prem or in the cloud, so now more and more customers are looking at how do I get a single pane of glass view across all my data, whether it's on-prem, in the cloud, whether it's IoT, whether it's data at rest, right? So that's where DataPlane comes in, and with Data Steward Studio, which is our second offering on top of DataPlane, they can kind of get that view across all their clusters. So as soon as, you know, the data lands from Attunity into Atlas, you can get a view into that as a part of Data Steward Studio, and one of the nice things we do in Data Steward Studio is that we also have machine learning models to do some profiling, to figure out that, hey, this looks like a credit card, so maybe I should suggest this as a tag of sensitive data, and now the end administrator has the option of, you know, saying okay, yeah, this is a credit card, I'll accept that tag, or they can reject that and pick one of their own. >> Will any of the Attunity CDC change data capture capability, going forward, be containerized for deployment to the edges in HDP 3.0? I mean, 'cause it seems, for internet of things, edge analytics and so forth, change data capture, is it absolutely necessary to make the entire, some call it the fog computing, cloud or whatever, a completely transactional environment for all applications from micro endpoint to micro endpoint? Are there any plans to do that going forward? >> Yeah, so I think with HDP 3.0, as you mentioned, right, one of the key factors that was coming into play was around time to value, so with containerization now being able to bring third-party apps on top of YARN through Docker, I think that's definitely an avenue that we're looking at. >> Yes, we're excited about that with 3.0 as well, so that's definitely in the cards for us. >> Great, well Ali and Dan, thank you so much for coming on theCUBE. It's fun to have you here. >> Nice to be here, thank you guys. >> Great to have you. >> Thank you, it was a pleasure. >> I'm Rebecca Knight, for James Kobielus, we will have more from DataWorks in San Jose just after this. (techno music)
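To ground the discussion above: Dan describes change data capture feeding Hive's ACID merge. A minimal, hypothetical sketch of that pattern follows; it applies a staged batch of change rows to a transactional Hive table with a MERGE statement, submitted through PyHive. The table, columns, staging table, and host are invented, and this is the general technique, not Attunity's actual implementation.

```python
# Hypothetical sketch: apply staged CDC rows to a transactional Hive table.
from pyhive import hive

MERGE_SQL = """
MERGE INTO customers AS t
USING cdc_staging AS s
ON t.customer_id = s.customer_id
WHEN MATCHED AND s.op = 'D' THEN DELETE
WHEN MATCHED THEN UPDATE SET name = s.name, email = s.email
WHEN NOT MATCHED THEN INSERT VALUES (s.customer_id, s.name, s.email)
"""

conn = hive.connect(host="hive-server.example.com", port=10000)  # placeholder host
cur = conn.cursor()
cur.execute(MERGE_SQL)  # target table must be ACID (transactional) for MERGE
conn.close()
```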
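The Atlas-to-Ranger enforcement flow Ali mentions, tagging data as PII so a Ranger tag-based policy can restrict access, can be sketched the same way. The call below uses Atlas's v2 REST endpoint for attaching a classification to an entity; the host, credentials, and entity GUID are placeholders, and the example assumes a PII classification type has already been defined.

```python
# Hypothetical sketch: classify a Hive column entity as PII in Apache Atlas
# so that a Ranger tag-based policy can enforce who may read it.
import requests

ATLAS = "http://atlas.example.com:21000"         # placeholder Atlas endpoint
AUTH = ("admin", "admin")                        # placeholder credentials
ENTITY_GUID = "3f2c9a7e-0000-0000-0000-example"  # GUID of the column entity

resp = requests.post(
    f"{ATLAS}/api/atlas/v2/entity/guid/{ENTITY_GUID}/classifications",
    json=[{"typeName": "PII"}],  # Ranger enforcement keys off this tag
    auth=AUTH,
)
resp.raise_for_status()
```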

Published Date : Jun 19 2018


Eric Herzog, IBM | DataWorks Summit 2018


 

>> Live from San Jose in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018, brought to you by Hortonworks. >> Welcome back to theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, Rebecca Knight, along with my co-host, James Kobielus. We have with us Eric Herzog. He is the Chief Marketing Officer and VP of Global Channels at the IBM Storage Division. Thanks so much for coming on theCUBE once again, Eric. >> Well, thank you. We always love to be on theCUBE and talk to all of theCUBE analysts about various topics, data, storage, multi-cloud, all the works. >> And before the cameras were rolling, we were talking about how you might be the biggest CUBE alum, in the sense of you've been on theCUBE more times than anyone else. >> I know I'm in the top five, but I may be number one; I have to check with Dave Vellante and crew and see. >> Exactly, and often wearing a Hawaiian shirt. >> Yes. >> Yes, I was on theCUBE last week from Cisco Live. I was not wearing a Hawaiian shirt. And Stu and John gave me a hard time about why was I not wearing a Hawaiian shirt? So I made sure I showed up to the DataWorks show- >> Stu, Dave, get a load. >> You're in California with a tan, so it fits, it's good. >> So we were talking a little bit before the cameras were rolling, and you were saying one of the points that is sort of central to your professional life is that it's not just about the storage, it's about the data. So riff on that a little bit. >> Sure, so at IBM we believe everything is data driven, and in fact we would argue that data is more valuable than oil or diamonds or plutonium or platinum or silver or anything else. It is the most valuable asset, whether you be a global Fortune 500, whether you be a midsize company or whether you be Herzog's Bar and Grill. Data is what you use with your suppliers, with your customers, with your partners. Literally everything around your company is really built around the data, so it's about most effectively managing it and making sure, A, it's always performant, because when it's not performant, they go away. As you probably know, Google did a survey showing that after a second or two of delay, people go off your website and click somewhere else, so it has to be performant. Obviously, in today's 7-by-24, 365 company, it needs to always be resilient and reliable, and it always needs to be available, because otherwise, if the storage goes down, guess what? Your AI doesn't work, your cloud doesn't work; whatever the workload, or if you're more traditional, your Oracle, SQL, you know, SAP, none of those workloads work if you don't have a solid storage foundation underneath your data-driven enterprise. >> So with that ethos in mind, talk about the products that you are launching, that you newly launched, and also your product roadmap going forward. >> Sure, so for us, storage really is the critical foundation for the data-driven, multi-cloud enterprise. And as I've said before on theCUBE, all of our storage software is now cloud-ified, so if you need to automatically tier out to IBM Cloud or Amazon or Azure, we automatically will move the data placement around, from on-premise out to a cloud, and for certain customers who may be multi-cloud, in this case using multiple private cloud providers, which happens due to either legal reasons or procurement reasons or geographic reasons for the larger enterprises, we can handle that as well.
That's part of it; the second thing is we just announced earlier today an artificial intelligence, an AI reference architecture, that incorporates a full stack from the very bottom, both servers and storage, all the way up through the top layer, then the applications on top, so we just launched that today. >> AI for storage management, or AI for running a range of applications? >> Regular AI, artificial intelligence from an application perspective. So we announced that reference architecture today. Basically, think of the reference architecture as your recipe, your blueprint, of how to put it all together. Some of the components are from IBM, such as Spectrum Scale and Spectrum Computing from my division, our servers from our Cloud division. Some are open source, TensorFlow, Caffe, things like that. Basically it gives you what the stack needs to be, and what you need to do in various AI workloads, applications and use cases. >> I believe you have distributed deep learning as an IBM capability, that's part of that stack, is that correct? >> That is part of the stack, it's like in the middle of the stack. >> Is it, correct me if I'm wrong, that's containerization of AI functionality? >> Right. >> For distributed deployment? >> Right. >> In an orchestrated Kubernetes fabric, is that correct? >> Yeah, so when you look at it from an IBM perspective, while we clearly support the virtualized world, the VMwares, the Hyper-Vs, the KVMs and the OVMs, and we will continue to do that, we're also heavily invested in the container environment. For example, one of our other divisions, the IBM Cloud Private division, has announced a solution that's all about private clouds; you can either get it hosted at IBM or literally buy our stack- >> Rob Thomas in fact demoed it this morning, here. >> Right, exactly. And you could create- >> At DataWorks. >> A private cloud initiative, and there are companies that, whether it be for security purposes or whether it be for legal reasons or other reasons, don't want to use public cloud providers, be it IBM, Amazon, Azure, Google or any of the big public cloud providers; they want a private cloud, and IBM either, A, will host it or, B, with IBM Cloud Private. All of that infrastructure is built around a containerized environment. We support the older world, the virtualized world, and the newer world, the container world. In fact, our storage allows you to have persistent storage in a container environment, Docker and Kubernetes, and that works on all of our block storage, and that's a freebie, by the way, we don't charge for that. >> You've worked in the data storage industry for a long time; can you talk a little bit about how the marketing message has changed and evolved since you first began in this industry, in terms of what customers want to hear and what assuages their fears? >> Sure, so nobody cares about speeds and feeds, okay? Except me, because I've been doing storage for 32 years. >> And him, he might care. (laughs) >> But when you look at it, the decision makers today, the CIOs, in 32 years, including seven startups, IBM and EMC, I've never, ever, ever met a CIO who used to be a storage guy, ever. So they don't care. They know that they need storage and the other infrastructure, including servers and networking, but think about it, when the app is slow, who do they blame? Usually they blame the storage guy first, secondarily they blame the server guy, thirdly they blame the networking guy. They never look to see that their code stack is improperly done.
Really what you have to do is talk applications, workloads and use cases, which is what the AI reference architecture does. What my team does in non-AI workloads, it's all about, again, data-driven, multi-cloud infrastructure. They want to know how you're going to make a new workload like AI fast. How you're going to make their cloud resilient, whether it's private or hybrid. In fact, IBM Storage sells a ton of technology to large public cloud providers that do not have the initials IBM. We sell gobs of storage to other public cloud providers, big, medium and small. It's really all about the applications, workloads and use cases, and that's what gets people excited. You basically need a position; just like I talked about with the AI foundations, storage is the critical foundation. We happen to be, knocking on wood, let's hope there's no earthquake, since I've lived here my whole life, and I've been in earthquakes, I was in the '89 quake. I literally fell down a bunch of stairs in the '89 quake. If there's an earthquake, as great as IBM storage is, or any other storage or servers, it's crushed. Boom, you're done! Okay, well, you need to make sure that your infrastructure, really your data, is covered by the right infrastructure, and that it's always resilient, it's always performing and it's always available. And that's what IBM Storage is about, that's the message, not about how many gigabytes per second of bandwidth or what's the- Not that we can't spew that stuff when we talk to the right person, but in general people don't care about it. What they want to know is, "Oh, that SAP workload took 30 hours and now it takes 30 minutes?" We have public references that will say that. "Oh, you mean I can use eight to ten times less storage for the same money?" Yes, and we have public references that will say that. So that's what it's really about; storage has moved on from being a speeds-and-feeds sort of thing, and now all the speeds-and-feeds people are doing AI and Caffe and TensorFlow and all of that, they're all hackers, right? It used to be storage guys who did that, and to a lesser extent server guys, and definitely networking guys. That's all shifted to the software side, so you've got to talk the languages. What can we do with Hortonworks? By the way, we were named in Q1 of 2018 as the Hortonworks infrastructure partner of the year. We work with Hortonworks all the time, at all levels, whether it be with our channel partners, whether it be with our direct end users, however the customer wants to consume. We work with Hortonworks very closely, and other providers as well, in that big data analytics and AI infrastructure world; that's what we do. >> So the containerization side of the IBM AI stack, and the containerization capabilities in Hortonworks Data Platform 3.0, can you give us a sense for how you plan to, or do you plan at IBM, to work with Hortonworks to bring these capabilities, your reference architecture, into more, or bring their environment for that matter, into more of an alignment with what you're offering? >> So we haven't made an exact decision on how we're going to do it, but we interface with Hortonworks on a continual basis. >> Yeah.
>> We're working to figure out what the right solution is, whether that be an integrated solution of some type, whether that be something that we do through an adjunct to our reference architecture, or some reference architecture that they have. But we always make sure, again, we are their partner of the year for infrastructure, named in Q1, and that's because we work very tightly with Hortonworks and make sure that what we do ties out with them, hits the right applications, workloads and use cases, the big data world, the analytic world and the AI world, so that we're tied off, you know, together, to make sure that we deliver the right solutions to the end user. Because that's what matters most, what gets the end users fired up, not what gets Hortonworks or IBM fired up; it's what gets the end users fired up. >> When you're trying to get into the head space of the CIO and get your message out there, I mean, what is it, what would you say it is that keeps them up at night? What are their biggest pain points, and then how do you come in and solve them? >> I'd say the number one pain point for most CIOs is application delivery, okay? Whether that be to the line of business; put it this way, let's take an old workload, okay? Let's take that SAP example. That CIO was under pressure because they were trying, in this case it was a giant retailer who was shipping stuff every night, all over the world. Well, guess what? The green undershirts in the wrong size went to Paducah, Kentucky, and then one of the other stores, in Singapore, which needed those green shirts, ended up with shoes, and the reason is they couldn't run that SAP workload in a couple of hours. Now they run it in 30 minutes. It used to take 30 hours. So since they're shipping every night, you're basically missing a cycle, essentially, and you're not delivering the right thing, from a retail infrastructure perspective, to each of their nodes, if you will, their retail locations. So they care about what they need to do to deliver to the business the right applications, workloads and use cases on the right timeframe, and they can't go down. People get fired for that at the CIO level, right? If something goes down, the CIO is gone, and obviously for certain companies that are more in the modern mode, okay, people who are delivering stuff and whose primary transactional vehicle is the internet, not retail, not through partners, not through people like IBM, but a website; if that website is not resilient, performant and always reliable, then guess what? They are shut down and they're not selling anything to anybody. Which is not true if you're Nordstrom, right? Someone can always go into the store and buy something, right, and figure it out? Almost all old retailers have not only a connection to the core, but they literally have a server and storage in every retail location, so if the core goes down, guess what, they can still transact. In the era of the internet, you don't do that anymore, right? If you're shipping only on the internet, you're shipping on the internet. So whether it be a new workload or an old workload, or if you're doing the whole IoT thing. For example, I know a company I was working with, a giant, private mining company. They have those giant, three-story dump trucks you see on the Discovery Channel. Those things cost them a hundred million dollars, so they have five thousand sensors on every dump truck.
It's a fricking dump truck, but guess what, they've got five thousand sensors on there so they can monitor and make sure they take proactive action, because if that goes down, whether these be diamond mines or uranium mines or whatever it is, it costs them hundreds of millions of dollars to have a thing go down. That's, if you will, taking it out of the traditional high tech area, which we all talk about, whether it be Apple or Google or IBM, okay great, now let's put it to some other workload. In this case, this is the use of IoT, in a big data analytics environment with AI-based infrastructure, to manage dump trucks. >> I think you're talking about what's called "digital twins" in a networked environment for materials management, supply chain management and so forth. Are those requirements growing in terms of industrial IoT requirements of that sort, and how does that affect the amount of data that needs to be stored, the sophistication of the AI and the stream computing that needs to be provisioned? Can you talk to that? >> The amount of data is growing exponentially. It's growing at yottabytes and zettabytes a year now, not just exabytes anymore. In fact, everybody's on their iPhone or their laptop; I've got a 10GB phone, okay? My laptop, which happens to be a PowerBook, is two terabytes of flash, on a laptop. So just imagine how much data's being generated in a giant factory, whether you be in the warehouse space, whether you be in healthcare, whether you be in government, whether you be in the financial sector, and now with all those additional regulations, such as GDPR in Europe and other regulations across the world about what you have to do with your healthcare data, what you have to do with your finance data, the amount of data being stored grows. And then on top of it, quite honestly, from an AI big data analytics perspective, the more data you have, the more valuable it is, the more you can mine it. It's as if the world ran on oil alone; forget the pollution side, let's assume oil didn't cause pollution. Okay, great, then guess what? You would be using oil everywhere and you wouldn't be using solar, you'd be using oil, and by the way you'd need more and more and more, and how much oil you have and how you control it would be the power. That right now is the power of data, and if anything it's getting more and more and more. So again, you always have to be resilient with that data, you always have to interact with things, like we do with Hortonworks or other application workloads. Our AI reference architecture is another perfect example of the things you need to do to provide, you know, at the base infrastructure, the right foundation. If you have the wrong foundation to a building, it falls over. Whether it be your house, a hotel, this convention center, if it had the wrong foundation, it falls over. >> Actually, to follow the oil analogy just a little bit further, the more of this data you have, the more PII there is in it, usually, and the more the workloads need to scale up, especially for things like data masking. >> Right. >> When you have compliance requirements like GDPR, you want to process the data but you need to mask it first; therefore you need clusters that conceivably are optimized for high-volume, highly scalable masking in real time, to drive the downstream app, to feed the downstream applications and to feed the data scientists, you know, data lakes, whatever, and so forth and so on?
>> That's why you need things like incredible compute, which IBM offers with the Power platform. And why you need storage that, again, can scale up. >> Yeah. >> It can get as big as you need it to be. For example, in our reference architecture we use what we call Spectrum Scale, which is a big data analytics workload performance engine, multi-threaded, multi-tasking. In fact, one of the largest banks in the world, if you happen to bank with them, your credit card fraud detection is being done on our stuff, okay? But at the same time we have what's called IBM Cloud Object Storage, which is an object store. You want to take every one of those searches for fraud, and when they find out that no one stole my MasterCard or my Visa, you still want to put it in there, because then you mine it later and see patterns of how people are trying to steal stuff, because it's all being done digitally anyway. You want to be able to do that. So you, A, want to handle it very quickly and resiliently, but then you want to be able to mine it later, as you said, mining the data. >> Or do high-value anomaly detection in the moment, to be able to tag the more anomalous data that you can then sift through later, or maybe act on in the moment for real-time mitigation. >> Well, that's highly compute intensive, it's AI intensive and it's highly storage intensive on the performance side, and then what happens is you store it all for, let's say, further analysis so you can tell people, "When you get your Amex card, do this and they won't steal it." Well, the only way to do that is you use AI on this ocean of data, where you're analyzing all this fraud that has happened, to look at patterns, and then you tell me, as a consumer, what to do. Whether it be in the financial business, in this case the credit card business, healthcare, government, manufacturing. One of our resellers actually developed an AI-based tool that can scan boxes and cans for faults on an assembly line, and has actually sold it to a beer company and to a soda company, so that instead of people looking at the cans, like you see on the Food Channel, to pull them off, guess what? It's all automatically done. There are no people pulling the can off, "Oh, that can is damaged," and looking at it, and by the way, sometimes they slip through. Now, using cameras and this AI-based infrastructure from IBM, with our storage underneath the hood, they're able to do this. >> Great. Well Eric, thank you so much for coming on theCUBE. It's always been a lot of fun talking to you. >> Great, well thank you very much. We love being on theCUBE and appreciate it, and hope everyone enjoys the DataWorks conference. >> We will have more from DataWorks just after this. (techno beat music)
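Eric's five-thousand-sensor dump truck boils down to a simple monitoring pattern: compare each new reading against its recent history and flag outliers so maintenance can be scheduled before a failure. A minimal, self-contained sketch, with invented readings and an assumed z-score threshold:

```python
# Toy proactive-monitoring check: flag a reading that drifts far from history.
from statistics import mean, stdev

def is_anomalous(history, latest, z_threshold=3.0):
    """True if `latest` sits more than z_threshold sigmas from the history."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > z_threshold

# e.g. hydraulic pressure samples from one sensor on the truck
readings = [101.2, 100.8, 101.5, 100.9, 101.1]
print(is_anomalous(readings, 140.0))  # True -> schedule proactive maintenance
```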
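And the fraud workflow he describes, score each transaction in the moment and then land everything in an object store for later pattern mining, might look like the sketch below. The endpoint, bucket, and toy scoring rule are placeholders; an S3-style client is used because IBM Cloud Object Storage exposes an S3-compatible API.

```python
# Hypothetical score-then-archive pattern for card transactions.
import json
import boto3

cos = boto3.client(
    "s3",
    endpoint_url="https://s3.example-object-store.com",  # placeholder endpoint
)

def looks_fraudulent(txn):
    # stand-in for a real trained model
    return txn["amount"] > 10_000 or txn["country"] != txn["home_country"]

def handle(txn):
    txn["flagged"] = looks_fraudulent(txn)
    cos.put_object(
        Bucket="txn-archive",  # placeholder bucket
        Key=f"transactions/{txn['id']}.json",
        Body=json.dumps(txn).encode("utf-8"),  # keep everything for later mining
    )
    return txn["flagged"]

print(handle({"id": "t1", "amount": 12_500, "country": "FR", "home_country": "US"}))
```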

Published Date : Jun 19 2018


Tendü Yogurtçu, Syncsort | DataWorks Summit 2018


 

>> Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018. Brought to you by Hortonworks. >> Welcome back to theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, along with my cohost, James Kobielus. We're joined by Tendu Yogurtcu, she is the CTO of Syncsort. Thanks so much for coming on theCUBE, for returning to theCUBE I should say. >> Thank you Rebecca and James. It's always a pleasure to be here. >> So you've been on theCUBE before, and the last time you were talking about Syncsort's growth. So can you give our viewers a company update? Where are you now? >> Absolutely, Syncsort has seen extraordinary growth within the last three years. We tripled our revenue, doubled our employees and expanded the product portfolio significantly. Because of this phenomenal growth that we have seen, we also embarked on a new initiative, refreshing our brand. We rebranded, and this was necessitated by the fact that we have such a broad portfolio of products, and we are actually showing our new brand here, articulating the value our products bring: optimizing existing infrastructure, assuring data security and availability, and advancing the data by integrating into next generation analytics platforms. So it's very exciting times in terms of Syncsort's growth. >> So the last time you were on the show it was pre-GDPR, but we were talking before the cameras were rolling, and you were explaining the kinds of adoption you're seeing and what, in this new era, you're seeing from customers and hearing from customers. Can you tell our viewers a little bit about it? >> When we were discussing last time, I talked about four mega trends we are seeing, and those mega trends were primarily driven by the advanced business and operation analytics: data governance, cloud, streaming, and data science, artificial intelligence. And we really made a lot of announcements and focused on the use cases around data governance, primarily helping our customers with their GDPR, the General Data Protection Regulation, initiatives and how we can create that visibility in the enterprise through the data, with security and lineage, and delivering trusted data sets. Now we are talking about cloud primarily, in the keynotes, at this event, and our focus is around cloud, primarily driven, again, by the use cases, right? How the businesses are adapting to the new era. One of the challenges that we see with our enterprise customers, over 7000 customers by the way, is the ability to future-proof their applications, because this is a very rapidly changing stack. We have seen the keynotes talking about the importance of how you connect your existing infrastructure with the future, modern, next generation platforms. How do you future-proof the platform and make it agnostic, whether it's Amazon, Microsoft or Google Cloud, whether it's on-premise in legacy platforms; the data has to be available in the next generation platforms. So the challenge we are seeing is, how do we keep the data fresh? How do we create that abstraction so that applications are future-proofed? Because organizations, even financial services customers, banking, insurance, they now have at least one cluster running in the public cloud. And there are private implementations; hybrid becomes the new standard.
So our focus and most recent announcements have been around really helping our customers with real-time, resilient change data capture, keeping the data fresh, feeding into the downstream applications with the streaming and messaging frameworks, for example Kafka, Amazon Kinesis, as well as keeping the persistent stores, the data lake on-premise and in the cloud, fresh. >> That puts you into great alignment with your partner Hortonworks, so, Tendu, I wonder, we are here at DataWorks, it's Hortonworks' show, if you can break out for our viewers what is the nature, the levels of your relationship, your partnership with Hortonworks, and how the Syncsort portfolio plays with HDP 3.0, with Hortonworks DataFlow and the DataPlane services, at a high level. >> Absolutely, so we have been a longtime partner with Hortonworks, and a couple of years back we strengthened our partnership. Hortonworks is reselling Syncsort, and we actually have a prescriptive solution for ETL onboarding in Hadoop jointly. And it's very complementary; our strategy is very complementary, because what Hortonworks is trying to achieve is creating that abstraction and future-proofing and interaction consistency, as was referred to this morning, across the platform, whether it's on-premise or in the cloud or across multiple clouds. We are providing the data application layer consistency and future-proofing on top of the platform, leveraging the tools in the platform for orchestration, integrating with HDP, certifying with Ranger on HDP, all of the tools, DataFlow, and Atlas of course for lineage. >> The theme of this conference is ideas, insights and innovation, and as a partner of Hortonworks, can you describe what it means for you to be at this conference? What kinds of community, deepening existing relationships, forming new ones? Can you talk about what happens here? >> This is one of the major events around data, and it's DataWorks as opposed to being more specific to Hadoop itself, right? Because the stack is evolving and data challenges are evolving. For us, it means really the interactions with the customers, the organizations and the partners here, because the dynamics of the use cases are also evolving. For example, Data Lake implementations started in the U.S., and we started seeing more European organizations moving to streaming, data streaming applications, faster than in the U.S. >> Why is that? >> Yeah. >> Why are Europeans moving faster to streaming than we are in North America? >> I think a couple of different things might play a part. The open source is really enabling organizations to move fast. When the Data Lake initiatives started, we saw a little bit of a slow start in Europe, but more experimentation with the open source stack, and with that, more transformative use cases started evolving. Like, how do I manage the interactions of users with their remote controls as they are watching live TV; those types of transformative use cases became important. And as we move to the transformative use cases, streaming is also very critical, because lots of data is available, and being able to keep the cloud data stores as well as the on-premise data stores and downstream applications fresh becomes important. We in fact announced in early June that Syncsort is now a part of the Microsoft One Commercial Partner Program. With that, our Integrate solutions for data integration and data quality are Azure Gold certified and Azure ready.
We are in a co-sell agreement and we are jointly helping a lot of customers move data and workloads to Azure and keep those data stores close to the platforms, in sync. >> Right. >> So lots of exciting things; I mean, there's a lot happening with the application space. There's also lots still happening connected to the governance cases that we have seen. Feeding security and IT operations data into, again, modern-day, next generation analytics platforms is key, whether it's Splunk, whether it's Elastic, as part of the Hadoop stack. So we are still focused on governance as part of these multi-cloud and on-premise-to-cloud implementations as well. We in fact launched our Ironstream for IBM i product to help customers, not just making this data available from mainframes, but also from IBM i into Splunk, Elastic and other security information and event management platforms. And today we announced workflow optimization across on-premise, multi-cloud and cloud platforms. So lots of focus across the Optimize, Assure and Integrate portfolios of products, helping customers with the business use cases. That's really our focus as we innovate organically and also acquire technologies and solutions: what are the problems we are solving, and how can we help our customers with the business and operation analytics, targeting those mega trends around data governance, cloud, streaming and also data science. >> What is the biggest trend, do you think, that is sort of driving all of these changes? As you said, the data is evolving. The use cases are evolving. What is it that is keeping your customers up at night? >> Right now it's still governance keeping them up at night, because this evolving architecture is also making governance more complex, right? If we are looking at financial services, banking, insurance, healthcare, there are lots of existing infrastructures, mission-critical data stores on mainframe and IBM i, in addition to this gravity of data changing and lots of data from the online businesses generated in the cloud. So how to govern that, while also optimizing and making those data stores available for next generation analytics, makes the governance quite complex. So that really keeps, and creates, a lot of opportunity for the community, right? All of us here, to address those challenges. >> Because it sounds to me, I'm hearing Splunk, advanced machine data, I think of the internet of things and sensor grids. I'm hearing IBM mainframes; that's transactional data, that's your customer data and so forth. It seems like much of this data that you're describing, that customers are trying to cleanse and consolidate and provide strict governance on, is absolutely essential for them to drive more artificial intelligence into end applications and mobile devices that are being used to drive the customer experience. Do you see more of your customers using your tools to massage the data sets, as it were, that data scientists then use to build and train their models for deployment into edge applications? Is that an emerging area where your customers are deploying Syncsort? >> Thank you for asking that question. >> It's a complex question. (laughing) But thanks for unpacking it... >> It is a complex question, but it's a very important question. Yes, and in the previous discussions we have seen, and this morning also Rob Thomas from IBM mentioned it as well, that machine learning and artificial intelligence, data science, really rely on high-quality data. As an anonymous computer scientist put it back in the 1950s: garbage in, garbage out.
>> Yeah. >> When we are using artificial intelligence and machine learning, the implications, the impact of bad data multiplies. It multiplies with the training on historical data, multiplies with the insights that we are getting out of that. So data scientists today are still spending significant time on preparing the data for the AI pipeline, the data science pipeline, and that's where we shine. Because our Integrate portfolio accesses the data from all enterprise data stores and cleanses and matches and prepares it in a trusted manner for use in advanced analytics with machine learning, artificial intelligence. >> Yeah, 'cause the magic of machine learning for predictive analytics is that you build a statistical model based on the most valid data set for the domain of interest. If the data is junk, then you're going to be building a junk model that will not be able to do its job. So, for want of a nail, the kingdom was lost. For want of a Syncsort, (laughing) a data cleansing and, you know, governance tool, the whole AI superstructure will fall down. >> Yes, yes, absolutely. >> Yeah, good. >> Well thank you so much, Tendu, for coming on theCUBE and for giving us a lot of background and information. >> Thank you for having me, thank you. >> Good to have you. >> Always a pleasure. >> I'm Rebecca Knight for James Kobielus. We will have more from theCUBE's live coverage of DataWorks 2018 just after this. (upbeat music)
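Tendu's point about keeping downstream stores fresh can be made concrete with a short sketch: publishing change data capture events onto a Kafka topic for downstream consumers to apply. The broker address, topic name, and event shape are invented for the example.

```python
# Minimal sketch: emit CDC events to Kafka so downstream stores stay fresh.
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="kafka.example.com:9092",  # placeholder broker
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def publish_change(table, op, row):
    """Emit one change event; op is e.g. 'insert', 'update', or 'delete'."""
    producer.send("cdc-events", {"table": table, "op": op, "row": row})

publish_change("orders", "update", {"order_id": 7, "status": "shipped"})
producer.flush()  # block until the event is actually delivered
```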
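Her garbage-in, garbage-out point can be illustrated the same way: a small, hypothetical cleansing step that de-duplicates, drops incomplete records, and standardizes fields before the data feeds model training. The column names and rules are invented.

```python
# Toy data-preparation step of the cleanse/match/standardize variety.
import pandas as pd

def prepare(df: pd.DataFrame) -> pd.DataFrame:
    df = df.drop_duplicates(subset="customer_id")    # matching / de-dup
    df = df.dropna(subset=["email", "signup_date"])  # required fields only
    df["email"] = df["email"].str.strip().str.lower()  # standardize
    df["signup_date"] = pd.to_datetime(df["signup_date"])
    return df

raw = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "email": [" A@X.COM ", " A@X.COM ", None],
    "signup_date": ["2018-01-05", "2018-01-05", "2018-02-01"],
})
print(prepare(raw))  # one clean, trusted row survives
```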

Published Date : Jun 19 2018


Arun Murthy, Hortonworks | DataWorks Summit 2018


 

>> Live from San Jose in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018, brought to you by Hortonworks. >> Welcome back to theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, Rebecca Knight, along with my cohost, Jim Kobielus. We're joined by Arun Murthy. He is the co-founder and chief product officer of Hortonworks. Thank you so much for returning to theCUBE. It's great to have you on. >> Yeah, likewise. It's been a fun time getting back, yeah. >> So you were on the main stage this morning in the keynote, and you were describing the journey, the data journey, that so many customers are on right now, and you were talking about the cloud, saying that the cloud is part of the strategy but it really needs to fit into the overall business strategy. Can you describe a little bit your approach to that? >> Absolutely, and the way we look at this is we help customers leverage data to actually deliver better capabilities, better services, better experiences to their customers, and that's the business we are in. Now, with that, obviously we look at cloud as a really key part of the overall strategy in terms of how you want to manage data on-prem and on the cloud. We kind of joke that we ourselves live in a world of real-time data. We just live in it and data is everywhere. You might have trucks on the road, you might have drones, you might have sensors, and you have it all over the world. At that point, enterprises understand that they could manage all the infrastructure themselves, but in a lot of cases it will make a lot more sense to actually lease some of it, and that's the cloud. It's the same way, if you're delivering packages, you don't go buy planes and lay out roads; you go to FedEx and actually let them handle that for you. That's kind of what the cloud is. So that is why we really fundamentally believe that we have to help customers leverage infrastructure wherever it makes sense pragmatically, both from an architectural standpoint and from a financial standpoint, and that's kind of why we talk about how your cloud strategy is part of your data strategy, which is actually fundamentally part of your business strategy. >> So how are you helping customers to leverage this? What is on their minds and what's your response? >> Yeah, it's really interesting. Like I said, cloud is cloud, and infrastructure management is certainly something that's at the forefront, at the top of mind, for every CIO today. And what we've consistently heard is they need a way to manage all this data and all this infrastructure in a hybrid, multi-tenant, multi-cloud fashion. Because in some geos you might not have your favorite cloud vendor. You know, parts of Asia are a great example. You might have to use one of the Chinese clouds. You go to parts of Europe, especially with things like the GDPR, the data residency laws and so on, you have to be very, very cognizant of where your data gets stored and where your infrastructure is present. And that is why we fundamentally believe it's really important to give enterprises a fabric with which they can manage all of this, and hide the details of all of the underlying infrastructure from them as much as possible. >> And that's DataPlane Services. >> And that's DataPlane Services, exactly.
We see a lot of interest, a lot of excitement around it, because now they understand that, again, this doesn't mean that we drive it down to the least common denominator. It is about helping enterprises leverage the key differentiators of each of the cloud vendors' products. For example, Google, with whom we announced a partnership, they are really strong on AI and ML. So if you are running TensorFlow and you want to deal with things like Kubernetes, GKE is a great place to do it. And, for example, you can now go to Google Cloud and get TPUs, which work great for TensorFlow. Similarly, a lot of customers run on Amazon for a bunch of the operational stuff, Redshift as an example. So in the world we live in, we want to help the CIO leverage the best pieces of the cloud, but then give them a consistent way to manage and govern that data. We were joking on stage that IT has just about learned how to deal with Kerberos and Hadoop, and now we're telling them, "Oh, go figure out IAM on Google," which is also IAM on Amazon, but they are completely different. The only thing that's consistent is the name. So I think we have a unique opportunity, especially with the open source technologies like Atlas, Ranger, Knox and so on, to be able to draw a consistent fabric over this for security and governance, and help the enterprise leverage the best parts of the cloud to put a best-fit architecture together, which also happens to be a best-of-breed architecture. >> So the fabric is everything you're describing, all the Apache open source projects in which Hortonworks is a primary committer and contributor, are able to apply schemas and policies and metadata and so forth across this distributed, heterogeneous fabric of public and private cloud segments within a distributed environment. >> Exactly. >> That's increasingly being containerized in terms of the applications for deployment to edge nodes. Containerization is a big theme in HDP 3.0, which you announced at this show. >> Yeah. >> So, if you could give us a quick sense for how that containerization capability plays into more of an edge focus for what your customers are doing. >> Exactly, great point. And again, the core parts of the fabric are obviously the open source projects, but we've also done a lot of net-new innovation with DataPlane which, by the way, is also open source. It's a new product and a new platform that you can actually leverage, to lay over the open source ones you're familiar with. And again, like you said, containerization is what is actually driving the fundamentals of this. The details matter: at the scale at which we operate, we're talking about thousands of nodes, terabytes of data, and the details really matter because a 5% improvement at that scale leads to millions of dollars in optimization for capex and opex. So all of that, the details, are being fueled and driven by the community, which is kind of what we did with HDP 3. And the key ones, like you said, are containerization, because now we can actually get complete agility in terms of how you deploy the applications. You get isolation not only at the resource-management level with containers, but you also get it at the software level, which means, if two data scientists want to use a different version of Python or Scala or Spark or whatever it is, they get that consistently and holistically. So now they can actually go from the test/dev cycle into production in a completely consistent manner.
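That isolation point is easy to make concrete outside of Hadoop entirely. The sketch below, in Python, runs the same one-line "job" under two different container images carrying two different interpreter versions; a local Docker install and the image tags are assumptions for illustration, and HDP 3.0 itself wires containers in through YARN rather than through raw docker calls like these.

import subprocess

# the "job": report which interpreter version it is running under
JOB = "import sys; print(sys.version.split()[0])"

# two teams pin two different images; the installations never conflict
# because each run is isolated inside its own container
for image in ("python:3.6-slim", "python:3.10-slim"):
    result = subprocess.run(
        ["docker", "run", "--rm", image, "python", "-c", JOB],
        capture_output=True, text=True, check=True,
    )
    print(f"{image} -> {result.stdout.strip()}")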
So that's why containers are so big, because now we can actually leverage them across the stack, and there are things like MiNiFi showing up. We can actually-- >> Define MiNiFi before you go further. What is MiNiFi for our listeners? >> Great question. Yeah, so we've always had NiFi-- >> Real-time. >> Real-time data flow management, and NiFi was still sort of within the data center. What MiNiFi does is it's now a really, really small layer, a small thin library if you will, that you can throw on a phone, a doorbell, a sensor, and that gives you all the capabilities of NiFi but at the edge. >> Mmm. >> Right? And it's actually not just data flow; what is really cool about NiFi is it's actually command and control. So you can actually do bidirectional command and control, so you can actually change, in real time, the flows you want, the processing you do, and so on. So what we're trying to do with MiNiFi is not just collect data from the edge but also push the processing as much as possible to the edge, because we really do believe a lot more processing is going to happen at the edge, especially with the ASICs and so on coming out. There will be custom hardware that you can throw in and essentially leverage that hardware at the edge to actually do this processing. And we believe, you know, we want to do that even at the cost of the data not actually landing at rest, because at the end of the day we're in the insights business, not in the data storage business. >> Well I want to get back to that. You were talking about innovation and how so much of it is driven by the open source community, and you're a veteran of the big data open source community. How do we maintain that? How does that continue to be the fuel? >> Yeah, and a lot of it starts with just being consistent. From day one, Jim was around back then, in 2011 when we started, we've always said, "We're going to be open source," because we fundamentally believed that the community is going to out-innovate any one vendor regardless of how much money they have in the bank. So we really do believe that's the best way to innovate, mostly because there is a sense of shared ownership of that product. It's not just one vendor throwing some code out there trying to shove it down the customers' throats. And we've seen this over and over again, right. Three years ago, a lot of the DataPlane stuff we talk about, Atlas and Ranger and so on, none of these existed. These actually came from the fruits of the collaboration with the community, with actually some very large enterprises being a part of it. So it's a great example of how we continue to drive it, because we fundamentally believe that that's the best way to innovate, and we continue to believe so. >> Right. And the community, the Apache community as a whole, has so many different projects. For example, in streaming, there is Kafka, >> Okay. >> and there are others that address a core set of common requirements but in different ways, >> Exactly. >> supporting different approaches, for example, doing streaming with stateful transactions and so forth, or stateless semantics and so forth. Seems to me that Hortonworks is shifting towards being more of a streaming-oriented vendor, away from data at rest. Though, I should say HDP 3.0 has got great scalability and storage efficiency capabilities baked in.
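As a rough illustration of the MiNiFi pattern Arun describes, the toy agent below samples a simulated sensor and ships each reading to a central NiFi instance. The endpoint assumes a NiFi ListenHTTP processor at a hypothetical host and port, and a real MiNiFi agent is configured declaratively (including the bidirectional command-and-control piece) rather than hand-coded like this.

import json
import random
import time
import urllib.request

# hypothetical central NiFi endpoint (e.g. a ListenHTTP processor)
NIFI_ENDPOINT = "http://nifi.example.com:8081/contentListener"

def read_sensor():
    # simulated edge-device reading (doorbell, thermostat, truck, ...)
    return {"device": "sensor-42", "ts": time.time(),
            "temp_c": 20 + random.random() * 5}

while True:
    payload = json.dumps(read_sensor()).encode("utf-8")
    request = urllib.request.Request(
        NIFI_ENDPOINT, data=payload,
        headers={"Content-Type": "application/json"})
    urllib.request.urlopen(request)  # forward the reading to the data center
    time.sleep(10)  # sample every ten seconds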
I wonder if you could just break it down a little bit, what the innovations or enhancements are in HDP 3.0 for those of your core customers, which is most of them, who are managing massive multi-terabyte, multi-petabyte distributed, federated big data lakes. What's in HDP 3.0 for them? >> Oh, lots. Again, like I said, we obviously spend a lot of time on the streaming side because that's where we see things going. We live in a real-time world. But again, we don't do it at the cost of our core business, which continues to be HDP. And as you can see, the community continues to drive this. We talked about containerization, a massive step up for the Hadoop community. We've also added support for GPUs. Again, if you think about at-scale machine learning. >> Graphics processing units. >> Graphical-- >> AI, deep learning. >> Yeah, it's huge. Deep learning, TensorFlow and so on really, really need custom hardware, sort of GPUs, if you will. So that's coming. That's in HDP 3. We've added a whole bunch of scalability improvements with HDFS. We've added federation, because now you can go over a billion files, a billion objects, in HDFS. We also added capabilities for-- >> But you indicated yesterday when we were talking that very few of your customers need that capacity yet, but you think they will, so-- >> Oh, for sure. Again, part of this is, as we enable more sources of data in real time, that's the fuel which drives the scale, and that was always the strategy behind the HDF product. It was about, can we leverage the synergies between the real-time world, feed that into what you do today in your classic enterprise with data at rest, and that is what is driving the necessity for scale. >> Yes. >> Right. We've done that. We spent a lot of work, again, lowering the total cost of ownership, the TCO, so we added erasure coding. >> What is that exactly? >> Yeah, so erasure coding is a classic sort of storage concept. You know, HDFS has always had three replicas, for redundancy, fault tolerance, and recovery. Now, it sounds okay having three replicas because it's cheap disk, right. But when you start to think about our customers running 70, 80 thousand terabytes of data, those three replicas add up, because you've now gone from 80 petabytes of effective data to actually a quarter of an exabyte in terms of raw storage. So now, what we can do with erasure coding is, instead of storing the three blocks, we actually store parity. We store the encoding of it, which means we can actually go down from three to like two, one and a half, whatever we want to do. So, if we can get from three blocks to one and a half, especially for your cold data, >> Yeah >> the data you're not accessing every day, it results in a massive savings in terms of your infrastructure costs. And that's kind of the business we're in: helping customers do better with the data they have, whether it's on-prem or in the cloud. We want to help customers be comfortable getting more data under management, along with security and a lower TCO. The other sort of big piece I'm really excited about in HDP 3 is all the work that's happened in the Hive community for what we call the real-time database. >> Yes. >> As you guys know, you've followed the whole SQL wars in the Hadoop space. >> And Hive has changed a lot in the last several years; this is very different from what it was five years ago.
>> The only thing that's the same from five years ago is the name. (laughing) >> So again, the community has done a phenomenal job of really taking what we used to call a SQL engine on HDFS and driving it forward. With Hive 3, which is part of HDP 3, it's now a full-fledged database. It's got full ACID support. In fact, the ACID support is so good that writing ACID tables is at least as fast as writing non-ACID tables now. And you can do that not only on-- >> Transactional database. >> Exactly. Now not only can you do it on-prem, you can do it on S3. So you can actually drive the transactions through Hive on S3. We've done a lot of work, you were there yesterday when we were talking about some of the performance work we've done with LLAP and so on, to actually give consistent performance both on-prem and in the cloud, and this is a lot of effort simply because the performance characteristics you get from the storage layer with HDFS versus S3 are significantly different. So now we have been able to bridge those with things like LLAP. We've done a lot of work to enhance the security model around it, governance and security. So now you get things like column-level masking, row-level filtering, all the standard stuff that you would expect and more from an enterprise warehouse. We talked to a lot of our customers; they're doing literally tens of thousands of views because they don't have the capabilities that exist in Hive now. >> Mmm-hmm. >> And I'm sitting here kind of being amazed that for an open source set of tools to have the best security and governance at this point is pretty amazing, coming from where we started off. >> And it's absolutely essential for GDPR compliance, and HIPAA compliance, and every other mandate and sensitivity that requires you to protect personally identifiable information, so very important. So in many ways Hortonworks has one of the premier big data catalogs for all manner of compliance requirements that your customers are chasing. >> Yeah, and Jim, you wrote about it in the context of Data Steward Studio, which we introduced. >> Yes. >> You know, things like consent management, having-- >> A consent portal. >> A consent portal. >> In which the customer can indicate the degree to which >> Exactly. >> they require controls over their management of their PII, possibly to be forgotten, and so forth. >> Yeah, the right to be forgotten, and it's consent even for analytics. Within the context of GDPR, you have to allow the customer to opt out of analytics, of them being part of an analytic itself, right. >> Yeah. >> So things like those are now something we enable through the enhanced security models that are in Ranger. So the really cool part of what we've done now with GDPR is that we can get all these capabilities on existing data and existing applications by just adding a security policy, not rewriting the application. It's a massive, massive, massive deal, which I cannot tell you how much customers are excited about, because they now understand. They were sort of freaking out that, I have to go to 30, 40, 50 thousand enterprise apps and change them to actually provide consent and the right to be forgotten. The fact that you can do that now by changing a security policy with Ranger is huge for them. >> Arun, thank you so much for coming on theCUBE. It's always so much fun talking to you. >> Likewise. Thank you so much. >> I learn something every time I listen to you. >> Indeed, indeed.
I'm Rebecca Knight, for Jim Kobielus; we will have more from theCUBE's live coverage of DataWorks just after this. (Techno music)
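Arun's erasure-coding arithmetic is easy to check. The sketch below compares raw storage under 3x replication against a Reed-Solomon 6+3 layout (the default erasure-coding policy in Hadoop 3, where six data blocks carry three parity blocks, a 1.5x overhead), using the 80 PB figure from the conversation.

# raw-storage arithmetic: 3x replication versus erasure coding
effective_pb = 80.0  # effective (logical) data, in petabytes

raw_replicated = effective_pb * 3.0          # three full copies
raw_erasure = effective_pb * (6 + 3) / 6.0   # RS(6,3): data + parity

print(f"3x replication: {raw_replicated:.0f} PB raw")  # 240 PB, ~a quarter exabyte
print(f"RS(6,3) coding: {raw_erasure:.0f} PB raw")     # 120 PB
print(f"saved:          {raw_replicated - raw_erasure:.0f} PB")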
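And the "change a security policy, not the application" point can be sketched the same way. Apache Ranger manages policies through a public REST API, so masking a PII column is a small JSON payload rather than an application rewrite. The host, service name, database, and table below are hypothetical placeholders; the field names follow Ranger's policy model, and credentials are omitted.

import json
import urllib.request

# mask the customers.email column for ordinary users -- existing queries
# keep running, they just see masked values instead of raw PII
policy = {
    "service": "hadoop_hive",          # assumed Ranger service name
    "name": "mask-customer-email",
    "resources": {
        "database": {"values": ["sales"]},
        "table": {"values": ["customers"]},
        "column": {"values": ["email"]},
    },
    "dataMaskPolicyItems": [{
        "dataMaskInfo": {"dataMaskType": "MASK"},
        "groups": ["public"],
        "accesses": [{"type": "select", "isAllowed": True}],
    }],
}

request = urllib.request.Request(
    "http://ranger.example.com:6080/service/public/v2/api/policy",
    data=json.dumps(policy).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(request)  # admin auth omitted for brevity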

Published Date: Jun 19, 2018


Nutanix .Next | NOLA | Day 1 | AM Keynote


 

>> PA Announcer: Off the plastic tab, and we'll turn on the colors. Welcome to New Orleans. ♪ This is it ♪ ♪ The part when I say I don't want ya ♪ ♪ I'm stronger than I've been before ♪ ♪ This is the part when I set you free ♪ (New Orleans jazz music) ("When the Saints Go Marching In") (rock music) >> PA Announcer: Ladies and gentlemen, would you please welcome state of Louisiana chief design officer Matthew Vince and Choice Hotels director of infrastructure services Stacy Nigh. (rock music) >> Well, good morning New Orleans, and welcome to my home state. My name is Matt Vince. I'm the chief design officer for the state of Louisiana. And it's my pleasure to welcome you all to .Next 2018. The state of Louisiana is currently re-architecting our cloud infrastructure, and Nutanix is the first domino to fall in our strategy to deliver better services to our citizens. >> And I'd like to second that warm welcome. I'm Stacy Nigh, director of infrastructure services for Choice Hotels International. Now you may think you know Choice, but we don't own hotels. We're a technology company. And Nutanix is helping us innovate the way we operate to support our franchisees. This is my first visit to New Orleans and my first .Next. >> Well Stacy, you're in for a treat. New Orleans is known for its fabulous food and its marvelous music, but most importantly the free spirit. >> Well I can't wait, and speaking of free, it's my pleasure to introduce the Nutanix Freedom video. Enjoy. ♪ I lose everything, so I can sing ♪ ♪ Hallelujah I'm free ♪ ♪ Ah, ah, ♪ ♪ Ah, ah, ♪ ♪ I lose everything, so I can sing ♪ ♪ Hallelujah I'm free ♪ ♪ I lose everything, so I can sing ♪ ♪ Hallelujah I'm free ♪ ♪ I'm free, I'm free, I'm free, I'm free ♪ ♪ Gritting your teeth, you hold onto me ♪ ♪ It's never enough, I'm never complete ♪ ♪ Tell me to prove, expect me to lose ♪ ♪ I push it away, I'm trying to move ♪ ♪ I'm desperate to run, I'm desperate to leave ♪ ♪ If I lose it all, at least I'll be free ♪ ♪ Ah, ah ♪ ♪ Ah, ah ♪ ♪ Hallelujah, I'm free ♪ >> PA Announcer: Ladies and gentlemen, please welcome chief marketing officer Ben Gibson ♪ Ah, ah ♪ ♪ Ah, ah ♪ ♪ Hallelujah, I'm free ♪ >> Welcome, good morning. >> Audience: Good morning. >> And welcome to .Next 2018. There's no better way to open up a .Next conference than by hearing from two of our great customers. And Matthew, thank you for welcoming us to your beautiful state and city. And Stacy, this is your first .Next, and I know she's not alone because, guess what, it's my first .Next too. And I come properly attired. In the front row, you can see my Nutanix socks and, I think, my Nutanix blue suit. And I know I'm not alone. I think over 5,000 people in attendance here today are also first-timers at .Next. And if you are here for the first time, it's the morning, let's get moving. I want you to stand up, so we can officially welcome you into the fold. Everyone stand up, first time. All right, welcome. (audience clapping) So you are all joining not just a conference here. This is truly a community. This is a community of the best and brightest in our industry, I will humbly say, who are coming together to share best ideas, to learn what's happening next, and in particular it's about forwarding not only your projects and your priorities but your careers. There's so much change happening in this industry.
It's an opportunity to learn what's coming down the road and learn how you can best position yourself for this whole new world that's happening around cloud computing and modernizing data center environments. And this is not just a community, this is a movement. And it's a movement that started quite a while ago. The first .Next conference was in the quiet little town of Miami, and there were about 800 of you in attendance or so. So who in this hall here was at that first .Next conference in Miami? Let me hear from you. (audience members cheering) Yep, well to all of you grizzled veterans of the .Next experience, welcome back. You have started a movement that has grown, and this year, across many different .Next conferences all over the world, over 20,000 of your community members have come together. And we like to do it in distributed-architecture fashion, just like here at Nutanix. And so we've spread this movement all over the world with .Next conferences. And this is surging. We're also seeing, just today, the current count: 61,000 certifications and climbing. Our .Next community: close to 70,000 active members of our online community, because .Next is about this big moment, and it's about every other day and every other week of the year, how we come together and explore. And my favorite stat of all: here today in this hall, amongst the record 5,500 registrations to .Next 2018, representing 71 countries in all. So it's a global movement. Everyone, welcome. And you know, when I got in Sunday night, I was looking at the tweets, and the excitement was starting to build, and I started to see people like Adile coming from Casablanca. Adile, wherever you are, welcome buddy. That's a long trip. Thank you so much for coming and being here with us today. I saw other folks coming from Geneva, from Denmark, from Japan, all over the world, coming together for this moment. And we are accomplishing phenomenal things together. Because of your trust in us, and because of some early risk, candidly, that we have all taken together, we've created a movement in the market around modernizing data center environments, radically simplifying how we operate and the services we deliver to our businesses every day. And this is a movement that we don't just know about; the industry is really taking notice. I love this chart. This is Gartner's inaugural hyperconverged infrastructure Magic Quadrant chart. And I think if you see where Nutanix is positioned on there, I think you can agree that's a rout, that's a home run, that's a mic drop, so to speak. What do you guys think? (audience clapping) But here's the thing. It says Nutanix up there. We can honestly say this is a win for this hall here. Because, again, without your trust in us and what we've accomplished together and your partnership with us, we're not there. But we are there, and it is thanks to everyone in this hall. Together we have created, expanded, and truly made this market. Congratulations. And you know what, I think we're just getting started. The same innovation, the same catalyst that we drove into the market to converge storage, network, and compute, the next horizon is around multi-cloud. The next horizon is around the strong move, whether by accident or on purpose, of different workloads into public cloud, some into private cloud, moving back and forth, the promise of application mobility, the right workload on the right cloud platform with the right economics. Economics is key here.
If any of you have a teenager out there, and they have a hold of your credit card, and they're doing something online or the like, you get some surprises at the end of the month. And that surprise comes in the form of spiraling public cloud costs. And this isn't to say we're not going to see a lot of workloads born and running in public cloud, but the opportunity is for us to take a path that regains control over infrastructure, regains control over workloads and where they're run. And the way I look at it, for everyone in this hall, it's a journey we're on. It starts with modernizing those data center environments, continues with embracing the full cloud stack and the compelling opportunity to deliver that consumer experience, to rapidly offer up enterprise compute services to your internal clients, lines of business, and then out into the market. It's then about how you standardize across an enterprise cloud environment: not just the infrastructure but the management, the automation, the control, and running any tier one application. I hear this every day, and I've heard this a lot already this week, about customers who are all in with this approach and running those tier one applications on Nutanix. And then it's the promise of not only hyperconverging infrastructure but hyperconverging multiple clouds. And if we do that, this journey, the way we see it, what we are doing is building your enterprise cloud. And your enterprise cloud is about the private cloud. It's about expanding and managing and taking back control of how you determine what workload to run where, and making sure there's strong governance and control. And you're radically simplifying what could be an awfully complicated scenario if you don't reclaim and put your arms around that opportunity. Now how do we do this differently than anyone else? And this is going to be a big theme that you're going to see from my good friend Sunil and his good friends on the product team. What are we doing together? We're taking all of that legacy complexity, that friction, that inability to move fast because you're chained to old legacy environments. I'm talking to folks that have applications that are 40 years old, and they are concerned about touching them because they're not sure their infrastructure can meet the demands of a new, modernized workload. We're making all that complexity invisible. And if all of that is invisible, it allows you to focus on what's next. And that indeed is the spirit of this conference. So if the what is enterprise cloud, and the how we do it differently is by making infrastructure invisible, data centers, clouds, then why are we all here today? What is the binding principle that spiritually, that emotionally, brings us all together? And we think it's a very simple, powerful word, and that word is freedom. And when we think about freedom, we think about, as we work together, the freedom to build the data center that you've always wanted to build. It's about the freedom to run the applications where you choose, based on the information and the context that wasn't available before. It's about the freedom of choice to choose the right cloud platform for the right application, and again to avoid a lot of these spiraling costs and unanticipated surprises, whether they be around security, economics, or governance, that come to the forefront. It's about the freedom to invent. It's why we got into this industry in the first place. We want to create.
We want to build things, not keep the lights on, not be chained to mundane tasks day by day. And it's about the freedom to play. And I hear this time and time again. My favorite tweet from a Nutanix customer to this day is: just updated a lot of nodes at 38,000 feet on United Wifi, on my way to spend vacation with my family. Freedom to play. This to me is emotionally what brings us all together, and what you saw with the Freedom video earlier, and what you see here, is this new story, because we want to go out and spread the word and not only talk about the enterprise cloud, not only talk about how we do it better, but talk about why it's so compelling to be a part of this hall here today. Now just one note of housekeeping for everyone out there, because I don't want anyone to take a wrong turn as they come to this beautiful convention center here today. A lot of freedom going on in this convention center. As luck may have it, there's another conference going on a little bit down that way, based on another high-growth, disruptive industry. Now, MJBizCon Next, and by coincidence it's also called Next. And I have to admire the creativity. I have to admire that we do share a, hey, high-growth business model here. And in case you're not quite sure what this conference is about, I'm the head of marketing here, I have to show the tagline of this. And I read the tagline: from license to launch and beyond, the future of the, now if I can replace that blank with our industry, I don't know, to me it sounds like a new, cool Sunil product launch. Maybe launching a new subscription service or the like. Stay tuned, you never know. I think they're going to have a good time over there. I know we're going to have a wonderful week here, both to learn as well as have a lot of fun, particularly at our customer appreciation event tonight. I want to spend a very few important moments on .Heart. .Heart is Nutanix's initiative to promote diversity in the technology arena. In particular, we have a focus on advancing the careers of women and young girls that we want to encourage to move into STEM and high tech careers. You have the opportunity to engage this week with this important initiative. Please roll the video, and let's learn more about how you can do so.
You're going to find some fun, clever Easter eggs. List all seven, tweet that out, or as many as you can, tweet that out with hashtag nextconf, C, O, N, F, and we'll have a random drawing for an all expenses paid free trip to .Next 2019. And just to make sure everyone understands Easter egg concept. There's an eighth one here that's actually someone that's quite famous in our circles. If you see on this still shot, there's someone in the back there with a red jacket on. That's not just anyone. We're targeting in here. That is our very own Julie O'Brien, our senior vice president of corporate marketing. And you're going to hear from Julie later on here at .Next. But Julie and her team are the engine and the creativity behind not only our new Freedom campaign but more importantly everything that you experience here this week. Julie and her team are amazing, and we can't wait for you to experience what they've pulled together for you. Another surprise, if you go and visit our Freedom booths and share your stories. So they're like video booths, you share your success stories, your partnerships, your journey that I talked about, you will be entered to win a beautiful Nutanix brand compliant, look at those beautiful colors, bicycle. And it's not just any bicycle. It's a beautiful bicycle made by our beautiful customer Trek. I actually have a Trek bike. I love cycling. Unfortunately, I'm not eligible, but all of you are. So please share your stories in the Freedom Nutanix's booths and put yourself in the running, or in the cycling to get this prize. One more thing I wanted to share here. Yesterday we had a great time. We had our inaugural Nutanix hackathon. This hackathon brought together folks that were in devops practices, many of you that are in this room. We sold out. We thought maybe we'd get four or five teams. We had to shutdown at 14 teams that were paired together with a Nutanix mentor, and you coded. You used our REST APIs. You built new apps that integrated in with Prism and Clam. And it was wonderful to see this. Everyone I talked to had a great time on this. We had three winners. In third place, we had team Copper or team bronze, but team Copper. Silver, Not That Special, they're very humble kind of like one of our key mission statements. And the grand prize winner was We Did It All for the Cookies. And you saw them coming in on our Mardi Gras float here. We Did It All for Cookies, they did this very creative job. They leveraged an Apple Watch. They were lighting up VMs at a moments notice utilizing a lot of their coding skills. Congratulations to all three, first, second, and third all receive $2,500. And then each of them, then were able to choose a charity to deliver another $2,500 including Ronald McDonald House for the winner, we did it all for the McDonald Land cookies, I suppose, to move forward. So look for us to do more of these kinds of events because we want to bring together infrastructure and application development, and this is a great, I think, start for us in this community to be able to do so. With that, who's ready to hear form Dheeraj? You ready to hear from Dheeraj? (audience clapping) I'm ready to hear from Dheeraj, and not just 'cause I work for him. It is my distinct pleasure to welcome on the stage our CEO, cofounder and chairman Dheeraj Pandey. ("Free" by Broods) ♪ Hallelujah, I'm free ♪ >> Thank you Ben and good morning everyone. >> Audience: Good morning. >> Thank you so much for being here. 
It's just such an elation when I'm thinking about the Mardi Gras crowd that came here, the partners, the customers, the NTCs. I mean, there are some great NTCs up there I could relate to because they're on Slack as well. How many of you are in the Nutanix internal Slack channel? Probably 5%. I would love to actually see this community grow from here, 'cause this is not the only event where we would love to meet you. We would love to actually do this in real-time, bite-size communication on our own internal Slack channel itself. Now today, we're going to talk about a lot of things, but a lot of hard things, a lot of things that take time to build and have evolved as the industry itself has evolved. And one of the hard things that I want to talk about is multi-cloud. Multi-cloud is a really hard problem 'cause it's full of paradoxes. It's really about doing things that you believe are opposites of each other. It's about frictionless, but it's also about governance. It's about being simple, and it's also about being secure at the same time. It's about delight, it's about reducing waste, it's about owning and renting, and finally it's also about core and edge. How do you really make this big at a core data center, whether it's public or private? Or how do you really shrink it down to one or two nodes at the edge, because that's where your machines are, that's where your people are? So this is a really hard problem. And as you hear from Sunil and the gang there, you'll realize how we've actually evolved our solutions to really cater to some of these. One of the approaches that we have used to really solve some of these hard problems is to have machines do more, and I said a lot of things in those four words: have machines do more. Because if you double-click on that sentence, it really means we're letting design be at the core of this. And how do you really design data centers, how do you really design products for the data center that hush all the escalations, the details, the complexities, use machine learning and AI, and, you know, figure out anomaly detection and correlations and pattern matching? There's a ton of things that you need to do to really have machines do more. But along the way, the important lesson is to make machines invisible, because when machines become invisible, it actually makes something else visible. It makes you visible. It makes governance visible. It makes applications visible, and it makes services visible. A lot of things: it makes teams visible, careers visible. So while we're really talking about invisibility of machines, we're talking about visibility of people. And that's how we really brought all of you together in this conference as well, because it makes all of us shine, including our products, and your careers, and your teams as well. And I try to define the word customer success. You know, it's one of my favorite words that I'm actually using. We've just hired a great leader in customer success recently who's really going to focus on this relatively hard problem, yet another hard problem, of customer success. We think that customer success, true customer success, is possible when we have machines tend towards invisibility. But along the way, when we do that, we make humans tend towards freedom. So that's the real connection, the yin-yang of machines and humans, that Nutanix is really all about. And that's why design is at the core of this company. And when I say design, I mean reducing friction. And it's really about reducing friction.
And everything we do, the most mundane of things, which could be about migrating applications, spinning up VMs, self-service portals, automatic upgrades, automatic scale-out, all the things we do, is about reducing friction, which really makes machines become invisible and humans gain freedom. Now one of the other convictions we have is how all of us are really tied at the hip. You know, our success is tied to your success. If we make you successful, and when I say you, I really mean Main Street. Main Street being customers, and partners, and employees. If we make all of you successful, then we automatically become successful. And very coincidentally, Main Street and Wall Street are also tied in that very same relation as well. If we do a great job on Main Street, I think the Wall Street customer, i.e. the investor, will take care of itself. You'll have, you know, taken care of their success if we took care of Main Street success itself. And that's the narrative that our CFO Duston Williams actually went and painted to our Wall Street investors two months ago at our investor day conference. We talked about a $3 billion number. We said, look, as a company, as a software company, we can go and achieve $3 billion in billings three years from now. And it was a telling moment for the company. It was really about talking about where we could be three years from now. But it was not based on a hunch. It was based on what we thought was customer success. Now realize that $3 billion in pure software: there are only 10 to 15 companies in the world that actually have that kind of software billings number. But at the core of this confidence was customer success, was the fact that we were doing a really good job of not overpromising and underdelivering but underpromising, starting with small systems, and growing the trust of the customers over time. And this is one of the statistics we actually talk about: repeat business. For the first dollar that a Global 2000 customer spends on Nutanix, we go and increase their trust 15 times by year six, and we hope to actually get 17.5 and 19 times more trust in years seven and eight. It's very similar numbers for non Global 2000 as well. Again, we go and really hustle for customer success, start small, have you not worry about paying millions of dollars upfront. You know, start with systems that pay as they grow, you pay as they grow, and that's the way we gain trust. We have the same non Global 2000 customers pay $6.50 for the first dollar they actually spent on us. And with this, I think the most telling moment was when Duston concluded. And this is key to this audience here as well: how the current cohorts, which is this audience here, and many of them who are not here, will actually carry the weight of $3 billion, more than 50% of it, if we did a great job of customer success. If we were humble and honest and we really figured out what it meant to take care of you, and if we really understood what starting small was and having to gain the trust with you over time, we think that more than 50% of those billings will actually come from this audience here, without even looking at new logos outside. So that's the thrust of customer success for us, and it takes care of pretty much every customer, not just the Main Street customer. It takes care of the Wall Street customer. It takes care of employees. It takes care of partners as well. Now before I talk about technology and products, I want to take a step back, 'cause many of you are new in this audience.
And I think that it behooves us to really talk about the history of this company. Like, we've done a lot of things that started out as science projects. In fact, I see some tweets out there and people actually laugh at Nutanix cloud. And this is where we were in 2012. So if you take a step back and think about where the company was almost seven, eight years ago, we were up against giants. There was a $30 billion industry around network-attached storage, and storage area networks, and blade servers, and hypervisors, and systems management software, and so on. So what did we start out with? A very simple premise: that we will collapse the architecture of the data center, because three-tier is wasteful and three-tier is not delightful. It was a very simple hunch. We said we'll take rack-mount servers, we'll put a layer of software on top of it, and that layer of software back then only did storage. It didn't do networks and security, and it ran on top of a well-known hypervisor from VMware. And we said there's one non-negotiable thing: the fact that the design must change. The control plane for this data center cannot be the old control plane. It has to be rethought through, and that's why Prism came about. Now we went and hustled hard to add more things to it. We said we need to make this diverse, because it can't just be for one application. We need to make it CPU-heavy, and memory-heavy, and storage-heavy, and flash-heavy, and so on. And we built a highly configurable HCI. Now all of them are actually configurable, as you know, today. And this was not just innovation in technologies; it was innovation in business: sizing, capacity planning, quote-to-cash business processes. A lot of stuff that we had to do to make this highly configurable, so you can really scale capacity and performance independent of each other. Then in 2014, we did something that was very counterintuitive, but we've done this on, and on, and on again. People said, why are you disrupting yourself? You know, you've been doing a good job of shipping appliances, but we also had the conviction that HCI was not about hardware. It was about a form factor, but it was really about an operating system. And we started to compete with ourselves when we said, you know what, we'll do arm's-length distribution, we'll do arm's-length delivery of products, when we gave our software to Dell as a partner, a loyal partner. But at the same time, it was actually seen with a lot of skepticism. You know, these guys are wondering how to really make themselves vanish because they're competing with themselves. But we also knew that if we didn't compete with ourselves, someone else would. Now one of the most controversial decisions was really going and doing yet another hypervisor. In the year 2015, it was really preposterous to build yet another hypervisor. It was a very mature market. This was coming probably 15 years too late to the market, or at least 10 years too late to the market. And most people said it shouldn't be done, because hypervisors are a commodity. And that's the word we latched on to: that this commodity should not have to be paid for. It shouldn't have a team of people managing it. It should actually be part of your overall stack, but it should be invisible. Just like storage needs to be invisible, virtualization needs to be invisible. But it was a bold step, and I think, you know, at least when we look at our current numbers, a third of our customers are actually using AHV.
At least every quarter that we look at it, of our new deployments, at least 35% are actually on AHV itself. And again, a very preposterous thing to have said five years ago, four years ago, to where we've actually come. Thank you so much, all of you who've believed in the fact that virtualization software must be invisible, and therefore we should actually try out something that is called AHV today. Now we went and added Lenovo to our OEM mix, and started to become even more of a software company in the year 2016. Went and added HP and Cisco in some very large deals that we talk about in earnings calls, our HP deals and Cisco deals. And some very large customers have procured ELAs from us, enterprise license agreements from us, where they want to mix and match hardware. They want to mix Dell hardware with HP hardware but have common, standard Nutanix entitlements. And finally, I think this was another one of those moments where we said, why should HCI be limited to x86? You know, this operating system deserves to run on a non-x86 architecture as well. And that gave birth to this idea of HCI on Power Systems from IBM. And we've done a great job of really innovating with them in the last three, four quarters. Some amazing innovation has come out where you can now run AIX 7.x on Nutanix. And for the first time in the history of the data center, you can actually have a single software, not just a data plane but a control plane, where you can manage an IBM Power farm, an OpenPOWER farm, and an x86 farm from the same control plane, and have, you know, the IBM farm feed storage to an Intel compute farm and vice versa. So really good things that we've actually done. Now along the way, something else was going on. While we were really busy building the private cloud, we knew there was a new consumption model in computing itself. People were renting computing using credit cards. This is the era of the millennials. They really want to bypass people, because at the end of the day, you know, why can't computing be consumed the way eCommerce is? And that devops movement made us realize that we need to add to our stack. That stack will now have other computing clouds, that is AWS and Azure and GCP now. So similar to the way we did Prism, you know, Prism was really about going and making hypervisors invisible, we went ahead and said we'll add Calm to our portfolio, because Calm is now going to be what Prism was to us back when we were really dealing with the multi-hypervisor world. Now it's going to be the multi-cloud world. You know, it's one of those things we had a gut feel around, and we've really come to see a lot of feedback and real innovation. I mean, yesterday when we had the hackathon, the center, the epicenter of the discussion was Calm, was how do you automate on multiple clouds without having to write a single line of code? So we've come a long way since the acquisition of Calm two years ago. I think it's going to be a strong pillar in our overall product portfolio itself. Now the word multi-cloud is going to be used and overused. In fact, it's going to be blurring its lines with the idea of hyperconvergence of clouds, you know, what does it mean. We just hope that hyperconvergence, the way it's called today, will morph to become hyperconverged clouds, not just hyperconverged boxes, which is a software-defined infrastructure definition itself. But let's focus on the why of multi-cloud. Why do we think it can't all go into a public cloud itself?
The one big reason is just laws of the land. There's data sovereignty and computing sovereignty, regulations and compliance, because of which you need to be where the government, the regulations, and the compliance rules want you to be. And by the way, that's just one reason why the cloud will have to disperse itself. It can't just be 10, 20 large data centers around the world, because you have 200-plus countries, and half of computing actually gets done outside the US itself. So it's a really important, very relevant point about the why of multi-cloud. The second one is just simple laws of physics. You know, if there are machines at the edge, and they're producing so much data, you can't bring all the data to the compute. You have to take the compute, which is stateless, it's an app, you take the app to where the data is, because the network is the enemy. The network has always been the enemy. And whenever we thought we'd made fatter networks, we just produced more data as well. So this just goes without saying: you take something that's stateless, that's without gravity, that's lightweight, which is compute and the application, and push it close to where the data itself is. And the third one, which is related, is just latency reasons, you know? And it's not just about machine latency and electrons traveling at the speed of light, and you can't defy the speed of light. It's also about human latency. It's also about multiple teams saying we need to federate and delegate, and we need to push things down to where the teams are, as opposed to having to expect everybody to come to a very large computing power itself. So the way we see it, there will be at least three different ways of looking at multi-cloud itself. There's a centralized core cloud. We all go and relate to this because we've seen large data centers and so on. And that's the back-office workhorse. It will crunch numbers. It will do processing. It will do a ton of things that will go and produce results for, you know, how we run our businesses. But there's also the dispersal of the cloud, the ROBO cloud. And this is the front-office server that's really serving. It's a cloud that's going to serve people. It's going to be closer to people, and that's what a ROBO cloud is. We have a ton of customers out here who actually use Nutanix in ROBO environments themselves, as one-node, two-node, three-node, five-node servers, and it just collapses the entire server closet room in these ROBOs into something really, really small and minuscule. And finally, there's going to be another dispersed edge cloud, because that's where the machines are, that's where the data is. And there's going to be an IoT machine fog, because we need to miniaturize computing to something even smaller, maybe something that can really land in the palm, in a mini server which is a PC-like server, but you need to run everything that's enterprise-grade. You should be able to go and upgrade them and monitor them and analyze them. You know, do enough computing up there, maybe event-based processing that can actually happen. In fact, there's some great innovation that we've done at the edge with IoT that I'd love for all of you to actually attend some sessions around as well. So with that being said, we have a hole in the stack. And that hole is probably one of the hardest problems that we've been trying to solve for the last two years. And Sunil will talk a lot about that. This idea of hybrid. The hybrid of multi-cloud is one of the hardest problems.
Why? Because we're talking about really blurring the lines between owning and renting, where you have a single-tenant environment, which is your data center, and a multi-tenant environment, which is the service provider's data center, and the two must look the same. And making the two look the same is that hard a problem, not just for burst-out capacity, not just for security, not just for identity, but also for networks. Like, how do you blur the lines between networks? How do you blur the lines for storage? How do you really blur the lines for a single pane of glass where you can think of availability zones that look highly symmetric, even though they're not, because one of them is owned by you, and it's single-tenant, and the other one is not owned by you, it's multi-tenant itself? So there are some really hard problems in hybrid that you'll hear Sunil and the team talk about, and some great strides that we've actually made in the last 12 months of really working on Xi itself. And that completes the picture now in terms of how we believe the state of computing will be going forward. So what are the must-haves of a multi-cloud operating system? We talked about marketplace, which is catalogs and automation. There's a ton of orchestration that needs to be done for multi-cloud to come together, because now you have a self-service portal which is providing an eCommerce view. It's really about, you know, getting to do a lot of requests and workflows without having people come in the way, without even having tickets. There's no need for tickets if you can really start to think like a self-service portal, as if you're just transacting eCommerce with machines and portals themselves. Obviously the next one is networking and security. You need to blur the lines between on-prem and off-prem itself. These two play a huge role. And there's going to be a ton of details that you'll see Sunil talk about. But finally, what I want to focus on for the rest of the talk itself here is governance and compliance. This is a hard problem, and it's a hard problem because things have evolved. So I'm going to take a step back. The last 30 years of computing: how have consumption models changed? So think about it. 30 years ago, we were making decisions for 10-plus years, you know? Mainframe: at least 10 years, probably 20-plus years' worth of decisions. These were decisions that were extremely waterfall-ish. Make tens of millions of dollars' worth of investment for a device that we'd buy for at least 10 to 20 years. Now as we moved to client-server, that thing actually shrunk. Now you're talking about five years' worth of decisions, and these things were smaller. So there's a little bit more velocity in our decisions. We were not making as waterfall-ish decisions as we used to with mainframes. But still five years. Talk about virtualized three-tier: maybe three- to five-year decisions. You know, they're still relatively big decisions that we were making, with compute and storage and SAN fabrics and virtualization software and systems management software and so on. And here comes Nutanix, and we said no, no. We need to make it smaller. It has to become smaller, because, you know, we need to make more agile decisions. We need to add machines every week, every month, as opposed to adding, you know, machines every three to five years. And we need to be able to upgrade them at any point in time. You can do the upgrades every month if you had to, every week if you had to, and so on. So it's really about more agility.
And yet, we were not complete, because there's another evolution going on off-prem, in the public cloud, where people are going and doing reserved instances. But more than that, they were doing on-demand stuff, where now the decision was days to weeks. Some of this unit of compute was being rented for days to weeks, not years. And if you needed something more, you'd shift a little to the left and use reserved instances. And then spot pricing: you could do spot pricing for hours. And finally lambda functions: now you could do function-as-a-service, where things could actually be running only for minutes, not even hours. So as you can see, there's a wide spectrum where, when you move to the right, you get more elasticity, and when you move to the left, you're talking about predictable decision making. And in fact, it goes from minutes on one side to tens of years on the other itself. And we hope to actually go and blur the lines between where Nutanix is today, where you see Nutanix right now, and where we really want to be, with reserved instances and on-demand. And that's the real ask of Nutanix: how do you take care of this discontinuity? Because when you're owning things, you actually end up here, and when you're renting things, you end up here. What does it mean to really blur the lines between these two? Because people do want to make decisions that are better than reserved instances in the public cloud. We'll talk about why reserved instances, which look like a proxy for Nutanix, are still very, very wasteful, even though you might think they're delightful. So what does it mean for on-prem and off-prem? You know, you talk about cost governance, there's security compliance. These high-velocity decisions we're actually making, you know, where sometimes you could be right on cost but wrong on security, and sometimes you could be right on security but wrong on cost. We need to really figure out how machines make some of these decisions for us, how software helps us decide: do we have the right balance between cost, governance, and security compliance itself? And to get it right, we have introduced our first SaaS service, called Beam. And to talk more about Beam, I want to introduce Vijay Rayapati, who's the general manager of Beam engineering, to come up on stage and talk about Beam itself. Thank you, Vijay. (rock music) So you've been here a couple of months now? >> Yes. >> At the same time, you spent the last seven, eight years really handling AWS. Tell us more about it. >> Yeah, so we spent a lot of time trying to understand, over the last five years at Minjar, you know, how customers are really consuming in this new world for their workloads. So essentially what we tried to do is understand the consumption models and workload patterns, and also build algorithms and apply intelligence to say, how can we lower this cost and, you know, improve compliance of their workloads? And now with Nutanix, what we're trying to do is, how can we converge this consumption, right? Because what happens here is most customers start with on-demand kind of consumption, thinking it's really easy, but the total cost of ownership is so high. As the workload elasticity increases, people go towards spot or auto-scaling, but then you need a lot more automation, which something like Calm can help them with. But as predictability of the workload increases, then you need to move towards reserved instances, right, to lower costs.
>> And those are some of the things that you go and advise on with some of the software that you folks have actually written. >> But there's a lot of waste even in the reserved instances, because what happens is that while customers make these commitments for a year or three years, what we see across, like, we track a billion dollars in public cloud consumption with Beam, is that customers utilize only 20%, 25% of their commitments, right? So how can you really take the consumption data and apply intelligence to essentially reduce their overall cost of ownership? >> You said something that's very telling. You said reserved instances, even though they're supposed to save money, are still only 20%, 25% utilized. >> Yes, because the workloads are very dynamic. And the next thing is you can't do hot-add CPU or hot-add memory, because you're buying them for peak capacity. There's no converged scaling apart from scaling out with another node. >> So you actually sized it for peak, but then using 20%, 30%, you're still paying for the peak. >> That's right. >> Dheeraj: That can actually add up. >> That's what we're trying to say. How can we deliver visibility across clouds? How can we deliver optimization across clouds and consumption models and bring the control while retaining that agility and demand elasticity? >> That's great. So you want to show us something? >> Yeah, absolutely. So this is Beam, as Dheeraj just outlined, our first SaaS service. And this is my first .Next. And you know, glad to be here. So what you see here is the global consumption for a business across different clouds, whether that's in a public cloud like Amazon, or Azure, or Nutanix. We bring the consumption together for the recent month across your accounts and services and apply intelligence to say what is your spend efficiency across these clouds. Essentially there's a lot of intelligence that goes in to detect your workloads and consumption model to say if you're spending $100, how efficiently are you spending? How can you increase that? >> So you have a centralized view where you're looking at multiple clouds, and maybe you can take an example of an account and start looking at it? >> Yes, let's go into a cloud provider, like, for this business, let's go and take a look at what's happening inside an Amazon cloud. Here we get into the deeper details of what's happening with the consumption of specific services as well as the utilization of both on demand and RI. What can you do to lower your cost and track the spend efficiency of a dollar, to see are there resources that are provisioned by teams for applications that are not being used, or are there resources that we should go and rightsize, because we have all this monitoring data and configuration data that we crunch through to basically detect this? >> I think there are billions of events that you look at every day. You're already looking at a billion dollars' worth of AWS spend. >> Right, right. >> So billions of events, billing, metering events every year, to really figure out and optimize for them. >> So what we have here is a very popular international government organization. >> Dheeraj: Wow, so it looks like Russians are everywhere, the cloud is everywhere actually. >> Yes, it's quite popular. So when you bring your master account into Beam, we detect all the linked accounts under that.
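[Editor's aside: the waste Vijay describes is easy to see with back-of-the-envelope arithmetic. The sketch below uses illustrative prices, not actual AWS rates: a hypothetical on-demand rate, a discounted reserved-instance rate, and the 20% utilization figure quoted above.]

```python
# Illustrative math only: the rates are made up, not actual AWS pricing.
HOURS_PER_YEAR = 8760

on_demand_rate = 1.00   # $/hour, hypothetical on-demand price
ri_rate = 0.60          # $/hour effective, hypothetical 1-year RI commitment
utilization = 0.20      # fraction of committed hours actually used (20%)

# You pay for every committed hour whether or not you use it.
annual_commitment = ri_rate * HOURS_PER_YEAR
useful_hours = utilization * HOURS_PER_YEAR

effective_rate = annual_commitment / useful_hours
print(f"Effective cost per useful hour: ${effective_rate:.2f}")  # $3.00
print(f"vs. on-demand: ${on_demand_rate:.2f}")                   # $1.00
```

[At 20% utilization, the "discounted" reservation works out to three times the on-demand rate per hour of real work, which is exactly the owning-versus-renting discontinuity Beam is aimed at.]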
Then you can go and take a look not just at the organization level but also at an account level. >> So these are child objects, you know. >> That's right. >> You can think of them as ephemeral accounts that you create because you don't want to be on the record when you're running spam on Facebook, for example. >> Right, let's go and take a look at what's happening inside a Facebook ad spend account. So we have the consumption of the services. Let's go deeper into compute consumption, and you kind of see a trendline. They're doing a lot of computing. As you see, it looks like one campaign has ended, and they started another campaign. >> Dheeraj: It looks like they're not stopping yet, man. There's a lot of money being made in Facebook right now. (Vijay laughing) >> So you not only get visibility at compute as a service inside a cloud provider, you can go deeper inside compute and say what is the service that I'm really consuming inside compute along with the CPUs and stuff, right? What is my data transfer? What is my network? What are my load balancers? So essentially you get much deeper visibility, as a service, right. Because we have three goals for Beam. How can we deliver visibility across clouds? How can we deliver visibility across services? And how can we then deliver optimization? >> Well, I think one thing that I just want to point out is how this SaaS application was an extremely teachable moment for me to learn about the different resources that people could use in the public cloud. So for all of you who actually have not gone deep enough into the idea of public cloud, this could be a great app for you to learn about things, the resources, things that you could do to save, and security, and things of that nature. >> Yeah. And we really believe in creating the single-pane view to manage your optimization of a public cloud. You know, as Ben spoke about, as a business you need to have freedom to use any cloud. And that's what Beam delivers. How can you make the right decision for the right workload to use any cloud of your choice? >> Dheeraj: How about databases? You talked about compute as well, but are there other things we could look at? >> Vijay: Yes, let's go and take a look at database consumption. What you see here is, inside this Facebook ad spend account, they're using all databases except Oracle. >> Dheeraj: Wow, looks like Oracle sales folks have been active in Russia as well. (Vijay laughing) >> So what we're seeing here is a global view of your spend efficiency, which is kind of a scorecard for your business for the dollars that you're spending. And the great thing is Beam brings it together through its intelligence and algorithms to detect how you can rightsize resources and how you can eliminate things that you're not using. And we deliver a one-click fix, right? Let's go and take a look at resources that are maybe provisioned for storage and not being used. We deliver the seamless one-click philosophy that Nutanix has to eliminate them. >> So one click, you can actually just pick some of these wasteful things that might be looking delightful, because using public cloud, using credit cards, you can go in and just say click fix, and it takes care of things. >> Yeah, and not only remove the resources that are unused, but it can go and rightsize resources across your compute, databases, load balancers, even PaaS services, right?
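[Editor's aside: Beam's internals aren't shown here, but the category of check being demonstrated, finding provisioned-but-unused storage, is straightforward to sketch against a public cloud API. A minimal example using the real boto3 library: an EBS volume whose status is "available" is attached to nothing, so it accrues cost while doing no work. The region is illustrative, and the delete step is left commented out; a tool like Beam would surface these for a one-click fix rather than delete blindly.]

```python
import boto3

# Assumes standard AWS credentials are configured; the region is illustrative.
ec2 = boto3.client("ec2", region_name="us-east-1")

# Volumes in the 'available' state are not attached to any instance:
# they are billed every month but serve no workload.
resp = ec2.describe_volumes(Filters=[{"Name": "status", "Values": ["available"]}])

for vol in resp["Volumes"]:
    print(f"Unattached: {vol['VolumeId']} "
          f"({vol['Size']} GiB, created {vol['CreateTime']:%Y-%m-%d})")
    # The "one-click fix" would be the equivalent of:
    # ec2.delete_volume(VolumeId=vol["VolumeId"])
```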
And this is where the power of it comes in for a business, whether you're using on-prem or off-prem. How can you really converge that consumption across both? >> Dheeraj: So do you have something for Nutanix too? >> Vijay: Yes, so we have basically been working on something for Nutanix that we're going to deliver later this year. As you can see here, we're bringing together the consumption for Nutanix, the services that you're using, the licensing and capacity that is available, and how you can also go and optimize within Nutanix environments >> That's great. >> for the next workload. Now let me quickly show you what we have on the compliance side. This is an extremely powerful thing that we've been working on for many years. What we deliver here, just like in cost governance, is a global view of your compliance across cloud providers. And the most powerful thing is you can go into a cloud provider and get the next level of visibility across cloud regimes for hundreds of policies. Not just policies, but those policies across different regulatory compliances like HIPAA, PCI, CIS. And that's very powerful because-- >> So you're saying a lot of what you folks have done is codified these compliance checks in software to make sure that people can sleep better at night knowing that it's PCI, and HIPAA, and all that compliance actually comes together? >> And you can build this not just by cloud account, you can build them across cloud accounts, which is what we call security centers. Essentially you can go and take a deeper look at things. We do a whole full-body scan of your cloud infrastructure, whether it's Amazon AWS or Azure, and you can go and now, again, click to fix things that have probably been provisioned that are violating the security compliance rules that should be there. Again, we have the same one-click philosophy to say how can you really remove things. >> So again, similar to savings, you're saying you can go and fix some of these security issues by just doing one click. >> Absolutely. So the idea is how can we give people the freedom to get visibility, and use the right cloud, and take decisions instantly through one click. That's what Beam delivers today. And you know, get really excited, it's available at beam.nutanix.com. >> Our first SaaS service, ladies and gentlemen. Thank you so much for doing this, Vijay. It looks like there's going to be a talk here at 10:30. You'll talk more about the midterm elections there probably? >> Yes, so you can go and write your own security compliances as well within Beam, and there are a lot of powerful things you can do. >> Awesome, thank you so much, Vijay. I really appreciate it. (audience clapping) So as you see, there's a lot of work that we're doing to really make multi-cloud work, which is a hard problem. You know, think about the whole body of work: what about cost governance? What about security compliance? Obviously, what about hybrid networks, and security, and storage, and compute, many of the things that you've actually heard from us. But we're taking it to a level where the business users can now understand the implications. A CFO's office can understand the implications of waste and delight. So what does customer success mean to us? You know, again, my favorite word in a long, long time is really to go and figure out how do you make you, the customer, become operationally efficient.
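[Editor's aside: as a concrete illustration of what "codifying a compliance check in software" means, many CIS-style rules reduce to small queries like the one below, which flags security groups that leave a database port open to the whole internet. This is a generic sketch using the real boto3 API, not Beam's implementation; the region and the choice of port are assumptions.]

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # region is illustrative

# One codified rule: no security group should expose port 5432 (PostgreSQL)
# to 0.0.0.0/0. Regimes like PCI or CIS bundle hundreds of such rules.
for sg in ec2.describe_security_groups()["SecurityGroups"]:
    for rule in sg["IpPermissions"]:
        if rule.get("FromPort") == 5432 and any(
            r["CidrIp"] == "0.0.0.0/0" for r in rule.get("IpRanges", [])
        ):
            print(f"VIOLATION: {sg['GroupId']} ({sg['GroupName']}) "
                  "opens PostgreSQL to the internet")
```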
You know, there's a lot of stuff that we deliver through software that goes completely undiscovered. It's so latent, you don't even know you have it, but you've paid for it. So you've got to figure out what it means for you to really become operationally efficient, organizationally proficient. And it's really important for training, education, stuff that your people might think is awkward to do in Nutanix but could've been way simpler if we just told you a place where you can go and read about it. Of course, I can just use one click here as opposed to doing things the old way. But most importantly, to make it financially accountable. So the end in all this is, again, one of the things that I think about all the time in building this company, because obviously there's a lot of stuff that we want to do above the line, on the top line and everything else. There's also a bottom line. Delight and waste are two sides of the same coin. You know, when we're talking about developers who seek delight with public cloud, at the same time you're looking at IT folks who're trying to figure out governance. They're like, look, the CFO's office, the CIO's office, they're trying to figure out how to curb waste. These two things have to go hand in hand in this era of multi-cloud, where we're talking about frictionless consumption but also governance that looks invisible. So I think, at the end of the day, this company will do a lot of stuff around one-click delight but also go and figure out how do you reduce waste, because there's so much waste, including for folks who actually own Nutanix. There's so much software entitlement. There's so much waste in the public cloud itself that if we don't go and put our arms around it, it will not lead to customer success. So to talk more about this, the idea of delight and the idea of waste, I'd like to bring on board a person who, I think, as many of you have actually talked about, has delightful hair but probably wasted jokes. But I think he has wasted hair and delightful jokes. So ladies and gentlemen, you make the call. You're the jury. Sunil R.M.J. Potti. ("Free" by Broods) >> So that was the first time I came out from the bottom of a screen on a stage. I actually now know what it feels like to be a gopher. Who's that laughing loudly at the back? Okay, do we have the... Let's see. Okay, great. We're about 15 minutes late, so that means we're running right on time. That's normally how we roll at this conference. And we have about three customers and four demos, like I think there's about three plus six, about nine folks coming onstage. So we'll have our own version of the parade as well on the main stage for the next 70 minutes. So let's just jump right into it. I think we've been pretty consistent in terms of our long-term plans since we started the company. And it's become a lot clearer over the last few years about our plans to essentially make computing invisible, as Dheeraj mentioned. We're doing this across multiple acts. We started with HCI. We call it making infrastructure invisible. We extended that to making data centers invisible. And then now we're in this mode of essentially extending it to converging clouds so that you can actually converge your consumption models.
And so today's conference, and essentially the theme that you're going to be seeing throughout the breakout sessions, is about a journey towards invisible clouds. But make sure that you internalize the fact that we're investing heavily in each of the three phases. It's not just about the hybrid cloud with Nutanix, it's about actually finishing the job of making infrastructure invisible, expanding that to kind of go after the full data center, and then of course embarking on some real meaningful things around invisible clouds, okay? And to start the session, there's a part I wanted to make sure we're all on the same page about, because most of us in the room are still probably in this phase of the journey, which is about invisible infrastructure. And there, the key products, and especially the two of them that most of you guys know, are Acropolis and Prism. And they're sort of like the bedrock of our company. You know, especially Acropolis, which is about the web-scale architecture. Prism is about consumer-grade design. And Acropolis is now really mature. It's in the seventh year of innovation. We still have more than half of our company, in terms of R and D spend, on Acropolis and Prism. So our core product is still where we think we have a significant differentiation. We're not going to let our foot off the pedal there. You know, every time somebody comes to me and says look, there's a new HCI vendor popping up or an existing HCI vendor out there, I ask a simple question to our customers: show me 100 customers with 100-node deployments. And it will be very hard to find any other vendor out there that does the same thing. And that's the power of Acropolis, the core platform. And then there's the fact that the velocity associated with Acropolis continues to be on a fast pace. We came out with various new capabilities in 5.5 and 5.6, and one of the most complicated things to get right was shrinking our three-node cluster to a one-node, two-node deployment. Most of you actually had requirements on remote office, branch office, or the edge that gave us sort of like the impetus to go design some new capabilities into our core OS to get this out. And associated with Acropolis and expanding into Prism, as you will see, the first couple of years of Prism were all about refactoring the user interface and doing a good job with automation. But more and more of the investment around Prism is going to be based on machine learning. And you've seen some variants of that over the last 12 months, and I can tell you that in the next 12 to 24 months, most of our investments around infrastructure operations are going to be driven by AI techniques, starting with most of our R and D spend also going into machine-learning algorithms. So when you talk about all the enhancements that have come with Prism, whether it be the management console changing to become much more automated, whether now we give you automatic rightsizing, anomaly detection, or a series of functionality that has gone into it, the real core capabilities that we're putting into Prism and Acropolis are probably best served by looking at the quality of the product. You probably have seen this slide before. We started showing the number of nodes shipped by Nutanix two years ago at this conference. It was about 35,000-plus nodes at that time. And since then, obviously, we've continued to grow.
And we would draw this line which was about enterprise-class quality: for the number of bugs found as a percentage of nodes shipped, there's a certain line that's drawn. World-class companies do probably about 2% to 3% in the number of CFDs per node shipped. And we had just broken that number two years ago. And to give you guys an idea of how that curve has shown up, it's now currently at 0.95%. And so along with velocity, this focus on being true to our roots of reliability and stability continues to be, you know, an internal challenge, but it's also one of the things that we keep a real focus on. And so between Acropolis and Prism, those are sort of like our core focus areas, to give us the confidence that look, we have this really high bar that we're keeping ourselves accountable to, which is about being the most advanced enterprise cloud OS on the planet. And we will keep it this way for the next 10 years. And to complement that, over a period of time of course, we've added a series of services. So these are services not just for VMs but also for files, blocks, containers, all being delivered in that single one-click operations fashion. And to really talk more about it, and actually probably to show you the real deal there, it's my great pleasure to call our own version of Moses inside the company; most of you guys know him as Steve Poitras. Come on up, Steve. (audience clapping) (rock music) >> Thanks Sunil. >> You barely fit in that door, man. Okay, so what are we going to talk about today, Steve? >> Absolutely. So when we think about when Nutanix first got started, it was really focused around VDI deployments, smaller workloads. However, over time as we've evolved the product and added additional capabilities and features, that's grown from VDI to business-critical applications as well as cloud native apps. So let's go ahead and take a look. >> Sunil: And we'll start with like Oracle? >> Yeah, that's one of the key ones. So here we can see our Prism Central user interface, and we can see our Thor cluster, obviously speaking to the Avengers theme here. We can see this is doing right around 400,000 IOPS at around 360 microseconds latency. Now obviously Prism Central allows you to manage all of your Nutanix deployments, but this is just running on one single Nutanix cluster. So if we hop over here to our explore tab, we can see we have a few categories. We have some Kubernetes, some AFS, some XenDesktop as well as Oracle RAC. Now if we hop over to Oracle RAC, we're running a SLOB workload here. So obviously with Oracle enterprise applications, performance, consistency, and extremely low latency are very critical. So with this SLOB workload, we're running right around 300 microseconds of latency. >> Sunil: So this is what, how many node Oracle RAC cluster is this? >> Steve: This is a six-node Oracle RAC deployment. >> Sunil: Got it. And so what has gone into the product in recent releases to kind of make this happen? >> Yeah, so obviously on the hardware front, there have been a lot of evolutions in storage media. So with the introduction of NVMe and persistent memory technologies like 3D XPoint, storage media has become a lot faster. Now to allow you to take full advantage of that, that's where we've had to do a lot of optimizations within the storage stack. So with AHV, we have what we call AHV turbo mode, which allows you to take full advantage of those faster storage media at that much lower latency.
And then obviously on the networking front, technologies such as RDMA can be leveraged to optimize that network stack. >> Got it. So that was Oracle RAC running on a Nutanix cluster. It used to be a big deal a couple of years ago. Now we've got many customers doing that. On the same environment, though, what we're going to show you is the advent of actually putting file services in the same scale-out environment. And many of you in the audience probably know about AFS. We released it about 12 to 14 months ago. It's been one of our most popular new products of all time within Nutanix's history. And we had SMB support for user file shares and VDI deployments, and it took a while to bake, to get to scale and reliability. And in the recent release that we just shipped, we now added NFS support so that we can now go after the full-scale file server consolidation. So let's take a look at some of that stuff. >> Yep, let's do it. So hopping back over to Prism, we can see our Thor cluster here. Overall cluster-wide latency is right around 360 microseconds. Now we'll hop down to our file server section. So here we can see our AFS file server hosting right about 16.2 million files. Now if you look at our shares and exports, we can see we have a mix of different shares. So one of the shares that you see there is home directories. This is an SMB share which is actually mapped and being leveraged by our VDI desktops for home folders, user profiles, things of that nature. We can also see this Oracle backup share here, which is exposed to our RAC host via NFS. So RMAN is actually leveraging this to provide native database backups. >> Got it. So Oracle VMs, backup using files, or any other file share requirements with AFS. Do we have the cluster also showing, I know I saw some Kubernetes as well on it. Let's talk about what we're thinking of doing there. >> Yep, let's do it. So if we think about cloud, cloud's obviously a big buzzword, and so are containers and Kubernetes. So with ACS 1.0, what we did is we introduced native support for Docker integration. >> And pause there. And we screwed up. (laughing) So just like the market took a left turn on Kubernetes, obviously we realized that, and now we're working on ACS 2.0, which is what we're going to talk about, right? >> Exactly. So with ACS 2.0, we've introduced native Kubernetes support. Now when I think about Kubernetes, there are really two core areas that come to mind. The first one is around native integration. So with that, we have our Kubernetes volume integration, we're obviously doing a lot of work on the networking front, and we'll continue to push there from an integration point of view. Now the other piece is around the actual deployment of Kubernetes. When we think about a lot of Nutanix administrators or IT admins, they may have never deployed Kubernetes before, so this could be a very daunting task. And true to the Nutanix nature, we not only want to make our platform simple and intuitive, we also want to do this for any ecosystem products. So with ACS 2.0, we've simplified the full Kubernetes deployment, and switching over to our ACS 2.0 interface, we can see this create cluster button. Now this actually pops up a full wizard.
This wizard will actually walk you through the full deployment process, gather the necessary inputs for you, and in a matter of a few clicks and a few minutes, we have a full Kubernetes deployment fully provisioned: the masters, the workers, all the networking fully done for you, very simple and intuitive. Now if we hop back over to Prism, we can see we have this ACS2 Kubernetes category. Clicking on that, we can see we have eight instances of virtual machines. And these are Kubernetes virtual machines which have actually been deployed as part of this ACS2 installer. Now one of the nice things is it makes the IT administrator's job very simple and easy to do. The deployment is straightforward; monitoring and management are very straightforward and simple. Now the developers, the application architects, or engineers, they interface and interact with Kubernetes just like they would traditionally on any platform. >> Got it. So the goal of ACS is to ensure that the developer ecosystem still uses whatever tools they prefer, while at the same time allowing this consolidation of containers along with VMs, all on that same, single runtime, right? So that's ACS. And then if you think about where the OS is going, there's still some open space at the end. And the open space has always been, look, if you just look at a public cloud, you look at blocks, files, containers, and the most obvious storage function that's left is objects. And that's the last horizon for us in completing the storage stack. And we're going to show you for the first time a preview of an upcoming product called the Acropolis Object Storage Services stack. So let's talk a little bit about it and then maybe show the demo. >> Yeah, so just like we provided file services with AFS and block services with ABS, with OSS, or Object Storage Services, we provide native object storage compatibility and capability within the Nutanix platform. Now this provides a very simple, common S3 API. So any integrations you've done with S3, especially Kubernetes, you can actually leverage out of the box when you've deployed this. Now if we hop back over to Prism, I'll go here to my object stores menu. And here we can see we have two existing object storage instances which are running. So you can deploy however many of these as you want to. Now just like the Kubernetes deployment, deploying a new object instance is very simple and easy to do. So here I'll actually name this instance Thor's Hammer. >> You do know he loses it, right? He hasn't seen the movies yet. >> Yeah, I don't want any spoilers yet. So once we've specified the name, we can choose our capacity. So here we'll just specify a large instance type. Obviously this could be any amount of storage. So if you have a 200-node Nutanix cluster with petabytes' worth of data, you could do that as well. Once we've selected that, we'll select our expected performance. And this is going to be the number of concurrent gets and puts, so essentially how many operations per second we want this instance to be able to facilitate. Once we've done that, the platform will actually automatically determine how many virtual machines it needs to deploy as well as the resources and specs for those. And once we've done that, we'll go ahead and click save.
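[Editor's aside: because the object store speaks the common S3 API, as Steve says, any standard S3 client should work against it once it's running. Here's a hedged sketch with the real boto3 library; the endpoint URL, credentials, and bucket name are placeholders, not actual OSS values.]

```python
import boto3

# endpoint_url points at the S3-compatible object store instead of AWS;
# the URL and keys below are placeholders for whatever the instance issues.
s3 = boto3.client(
    "s3",
    endpoint_url="https://objects.example.internal:9000",
    aws_access_key_id="ACCESS_KEY_PLACEHOLDER",
    aws_secret_access_key="SECRET_KEY_PLACEHOLDER",
)

s3.create_bucket(Bucket="demo-bucket")
s3.put_object(Bucket="demo-bucket", Key="hello.txt", Body=b"hello, object store")
print(s3.get_object(Bucket="demo-bucket", Key="hello.txt")["Body"].read())

# Versioning (mentioned in the demo) uses the same standard S3 call:
s3.put_bucket_versioning(
    Bucket="demo-bucket", VersioningConfiguration={"Status": "Enabled"}
)
```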
Now here we can see it's actually going through, doing the deployment of the virtual machines, applying any necessary configuration, and in a matter of a few clicks and a few seconds, we actually have this Thor's Hammer object storage instance up and running. Now if we hop over to one of our existing object storage instances, we can see this has three buckets. So one is for Kafka-queue; I'm actually using this for my Kafka cluster, where I have right around 62 million objects, all storing protobufs. The second one there is Spark. So I actually have a Spark cluster running on our Kubernetes-deployed instance via ACS 2.0. Now this is doing analytics on top of this data using S3 as a storage backend. Now for these objects, we support native versioning, native object encryption, as well as WORM compliance. So if you want to have expiry periods, retention intervals, that sort of thing, we can do all that. >> Got it. So essentially what we've just shown you is that with upcoming objects as well, the same OS can now support VMs, files, objects, containers, all on the same one-click operational fabric. And so that's in some ways the real power of Nutanix: to still keep that consistency and scalability in place as we're covering each and every workload inside the enterprise. So before Steve gets off stage though, I wanted to talk to you guys a little bit about something. You know, how many of you have been to our Nutanix headquarters in San Jose, California? A few. I know there's like, I don't know, 4,000 or 5,000 people here. If you do come to the office, when you land at San Jose Airport, on the way to long-term parking you'll pass our office. It's that close. And if you come to the fourth floor, you know, one of the cubes there is where I sit. In the cube beside me is Steve. Steve sits in the cube beside me. And when I first joined the company, three or four years ago, if you went to Steve's cube, it no longer looks like this, but it used to have a lot of this stuff. It was like big containers of this. I remember the first time I saw it. Once I started joking about it, he started reducing it. And then Steve eventually got married, much to our surprise. (audience laughing) Much to his wife's surprise. And then he also had a baby as a bigger surprise. And if you come over to our office, and we welcome you, and you come to the fourth floor, find my cube or you'll find Steve's cube, it now looks like this. Okay, so thanks a lot, my man. >> Cool, thank you. >> Thanks so much. (audience clapping) >> So single OS, any workload. And like Steve, who's been with us for a while, it's my great pleasure to invite one of our favorite customers, Karen from CSC, who's also been with us for three to four years. And I'll share some fond memories about how she's been with the company for a while, and how as partners we've really done a lot together. So without any further ado, let me bring up Karen. Come on up, Karen. (rock music) >> Thank you for having me. >> Yeah, thank you. So I remember, how many of you guys were with Nutanix for the first .Next in Miami? I know there was a question like that asked last time. Not too many. You missed it. We wish we could go back to that. We wouldn't fit 3/4s of this crowd. But Karen was our first customer in the keynote in 2015. And we had just talked about that story at that time, where you'd just become a customer. Do you want to give us a recap of that? >> Sure.
So when we made the decision to move to hyperconverged infrastructure and chose Nutanix as our partner, we rapidly started to deploy. And what I mean by that is Sunil and some of the Nutanix executives had come out to visit with us and talk about their product on a Tuesday. And on the Wednesday after making the decision, I picked up the phone and said, you know what, I've got to deploy my VDI cluster. So four nodes showed up on Thursday. And from the time it was plugged in to moving over 300 VDIs and 50 terabytes of storage and turning it over to the business for use was less than three days. So it was a really excellent testament to how simple it is to start, and deploy, and utilize the Nutanix infrastructure. Now part of that was the delight that we experienced from our customers after that deployment. So we got phone calls where people were saying, this report used to take so long that I'd go out and get a cup of coffee and come back, and read an article, and do some email, and then finally it would finish. Those reports are running in milliseconds now. It's one click. It's very, very simple, and we've delighted our customers. Now across that journey, we have gone from simple workloads like VDI to much more complex workloads around Splunk and Hadoop. And what's really interesting about our Splunk deployment is we're handling over a billion events being logged every day. And the deployment is smaller than what we had with a three-tiered infrastructure. So when you hear people talk about waste, and getting that out, and getting to an invisible environment where you're just able to run it, that's what we were able to achieve with everything that we're running, from our public-facing websites to the back-office operations, which include Splunk and even, most recently, our Cloudera and Hadoop infrastructure. What that does is it's got 30 crawlers that go out on the internet and start bringing data back. So it comes back with over two terabytes of data every day. And then that environment ingests that data, does work against it, and responds to the business. And that again is something that's smaller than what we had on traditional infrastructure, and it's faster and more stable. >> Got it. And it covers a lot of use cases as well. Do you want to speak a few words on that? >> So the use cases: we're 90%, 95% deployed on Nutanix, and we're covering all of our use cases. So whether that's a customer-facing app or a back-office application. And what our business is doing is handling large portfolios of data for Fortune 500 companies and law firms. And these applications are all running with improved stability, reliability, and performance on the Nutanix infrastructure. >> And the plan going forward? >> So the plan going forward, you actually asked me that in Miami, and it's go global. So when we started in Miami with that first deployment, we had four nodes. We now have 283 nodes around the world, and we started with about 50 terabytes of data. We've now got 3.8 petabytes of data. And we're deployed across four data centers and six remote offices. And people often ask me, what is the value that we achieved? So simplification. It's all just easier, and it's all less expensive. Being able to scale with the business. So our Cloudera environment ended up with one day where it spiked to 1,000 times more load, 1,000 times, and it just responded. We had rally cries around improving productivity by six times.
So 600% improved productivity, and we were able to actually achieve that. The number you just saw on the slide, the one that went by very, very fast, is that we calculated a 40% reduction in total cost of ownership. We've exceeded that. And when we talk about waste, that other number on the board there: when I save the company one hour of maintenance activity or unplanned downtime in a month, and we're now able to do the majority of our maintenance activities without disrupting any of our business solutions, I'm saving $750,000 each time I save that one hour. >> Wow. All right, Karen from CSC. Thank you so much. That was great. Thank you. I mean, you know, some of these data points, frankly, as I started talking to Karen as well as some other customers, are pretty amazing in terms of the genuine value beyond financial value: kind of like the emotional sort of benefits that good products deliver to some of our customers. And I think that's one of the core things that we take back into engineering, to keep ourselves honest on velocity, or quality, or even hiring people and so forth. The more we touch customers' lives, and the more we touch our partners' lives, the more it allows us to ensure that we can put ourselves in their shoes to make sure that we're doing the right thing in terms of the product. So that was the first part, invisible infrastructure. And our goal, as we've always talked about, our true north, is to make sure that this single OS can be an exact replica, a truly modern, thoughtful but original design that brings the power of public cloud, of AWS- or GCP-like architectures, into your mainstream enterprises. And so when we take that to the next level, which is about expanding the scope to go beyond invisible infrastructure to invisible data centers, it starts with a few things. Obviously, it starts with virtualization and a level of intelligent management, extends to automation, and then, as we'll talk about, we have to embark on encompassing the network. And that's what we'll talk about with Flow. But to start this, let me again go back to one of our core products, the bedrock of our opinionated design inside this company, which is Prism and Acropolis. And Prism, as I mentioned, comes with a ton of machine-learning-based intelligence built into the product. In 5.6 we've done a ton of work. In fact, a lot of features are coming out now because PC, Prism Central, as you know, has been decoupled from our mainstream release train and will continue to release on its own cadence. And the same thing when you actually flip it to AHV, on its own train. Now AHV, two years ago it was all about can I use AHV for VDI? Can I use AHV for ROBO? Now I'm pretty clear about where you cannot use AHV. If you need memory overcommit, stay with VMware or something. If you need, you know, Metro, stay with another technology; else it's game on, right? And if you really look at the adoption of AHV in the mainstream enterprise, the customers now speak for themselves. These are all examples of large global enterprises with multimillion-dollar ELAs in play that have now been switched over. I'll give you a simple example here, and there are lots of these, I'm sure many of you in the audience are in this camp, and when you look at the breakout sessions in the pods, you'll get a sense of this. But I'll give you one simple example. If you look at the online payment company, I'm pretty sure everybody's used it at one time or the other.
They had the world's largest private cloud on OpenStack, 21,000 nodes. And they were actually public about it three or four years ago. And in the last year and a half, they put us through a rigorous POC, testing scale, hardening, and it's a full-blown AHV-only stack. And they've started cutting over. Obviously they're not there yet completely, but they're now literally in hundreds of nodes of deployment of Nutanix with AHV as their primary operating system. So it is primetime from a deployment perspective. And with that as the base, no cloud is complete without actually having self-service provisioning that truly drives one-click automation, and can you do that in this consumer-grade design? And Calm was acquired, as you guys know, in 2016. We had a choice with Calm. It was reasonably feature-complete. It supported multiple clouds. It supported ESX, it supported Brownfield, it supported AHV. I mean, they'd already done the integration with Nutanix even before the acquisition. And we had a choice. The choice was to go down the path of DynamicOps or some other products, where you took it for revenue or for acceleration, plopped it into the ecosystem, and sold it as this power-sucking alien on top of our stack, right? Or we took a step back, re-engineered the product, kept some of the core essence like the workflow engine, which was good, the automation, the object model and all, but refactored it to make it look like a natural extension of our operating system. And that's what we did with Calm. And we just launched it in December, and it's been one of our most popular new products, now flying off the shelves. If you saw the number of registrants, I got a notification of this, for the breakout sessions, the number one preregistered session, with over 500 people, the first two sessions actually, are around Calm. And justifiably so, because it lives up to its promise, and it'll take its time to get to all the bells and whistles, all the capabilities that have come through with AHV or Acropolis in the past. But the feature functionality, the product-market fit associated with Calm, is dead on from the feedback that we receive. And so Calm itself is on its own rapid cadence. We had AWS and AHV in the first release. Three or four months later, we added ESX support. We added GCP support and a whole bunch of other capabilities, and I think the essence of Calm is that if you can combine Calm with private cloud automation but also extend it to multi-cloud automation, it really sets Nutanix on its first genuine path towards multi-cloud. But then, as I said, if you really fixate on a software-defined data center message, we're not complete as a full-blown AWS- or GCP-like IaaS stack until we do the last horizon of networking. And you've probably heard me say this before. You heard Dheeraj and others talk about it before: our problem in networking isn't the same as in storage, because the data plane in networking works. Good L2 switches from Cisco, Arista, and so forth. The real problem in networking is in the control plane. When something goes wrong at a VM level in Nutanix, you're able to identify whether it's a storage problem or a compute problem, but we don't know whether it's a VLAN that's misconfigured or there've been some packets dropped at the top of the rack. Well, that all ends now with Flow.
And with Flow, essentially what we've now done is take the work that we've been doing to create built-in visibility, put in some network automation so that you can actually provision VLANs when you provision VMs, and then augment it with micro-segmentation policies, all built in this easy-to-use, easy-to-consume fashion. But we didn't stop there, because we've been talking about Flow, at least the capabilities, over the last year. We spent significant resources building it. But we realized that we needed an additional thing to augment its value, because the world of applications, especially discovering application topologies, is a hard problem. And if we didn't address that, we wouldn't be fulfilling this ambition of providing one-click network segmentation. And so that's where Netsil comes in. Netsil might seem on the surface like yet another next-generation application performance management tool. But the innovations that came from Netsil started off as a research project at the University of Pennsylvania. And in fact, most of the team that's at Nutanix right now is from the U Penn research group. And they took a really original, fresh look at how do you sit in a network in a scale-out fashion but still reverse engineer the packets, the flows through you, and then recreate the application topology. And recreate this not just on Nutanix, but do it seamlessly across multiple clouds. And to talk about the power of Flow augmented with Netsil, let's bring Rajiv back on stage. Rajiv. >> How you doing? >> Okay, so we're going to start with some Netsil stuff, right? >> Yeah, let's talk about Netsil and some of the amazing capabilities this acquisition's bringing to Nutanix. First of all, as you mentioned, Netsil's completely non-invasive. So it installs on the network, and it does all its magic from there. There are no host agents, none of the complexity and compatibility issues that entails. It's also monitoring the network at layer seven. So it's actually doing deep packet inspection on all your application data, and it can give you insights into services and APIs, which is very important for modern applications and the way they behave. To do all this, of course, performance is key. So Netsil's built around a completely distributed architecture that scales to really large workloads. Very exciting technology. We're going to use it in many different ways at Nutanix. And to give you a flavor of that, let me show you how we're thinking of integrating Flow and Netsil together, so micro-segmentation and Netsil. So to do that, we installed Netsil in one of our Google accounts. And that's what's up here now. It went out there, it discovered all the VMs we're running on that account, and it created a map, essentially, of all their interactions, and you can see it's like a Google Maps view. I can zoom into it. I can look at various things running. I can see lots of HTTP servers over here, some databases. >> Sunil: And it also has stats, right? You can go, it actually-- >> It does. We can take a look at that for a second. There are some stats you can look at right away here, things like transactions per second and latencies and so on. But if I wanted to micro-segment this application, it's not really clear how to do so. There's no real pattern over here. Taking the Google Maps analogy a little further, this kind of looks like the backstreets of Cairo or something. So let's do this step by step. Let me first filter down to one application. Right now I'm looking at about three or four different applications.
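[Editor's aside: to make the "reverse engineer the flows into a topology" idea concrete before the demo continues, here is a minimal sketch of the aggregation step. The flow records and the tag map are invented sample data; a real system like Netsil derives them from packet inspection and cloud metadata.]

```python
from collections import Counter

# Invented sample flows observed on the wire: (source host, destination host).
flows = [
    ("web-01", "auth-01"), ("web-02", "auth-01"),
    ("web-01", "db-01"),   ("lb-01", "web-01"),
    ("lb-01", "web-02"),   ("auth-01", "db-01"),
]

# Invented metadata tags mapping each host to its application tier.
tier_of = {
    "lb-01": "load-balancer", "web-01": "web", "web-02": "web",
    "auth-01": "api", "db-01": "database",
}

# Roll host-to-host flows up into tier-to-tier edges; these edge counts are
# what gets drawn as the simplified tier-level map in the demo.
edges = Counter((tier_of[src], tier_of[dst]) for src, dst in flows)
for (src_tier, dst_tier), n in edges.items():
    print(f"{src_tier} -> {dst_tier}: {n} flow(s)")
```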
And Netsil integrates with the metadata that the clouds provide. So I can search all the tags that I have. So by doing that, I can zoom in on just the financial application. And when I do this, the view gets a little bit simpler, but there's still no real pattern. It's not clear how to micro-segment this, right? And this is where the power of Netsil comes in. This is a fairly naive view. This is what a tool operating at layer four, just looking at ports and TCP traffic, would give you. But by doing deep packet inspection, Netsil can get into the services layer. So instead of grouping these interactions by hostname, let's group them by service. So we group by service tier. And now you can see this is a much simpler picture. Now I have some patterns. I have a couple of load balancers, an HAProxy and an NGINX. I have a web application front end. I have some application servers running authentication services, search services, et cetera, a database, and a database replica. I could go ahead and micro-segment at this point. It's quite possible to do it at this point. But this is almost too granular a view. We don't usually want to micro-segment at the individual service level. You think more in terms of application tiers, the tiers that different services belong to. So let me go ahead and group this differently. Let me group this by app tier. And when I do that, a really simple picture emerges. I have a load balancing tier talking to a web application front end tier, an API tier, and a database tier. Four tiers in my application. And this is something I can work with. This is something that I can micro-segment fairly easily. So let's switch over to-- >> Before we do that though, do you guys see how he gave himself the pseudonym Dom Toretto? >> Focus Sunil, focus. >> Yeah, for those guys, you know that's not the Avengers theme, man, that's the Fast and Furious theme. >> Rajiv: I think we're a year ahead. This is next year's theme. >> Got it, okay. So before we cut over from Netsil to Flow, do we want to say a few words about the power of Flow and what's available in 5.6? >> Sure, so Flow's been around since the 5.6 release. Actually, some of the functionality came in before that. So it's got visibility into the network. It helps you debug problems with VLANs and so on. We had a lot of orchestration with third-party vendors, with load balancers, with switches, to make provisioning much simpler. And then of course with our most recent release, we GA'ed our micro-segmentation capabilities. And that of course is the most important feature we have in Flow right now. And if you look at how Flow policy is set up, it looks very similar to what we just saw with Netsil. So we have a load balancer talking to a web app, API, database. It's almost identical to what we saw just a moment ago. So while this policy was created manually, it is something that we can automate, and it is something that we will do in future releases. Right now, of course, it's not been integrated at that level yet. So this was created manually. So one thing you'll notice over here is that the database tier doesn't get any direct traffic from the internet. All internet traffic goes to the load balancer; only specific services then talk to the database. So this policy right now is in monitoring mode. It's not actually being enforced. So let's see what happens if I try to attack the database. I'll start a hack against the database. And I have my trusty brute force password script over here.
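[Editor's aside: a dictionary-attack script of the kind being demoed is only a few lines. This sketch assumes a PostgreSQL target at a made-up address and uses the real psycopg2 driver; the host, user, and database names are invented. Once Flow's policy is enforced, these connection attempts simply time out rather than failing authentication at the database.]

```python
import psycopg2

# Target address, user, and database are invented for the sketch.
HOST, USER, DB = "10.0.0.5", "admin", "finance"
COMMON_PASSWORDS = ["password", "admin", "123456", "postgres", "letmein"]

for pw in COMMON_PASSWORDS:
    try:
        conn = psycopg2.connect(host=HOST, user=USER, password=pw,
                                dbname=DB, connect_timeout=3)
        print(f"logged in with '{pw}'")  # weak or default credential found
        conn.close()
        break
    except psycopg2.OperationalError as err:
        # Auth failure while the policy only monitors; once enforcement
        # is on, this surfaces as a connection timeout instead.
        print(f"'{pw}' failed: {err}")
```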
It's trying the most common passwords against the database. And if someone happened to choose a dictionary word or left the default password on, eventually it will log into the database. And when I go back over here in Flow, what happens is it actually detects there's now an ongoing flow, a flow that's outside of policy, that's shown up. And it shows this in yellow. So right alongside the policy, I can visualize all the noncompliant flows. This makes it really easy for me now to make decisions: should this flow be part of the policy, or should it not? In this particular case, obviously it should not be part of the policy. So let me just switch from monitoring mode to enforcement mode. I'll apply the policy, give it a second to propagate. The flow goes away. And if I go back to my script, you can see now the socket's timing out. I can no longer connect to the database. >> Sunil: Got it. So that's like one-click segmentation in play right now? >> Absolutely. It's really, really simple. You can compare it to other products in the space. You can't get simpler than this. >> Got it. Why don't we go back and talk a little bit more about, so that's Flow. It's shipping now in 5.6, obviously. It'll come integrated with Netsil functionality, as well as a variety of other enhancements, in the next few releases. But Netsil does more than just simple topology discovery, right? >> Absolutely. So Netsil's actually gathering a lot of metrics from your network and from your hosts; all this goes through a data pipeline. It gets processed over there and then gets captured in a time series database. And then we can slice and dice that in various different ways. It can be used for all kinds of insights. So let's see how our application's behaving. So let me say I want to go into the API layer over here. And I instantly get a variety of metrics on how the application's behaving. I get the most requested endpoints. I get the average latency. It looks reasonably good. I get the average latency of the slowest endpoints. If I was having a performance problem, I would know exactly where to go focus. Right now, things look very good, so we won't focus on that. But scrolling back up, I notice that we have a fairly high error rate happening. 11.35% of our HTTP requests are generating errors, and that deserves some attention. And if I scroll down again, I see the top five status codes I'm getting; almost 10% of my requests are generating 500 errors, HTTP 500 errors, which are internal server errors. So there's something going on that's wrong with this application. So let's dig a little bit deeper into that. Let me go into my analytics workbench over here. And what I've plotted over here is how my HTTP requests are behaving over time. Let me filter down to just the 500s. That will make it easier. And I'll also group this by the service tier, so that I can see which services are causing the problem. And the better view for this would be a bar graph. Yes, so once I do this, you can see that all the 500 errors that we're seeing have been caused by the authentication service. So something's obviously wrong with that part of my application. I can go look at whether Active Directory is misbehaving and so on. So very quickly, from a broad problem, a high HTTP error rate, and in fact, usually you will discover this because a customer is complaining about a lot of errors happening in your application, you can narrow down to exactly what the cause was. >> Got it.
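[Editor's aside: the narrowing-down step just walked through, filter to the 500s, then group by service tier, is essentially one group-by over the request records. A minimal sketch on invented sample data:]

```python
from collections import Counter

# Invented request records: (service tier, HTTP status code).
requests = [
    ("web", 200), ("api", 200), ("auth", 500), ("auth", 500),
    ("db", 200), ("auth", 500), ("web", 200), ("search", 200),
]

# Filter down to the 500s, then group by tier, mirroring the workbench demo.
errors_by_tier = Counter(tier for tier, status in requests if status == 500)

error_rate = sum(errors_by_tier.values()) / len(requests)
print(f"overall error rate: {error_rate:.1%}")
for tier, n in errors_by_tier.most_common():
    print(f"{tier}: {n} HTTP 500s")  # the culprit tier tops the list
```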
This is what we mean by hyperconvergence of the network: if you can truly isolate network-related problems and associate them with the rest of the hyperconverged infrastructure, then we've essentially started making real progress towards the next level of hyperconvergence. Anyway, thanks a lot, man. Great job. >> Thanks, man. (audience clapping) >> So to talk about this evolution from invisible infrastructure to invisible data centers, here's another customer of ours that has embarked on this journey. And it's not just using Nutanix but a variety of other tools to actually fulfill the ambition of a full-blown cloud stack within a financial organization. And to talk more about that, let me call Vijay onstage. Come on up, Vijay. (rock music) >> Hey. >> Thank you, sir. So Vijay looks way better in real life than in a picture, by the way. >> Except a little bit of gray. >> Unlike me. So tell me a little bit about this cloud initiative. >> Yeah. So we've won the best cloud initiative award twice now, hosted by Incisive Media, a large magazine. Basically, they host a bunch of, you know, various buy-side and sell-side firms, and you can submit projects in various categories. So we've won the best cloud award twice now, in 2015 and 2017. The 2015 award is when, as part of our private cloud journey, we were laying the foundation for our private cloud, which is 100% based on hyperconverged infrastructure. So that was that award. And then in 2017, we built on that foundation and built more developer-centric, next-gen app services like PaaS, CaaS, SDN, SDS, CI/CD, et cetera. So we built a lot of those services on top, and the second award was really related to that. >> Got it. And a lot of this was obviously based on an infrastructure strategy with some guiding principles that you guys laid out about three or four years ago, if I remember. >> Yeah, this is a great slide. I use it very often. At the core of our infrastructure strategy is how do we run IT as a business? I talk about this with my teams; they're very familiar with this. That's the mindset that I instill within the teams. The mission, the challenge, is the same, which is how do we scale infrastructure while reducing total cost of ownership, improving time to market, improving client experience, and, while we're doing that, not lose sight of reliability, stability, and security? That's the mission. Those are some of our guiding principles. Whenever we take on some large technology investments, we take them through those lenses. Obviously Nutanix went through those lenses when we invested in you guys many, many years ago. And you guys checked all the boxes. And you know, initiatives change year on year; the mission remains the same. And more recently, in the last few years, we've been focused on converged platforms, converged teams. We've actually reorganized our teams and aligned them closer to the platforms, moving closer to an SRE-like concept. >> And then you've built out a full stack now, across compute, storage, networking, all the way with various use cases in play? >> Yeah, and we're aggressively moving towards PaaS and CaaS as our method of either developing brand new cloud native applications or even containerizing existing applications. So the stack, obviously built on Nutanix: SDS for software-defined storage; on compute and networking, we've got SDN turned on. We've got, again, PaaS and CaaS built on this platform. And then finally, we've hooked our CI/CD tooling onto this.
And again, the big picture was always frictionless infrastructure, which we're very close to now. You know, 100% of our code deployments into this environment are automated. >> Got it. And so what's the net-net in terms of, obviously, the business takeaway here? >> Yeah, so at Northern we don't do tech for tech. There have to be business benefits, client benefits. There have to be outcomes that we measure ourselves against, and these are some great metrics, or great ways to look at whether we're getting the outcomes from the investments we're making. So for example, infrastructure scale while reducing total cost of ownership. We're very focused on total cost of ownership. For example, there was a build team that was very focused on building servers and deploying applications. That team's gone down from, I think, 40, 45 people to about 15 people, as one example, one metric. Another metric for reducing TCO is we've been able to absorb additional capacity without increasing operating expenses. So you're actually building capacity and scale within your operating model. So that's another example. Another example, right here you see on the screen: faster time to market. We've got various types of applications that we're deploying at any given point. There are next-gen cloud native apps which go directly on PaaS. But then a majority of the applications still need the traditional IaaS components. The time to market to deploy a complex multi-environment, multi-data-center application, we've taken that down by 60%. So we can deliver a server the same day, but we can also deliver entire environments, you know, added to backup, added to DNS, and fully compliant, within a couple of weeks, which is something we measure very closely. >> Great job, man. I mean, those are compelling results, I think. And in the journey, obviously, you got promoted a few times. >> Yep. >> All right, congratulations again. >> Thank you. >> Thanks Vijay. >> Hey Vijay, come back here. Actually, we forgot our joke. I was so razzled by his data points there. So you're supposed to wear some shoes, right? >> I know, my inner glitch. I was going to wear those sneakers, but I forgot them at the office, maybe for the right reasons. But the story behind those fluorescent sneakers, and I see they're focused on my shoes: I picked those up two years ago at a .Next event, and they're not my style. I took them to my office, and they've been sitting in my office for the last couple of years. >> Who's received shoes like these, by the way? I'm sure you guys have received shoes like these. There are some real fans there. >> So again, I'm sure many of you liked them. I had them in my office. I've offered them to so many of my engineers. Are you size 11? Do you want these? And they're still unclaimed. >> So that's the only feature of Nutanix that you-- >> That's the only thing that hasn't worked; other than that, things are going extremely well. >> Good job, man. Thanks a lot. >> Thanks. >> Thanks Vijay. So as we get to the final phase, which is obviously as we embark on this multi-cloud journey and the complexity that comes with it, which Dheeraj hinted at in his session, we have to take a cautious, thoughtful approach here, because we don't want to overset expectations, because this will take us five, 10 years to really do a good job like we've done in the first act. And the good news is that the market is also really, really early here. It's just a fact.
And so we've taken a tiered approach to it, as we'll start the discussion with multi-cloud operations, and we've talked about the stack in the prior session, which is about looking across new clouds. So it's no longer Nutanix, Dell, Lenovo, HP, Cisco as the new quote, unquote platforms. It's Nutanix, Xi, GCP, AWS, Azure as the new platforms. That's how we're designing the fabric going forward. On top of that, you obviously have the hybrid OS both on the data plane side and control plane side. Then what you're seeing with the advent of Calm doing a marketplace and automation as well as Beam doing governance and compliance is the fact that you'll see more and more such capabilities of multi-cloud operations burnt into the platform. An example of that is Calm with the new 5.7 release that they had, which supports multiple clouds both inside and outside, but the fundamental premise of Calm in the multi-cloud use case is to enable you to choose the right cloud for the right workload. That's the automation part. On the governance part, and this we kind of went through in the last half an hour with Dheeraj and Vijay on stage, is something that's even more, if I can call it, you know, first order, because you get the provisioning and operations second. The first order is to say, look, whatever my developers have consumed off public cloud, I just need to first get our arms around it to make sure that, you know, what am I spending, am I secure, and then when I get comfortable, then I am able to actually expand on it. And that's the power of Beam. And both Beam and Calm will be the yin and yang for us in our multi-cloud portfolio. And we'll have new products to complement that down the road, right? But along the way, that's the whole private cloud, public cloud. They're the two ends of the barbell, and over time, and we've been working on Xi for a while, is this conviction that we've built talking to many customers that there needs to be another type of cloud. And this type of a cloud has to feel like a public cloud. It has to be architected like a public cloud, be consumed like a public cloud, but it needs to be an extension of my data center. It should not require any changes to my tooling. It should not require any changes to my operational infrastructure, and it should not require lift and shift, and that's a super hard problem. And this problem is something that a chunk of our R and D team has been burning the midnight oil on for the last year and a half. Because look, this is not about taking our current OS, which does a good job of scaling, and plopping it into an Equinix or a third party data center and calling it a hybrid cloud. This is about rebuilding things in the OS so that we can deliver a true hybrid cloud, but at the same time, give that functionality back on premises so that even if you don't have a hybrid cloud, if you just have your own data centers, you'll still need new services like DR. And if you think about it, what are we doing? We're building a full blown multi-tenant virtual network designed in a modern way. Think about this as SDN 2.0, because we have 10 years worth of looking backwards on how GCP has done it, or how Amazon has done it, and now sort of embodying some of that so that we can actually give it as part of this cloud, but do it in a way that's a seamless extension of the data center, and then at the same time, provide new services that have never been delivered before. Everyone obviously does failover and failback in DR; it just takes months to do it.
Our goal is to do it in hours or minutes. But even things such as test. Imagine doing a DR test on demand for your business needs in the middle of the day. And that's the real bar that we've set for Xi that we are working towards, in early access later this summer with GA later in the year. And to talk more about this, let me invite some of our core architects working on it, Melina and Rajiv. (rock music) Good to see you guys. >> You're messing up the names again. >> Oh Rajiv, Vinny, same thing, man. >> You need to back up your memory from Xi. >> Yeah, we should. Okay, so what are we going to talk about, Vinny? >> Yeah, exactly. So today we're going to talk about how Xi is pushing the envelope and going beyond the state of the art, as you were saying, in the industry. As part of that, there's a whole bunch of things that we have done, starting with taking a private cloud, seamlessly extending it to the public cloud, and then creating a hybrid cloud experience with one-click delight. We're going to show that. We've done a whole bunch of engineering work on making sure the operations and the tooling are identical on both sides. When you graduate from a private cloud to a hybrid cloud environment, you don't want the environments to be different. So we've copied the environment for you with zero manual intervention. And finally, building on top of that, we are delivering DR as a service with unprecedented simplicity, with one-click failover and one-click failback. We're going to show you one-click test today. So Melina, why don't we start with showing how you go from a private cloud and seamlessly extend it to consume Xi. >> Sounds good, thanks Vinny. Right now, you're looking at my Prism interface for my on premises cluster. In one-click, I'm going to be able to extend that to my Xi cloud services account. I'm doing this using my My Nutanix credentials and a password manager. >> Vinny: So here, as you notice, for all the Nutanix customers we have today, we have created an account for them in Xi by default. So you don't have to log in somewhere and create an account. It's there by default. >> Melina: And just like that we've gone ahead and extended my data center. But let's go take a look at the Xi side and log in again with my My Nutanix credentials. We'll see what we have over here. We're going to be able to see two availability zones, one for on premises and one for Xi right here. >> Vinny: Yeah, as you see, using a login account that you already know, mynutanix.com, and 30 seconds in, you can see that you have a hybrid cloud view already. You have a private cloud availability zone, that's your own Prism Central data center view, and then a Xi availability zone. >> Sunil: Got it. >> Melina: Exactly. But of course we want to extend my network connection from on premises to my Xi networks as well. So let's take a look at our options there. We have two ways of doing this. Both are one-click experiences. With direct connect, you can create a dedicated network connection between both environments, or with VPN you can use the public internet and a VPN service. Let's go ahead and enable VPN in this environment. Here we have two options for how we want to enable our VPN. We can bring our own VPN and connect it, or we will deploy a VPN for you on premises. We'll do the option where we deploy the VPN in one-click.
>> And this is another small sign or feature that we're building net new as part of Xi, but it will be burned into our core Acropolis OS so that we can also be delivering this as a standalone product for on premises deployment as well, right? So that's one of the other things to note as you guys look at the Xi functionality. The goal is to keep the OS capabilities the same on both sides. So even if I'm building a quote, unquote multi data center cloud, but it's just a private cloud, you'll still get all the benefits of Xi but in house. >> Exactly. And on this second step of the wizard, there are a few inputs around how you want the gateway configured, your VLAN information, and routing and protocol configuration details. Let's go ahead and save it. >> Vinny: So right now, you know, what's happening is we're taking the private network that our customers have on premises and extending it to a multi-tenant public cloud such that our customers can use their IP addresses, the subnets, and bring their own IP. And that is another step towards making sure the operation and tooling is kept consistent on both sides. >> Melina: Exactly. And just while you guys were talking, the VPN was successfully created on premises. And we can see the details right here. You can track details like the status of the connection, the gateway, as well as bandwidth information right in the same UI. >> Vinny: And networking is just the tip of the iceberg of what we've had to work on to make sure that you get a consistent experience on both sides. So Melina, why don't we show some of the other things we've done? >> Melina: Sure. To talk about how we preserve entities from my on-premises environment to Xi, it's better to use my production environment. And the first thing you might notice is the login screen's a little bit different. But that's because I'm logging in using my ADFS credentials. The first thing we preserved was our users. In production, I'm running AD, obviously, on-prem. And now we can log in here with the same set of credentials. Let me just refresh this. >> And this is the Active Directory credential that our customers would have. They use it on-premises. And we allow the setting to be set on the Xi cloud services as well, so it's the same set of users that can access both sides. >> Got it. There's always going to be some networking problem onstage. It's meant to happen. >> There you go. >> Just launching it again here. I think it maybe timed out. This is a good sign that we're running on time with this presentation. >> Yeah, yeah, we're running ahead of time. >> Move the demos quicker, then we'll time out. So essentially when you log into Xi, you'll be able to see what are the environment capabilities that we have copied to the Xi environment. So for example, you just saw that the same user is being used to log in. But after the user logs in, you'll be able to see their images, for example, copied to the Xi side. You'll be able to see their policies and categories. You know, when you define these policies on premises, you spend a lot of effort to create them. And now when you're extending to the public cloud, you don't want to do it again, right? So we've done a whole lot of syncing mechanisms, making sure that the two sides are consistent. >> Got it. And on top of these policies, the next step is to also show capabilities to actually do failover and failback, but also do integrated testing as part of this capability.
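To make the flow Melina just clicked through a little more concrete: pairing availability zones, deploying a VPN gateway on premises, and syncing users and policies could, in principle, be driven programmatically as well as from the UI. The sketch below is purely illustrative; the host name, endpoint paths, and field names are invented stand-ins for whatever the real Xi interface exposes, not documented API calls.

```python
# Hypothetical sketch only: the BASE host, endpoints, and JSON fields
# are assumptions for illustration, not the actual Xi API surface.
import requests

BASE = "https://xi.example.com/api/v1"      # placeholder endpoint
AUTH = ("user@mynutanix.com", "password")   # placeholder credentials

# Pair the on-prem availability zone (Prism Central) with a Xi AZ.
pairing = requests.post(f"{BASE}/az-pairs", auth=AUTH, json={
    "on_prem_az": "PC-SanJose",
    "xi_az": "xi-us-west-1",
})
pairing.raise_for_status()
pair_id = pairing.json()["id"]

# "One-click" VPN: ask the service to deploy a gateway on premises.
# The wizard inputs from the demo become fields in the request body.
vpn = requests.post(f"{BASE}/vpn-gateways", auth=AUTH, json={
    "pair_id": pair_id,
    "deploy_on_prem": True,      # versus bring-your-own VPN
    "vlan": 120,                 # VLAN and routing details from the wizard
    "routing_protocol": "BGP",
})
vpn.raise_for_status()

# Poll the same details the UI shows: connection status and bandwidth.
status = requests.get(f"{BASE}/vpn-gateways/{vpn.json()['id']}",
                      auth=AUTH).json()
print(status["connection_state"], status["bandwidth_mbps"])
```

The point of the sketch is the shape of the workflow, not the specific calls: one request to pair the two availability zones, one to stand up the tunnel, and the same status fields the UI surfaces made available to scripts.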
>> So one is, you know, just the basic job of making the environments consistent on the two sides, but then it's also now talking about the data part, and that's what DR is about. So if you have a workload running on premises, we can take the data and replicate it using your policies that we've already synced. Once the data is available on the Xi side, at that point, you have to define a run book. And the run book is essentially a recovery plan. And that says, okay, I already have the backups of my VMs in case of disaster. I can take my recovery plan and hit, you know, either failover or maybe a test. And then my application comes up. First of all, you'll talk about the boot order for your VMs to come up. You'll talk about network mapping. Like when I'm running on-prem, you're using a particular subnet. You have an option of using the same subnet on the Xi side. >> Melina: There you go. >> What happened? >> Sunil: It's finally working? >> Melina: Yeah. >> Vinny, you can stop talking. (audience clapping) By the way, this is logging into a live Xi data center. We have two regions: West Coast, two data centers; East Coast, two data centers. So everything that you're seeing is essentially coming off the mainstream Xi profile. >> Vinny: Melina, why don't we show the recovery plan. That's the most interesting piece here. >> Sure. The recovery plan is set up to help you specify how you want to recover your applications in the event of a failover or a test failover. And it specifies all sorts of details, like the boot sequence for the VMs as well as network mappings. Some of the network mappings are things like the production network I have running on premises and how it maps to my production network on Xi, or the test network to the test network. What's really cool here though is we're actually automatically creating your subnets on Xi from your on premises subnets. All that's part of the recovery plan. While we're on the screen, take a note of the .100 IP address. That's a floating IP address that I have set up to ensure that I'm going to be able to access my three tier web app that I have protected with this plan after a failover. So I'll be able to access it from the public internet really easily from my phone, or check that it's all running. >> Right, so given how we make the environment consistent on both sides, now we're able to create a very simple DR experience, including failover in one-click and failback. But we're going to show you test now. So Melina, let's talk about test, because that's one of the most common operations you would do. Like some of our customers do it every month. But usually it's very hard. So let's see what the experience looks like in what we built. >> Sure. Test and failover are both one-click experiences, as you've come to expect from Nutanix. You can see it's failing over from my primary location to my recovery location. Now what we're doing right now is we're running a series of validation checks, because we want to make sure that you have your network configured properly, and there are other configuration details in place for the test to be successful. Looks like the failover was initiated successfully. Now while that failover's happening though, let's make sure that I'm going to be able to access my three tier web app once it fails over. We'll do that by looking at my network policies that I've configured on my test network. Because I want to access the application from the public internet, but only on port 80.
And if we look here under our policies, you can see I have port 80 open to permit. So that's good. And if I needed to create a new one, I could in one click. But it looks like we're good to go. Let's go back and check the status of my recovery plan. We click in, and what's really cool here is you can actually see the individual tasks as they're being completed, from that initial validation test to individual VMs being powered on as part of the recovery plan. >> And to give you guys an idea behind the scenes, the entire recovery plan is actually a set of workflows that are built on Calm's automation engine. So this is an example of where we're taking some of the power of workflow and automation that Calm has come to be really strong at and burning that into how we actually operationalize many of these workflows for Xi. >> And so great, while you were explaining that, my three tier web app has restarted here on Xi right in front of you. And you can see here there's a floating IP that I mentioned earlier, that .100 IP address. But let's go ahead and launch the console and make sure the application started up correctly. >> Vinny: Yeah, so that .100 IP address is a floating IP that's a publicly visible IP. So it's listed here, 206.80.146.100. And essentially anybody in the audience here can go use your laptop or your cell phone and hit that and start to work. >> Yeah, so by the way, just to give you guys an idea while you guys maybe use the IP to kind of hit it, this is a real set of VMs that we've just failed over from Nutanix's corporate data center into our West region. >> And this is running live on the Xi cloud. >> Yeah, you guys should all go and vote. I'm a little biased towards Xi, so vote for Xi. But all of them are really good features. >> Scroll up a little bit. Let's see where Xi is. >> Oh, Xi's here. I'll scroll down a little bit, but keep the... >> Vinny: Yes. >> Sunil: You guys written a blog or something? >> Melina: Oh good, it looks like Xi's winning. >> Sunil: Okay, great job, Melina. Thank you so much. >> Thank you, Melina. >> Melina: Thanks. >> Thank you, great job. Cool and calm under pressure. That's good. So that was Xi. What's something that, you know, we've been doing in addition to taking, say, our own extended enterprise public cloud with Xi? You know, we do recognize that there are a ton of workloads that are going to be residing on AWS, GCP, Azure. And we want to really assist in the, shall we call it, transformation of enterprises to choose the right cloud for the right workload. If you guys remember, we actually invested in a tool over the last year which became actually quite like one of those products that took off based on, you know, groundswell movement. Most of you guys started using it. It's essentially Xtract for VMs. And it was this product that's obviously free. It's a tool. But it enables customers to really save tons of time to actually migrate from legacy environments to Nutanix. So we took that same framework, obviously re-platformed it for the multi-cloud world, to kind of solve the problem of migrating from AWS or GCP to Nutanix or vice versa. >> Right, so you know, Sunil, as you said, moving from a private cloud to the public cloud is a lift and shift, and it's a hard, you know, operation. But moving back is not only expensive, it's a very hard problem. None of the cloud vendors provide change block tracking capability.
And what that means is when you have to move back from the cloud, you have an extended period of downtime because there's no way of figuring out what's changing while you're moving. So you have to keep it down. So what we've done with our app mobility product is we have made sure that, one, it's extremely simple to move back. Two, that the downtime that you'll have is as small as possible. So let me show you what we've done. >> Got it. >> So here is our app mobility capability. As you can see, on the left hand side we have a source environment and target environment. So I'm calling my AWS environment Asgard. And I can add more environments. It's very simple. I can select AWS and then put in my credentials for AWS. It essentially goes and discovers all the VMs that are running in all the regions that they're running in. Target environment, this is my Nutanix environment. I call it Earth. And I can add a target environment similarly, IP address and credentials, and we do the rest. Right, okay. Now migration plans. I have Bifrost 1 as my migration plan, and this is how migration works. First you create a plan and then say start seeding. And what it does is it takes a snapshot of what's running in the cloud and starts migrating it to on-prem. Once it is on-prem and the difference between the two sides is minimal, it says I'm ready to cutover. At that time, you move it. But let me show you how you'd create a new migration plan. So let me name it, Bifrost 2. Okay, so what I have to do is select a region, so US West 1, and target Earth as my cluster. This is my storage container there. And very quickly you can see these are the VMs that are running in US West 1 in AWS. I can select SQL Server one and two, go to next. Right now it's looking at the target Nutanix environment and seeing if it has enough space or not. Once that's good, it gives me an option. And this is the step where it enables the Nutanix service of change block tracking overlaid on top of the cloud. There are two options: one is automatic, where you'll give us the credentials for your VMs, and we'll inject our capability there. Or you could do it manually. You could copy the command on either a Windows VM or a Linux VM and run it once on the VM. And change block tracking is enabled from then on. Everything is seamless after that. Hit next. >> And while Vinny's setting it up, he said a few things there. I don't know if you guys caught it. One of the hardest problems in enabling seamless migration from public cloud to on-prem, which makes it harder than the other way around, is the fact that public cloud doesn't have things like change block tracking. You can't get delta copies. So one of the core innovations being built in this app mobility product is to provide that overlay capability across multiple clouds.
Okay, so we go back to this, and we can hit cutover. So this is essentially telling our system, okay, now is the time. Quiesce the VM running in AWS, take the last bit of changes that you have to the database, ship it to on-prem, and on-prem now, you know, configure the target VM and start bringing it up. So let's go and look at AWS and refresh that screen. And you should see, okay, so the SQL Server is now stopping. So that means it has quiesced and is stopping the VM there. If you go back and look at the migration plan that we had, it says it's completed. So it has actually migrated all the data to the on-prem side. Go here on-prem, you see the production SQL Server is running already. I can click launch console, and let's see. The Windows VM is already booting up. >> So essentially what Vinny just showed was a live cutover of an AWS VM to Nutanix on-premises. >> Yeah, and that's what we have done. (audience clapping) So essentially, this is about making two things possible: making it simple to migrate from cloud to on-prem, and making it painless, so that the downtime you have is very minimal. >> Got it, great job, Vinny. I won't forget your name again. So last step. So to really talk about this, one of our favorite partners and customers has been in the cloud environment for a long time. And you know Jason, who's the CTO of Cyxtera. And he'll introduce who Cyxtera is. Most of you guys are probably using their assets without knowing, you know, the new name. But Jason is someone that was in the cloud before it was called cloud, as one of the original founders and technologists behind Terremark, and then later as one of the chief architects of VMware's cloud. And then they started this new company about a year or so ago, which I'll let Jason talk about. This journey that he's going to talk about is how a partner slash customer is working with us to deliver net new transformations around the traditional industry of colo. Okay, to talk more about it, Jason, why don't you come up on stage, man? (rock music) Thank you, sir. All right, so Cyxtera, obviously a lot of people don't know the name. Maybe just give a 10 second summary of why you're so big already. >> Sure, so Cyxtera was formed, as you said, about a year ago through the acquisition of the CenturyLink data centers. >> Sunil: Which includes Savvis and a whole bunch of other assets. >> Yeah, there's a long history of those data centers, but we have all of them now, as well as the software companies owned by Medina Capital. So we're like the world's biggest startup now. So we have over 50 data centers around the world, about 3,500 customers, and a portfolio of security and analytics software. >> Sunil: Got it, and so you have this strategy of what we're calling revolutionizing colo to deliver a cloud-based-- >> Yeah so, colo hasn't really changed a lot in the last 20 years. And to be fair, a lot of what happens in data centers has to have a person physically go and do it. But there are some things that we can simplify and automate. So we want to make things more software driven, so that's what we're doing with the Cyxtera extensible data center, or CXD. And to do that, we're deploying software defined networks in our facilities and developing automations so customers can go and provision data center services and the network connectivity through a portal or through REST APIs. >> Got it, and what's different now?
I know there's a whole bunch of benefits with the integrated platform that one would not get in the traditional kind of on demand data center environment. >> Sure. So one of the first services we're launching on CXD is compute on demand, and it's powered by Nutanix. And we had to pick an HCI partner to launch with. And we looked at players in the space. And as you mentioned, there's actually a lot of them, more than I thought. And we had a lot of conversations, did a lot of testing in the lab, and Nutanix really stood out as the best choice. You know, Nutanix has a lot of focus on things like ease of deployment. So it's very simple for us to automate deploying compute for customers. So we can use Foundation APIs to go configure the servers, and then we turn those over to the customer, which they can then manage through Prism. And something important to keep in mind here is that, you know, this isn't a managed service. This isn't infrastructure as a service. The customer has complete control over the Nutanix platform. So we're turning that over to them. It's connected to their network. They're using their IP addresses, you know, their tools and processes to operate this. So it was really important for the platform we picked to have a really good self-service story for things like, you know, lifecycle management. So with one-click upgrade, customers have total control over patches and upgrades. They don't have to call us to do it. You know, they can drive that themselves. >> Got it. Any other final words around, like, what do you see for the partnership going forward? >> Well, you know, I think this would be a great platform for Xi, so I think we should probably talk about that. >> Yeah, yeah, we should talk about that separately. Thanks a lot, Jason. >> Thanks. >> All right, man. (audience clapping) So as we look at the full journey now, obviously from invisible infrastructure to invisible clouds, you know, there is one thing to take away beyond the many updates that we've had so far. And the fact is that everything that I've talked about so far is about completing a full blown, true IaaS stack, all the way from compute to storage, to virtualization, containers, to network services, and so forth. But every public cloud, a true cloud in that sense, has a full blown layer of services that sits on top, either for traditional workloads or for new workloads, whether it be machine learning, whether it be big data, you name it, right? And in the enterprise, if you think about it, many of these services are being provisioned or provided through a bunch of our partners. Like we have partnerships with Cloudera for big data and so forth. But then, based on some customer feedback and a lot of attention from what we've seen in the industry, just like AWS, and GCP, and Azure, it's time for Nutanix to have an opinionated view of the PaaS stack. It's time for us to kind of move up the stack with our own offering that obviously adds value but brings some of our core competencies in data and takes it to the next level. And it's in that sense that we're actually launching Nutanix Era to simplify one of the hardest problems in enterprise IT. Short of saving you from true Oracle licensing, it solves various other Oracle problems, which is about truly simplifying databases, much like what RDS did on AWS. Imagine enterprise RDS on demand, where you can provision and lifecycle manage your database with one click.
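Before the demo itself, it helps to picture what "enterprise RDS on demand" could look like as a single request. The sketch below is a guess at the shape of such a call, not Era's real interface; every endpoint, field, and profile name here is an assumption made up for illustration.

```python
# Hypothetical sketch: an Era-like "one-click" database provisioning
# request. Endpoints and fields are invented, not the real product API.
import requests

BASE = "https://era.example.com/api/v1"   # placeholder endpoint
AUTH = ("dba_admin", "secret")            # placeholder credentials

# One request standing in for the four-step wizard shown in the demo:
# engine, HA topology, placement, and admin-owned profiles.
resp = requests.post(f"{BASE}/databases", auth=AUTH, json={
    "name": "sales",
    "engine": "oracle",
    "topology": {"type": "rac", "nodes": 2},   # highly available
    "placement": {"service": "new"},           # or an existing service
    "profiles": {                              # standards set by admins
        "compute": "medium",
        "network": "prod-vlan",
        "software": "oracle-12.1-current-psu",
    },
    "time_machine": {                          # data protection SLA
        "continuous_days": 30,
        "daily_days": 30,
    },
})
resp.raise_for_status()
print("provisioning task:", resp.json()["task_id"])
```

Note how the caller states what it wants (engine, availability, SLA) rather than how to build it; the profiles encode the standards that the DBAs and administrators own.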
And to talk about this powerful new functionality, let me invite Bala and John on stage to give you one final demo. (rock music) Good to see you guys. >> Yep, thank you. >> All right, so we've got lots of folks here. They're all anxious to get to the next level. So this demo, really rock it. So what are we going to talk about? We're going to start with say maybe some database provisioning? Do you want to set it up? >> We have one dream, Sunil, one single dream to pass you off; that is, what Nutanix is today for IT apps, we want to recreate that magic for devops and get back those weekends and freedom to DBAs. >> Got it. Let's start with, what, provisioning? >> Bala: Yep, John. >> Yeah, we're going to get into provisioning. So provisioning databases inside the enterprise is a significant undertaking that usually involves a myriad of resources and could take days. It doesn't get any easier after that, with the long-term maintenance of things like upgrades and environment refreshes and so on. Bala and team have been working on this challenge for quite a while now. So we've architected Nutanix Era to cater to these enterprise use cases and make it one-click like you said. And Bala and I are so excited to finally show this to the world. We think it's actually one of Nutanix's best kept secrets. >> Got it, all right man, let's take a look at it. >> So we're going to be provisioning a sales database today. It's a four-step workflow. The first part is choosing our database engine. And since it's our sales database, we want it to be highly available. So we'll do a two node RAC configuration. From there, it asks us where we want to land this service. We can either land it on an existing service that's already been provisioned, or if we're starting net new or for whatever reason, we can create a new service for it. The key thing here is we're not asking anybody how to do the work, we're asking what work you want done. And the other key thing here is we've architected this concept called profiles. So you tell us how many resources you need as well as what network type you want and what software revision you want. This is actually controlled by the admins, so DBAs, and compute administrators, and network administrators can set their standards without the requester having to be a DBA. >> Sunil: Got it, okay, let's take a look. >> John: So if we go to the next piece here, it's going to personalize their database. The key thing here, again, is that we're not asking you how many data files you want or anything in that regard. So we're going to be provisioning this to Nutanix's best practices. And the key thing there is, just like these PaaS services, you don't have to read dozens of pages of best practice guides, it just does what's best for the platform. >> Sunil: Got it. And so these are a multitude of provisioning steps that normally one would take I guess hours if not days to provision an Oracle RAC database. >> John: Yeah, across multiple teams too. So if you think about the lifecycle, especially if you have onshore and offshore resources, I mean this might even be longer than days. >> Sunil: Got it. And then there are a few steps here, and we'll lead into potentially the Time Machine construct too? >> John: Yeah, so since this is a critical database, we want data protection. So we're going to be delivering that through a feature called Time Machines.
We'll leave this at the defaults for now, but the key thing to note here is we've got SLAs that deliver both continuous data protection as well as telescoping checkpoints for historical recovery. >> Sunil: Got it. So that's provisioning. We've kicked off Oracle, what, a two node database and so forth? >> John: Yep, a two node database. So we've got a handful of tasks that this is going to automate. We'll check back in in a few minutes. >> Got it. Why don't we talk about the other aspects then, Bala. Maybe around, one of the things that, you know, and I know many of you guys have seen this, is that if you look at databases, especially Oracle, but in general even SQL and so forth, if you really simplified it for a developer, it should be as simple as I copy my production database, and I paste it to create my own dev instance. And whenever I need it, I need to obviously do it the opposite way, right? So that was the goal that we set ahead of us to actually deliver this new PaaS service around Era for our customers. So you want to talk a little bit more about it? >> Sure Sunil. If you look at most of the data management functionality, it's pretty much flavors of copy paste operations on database entities. But the trouble is the seemingly simple, innocuous operations of our daily lives become the most dreaded, complex, long running, error prone operations in the data center. So we actually planned to tame this complexity and bring consumer grade simplicity to these operations, and also make these clones extremely efficient without compromising the quality of service. And the best part is, the customers can enjoy these services not only for databases running on Nutanix, but also for databases running on third party systems. >> Got it. So let's take a look at this functionality of, I guess, snapshotting, clone and recovery that you've now built into the product. >> Right. So now if you see, the core feature of this whole product is something we call Time Machine. Time Machine lets the database administrators actually capture the database state down to the granularity of seconds, and also lets them create clones, refresh them to any point in time, and also recover the databases if the databases are running on the same Nutanix platform. Let's take a look at the demo with the Time Machine. So here is our customer relationship management database, which is about 2.3 terabytes. If you see, the Time Machine has been active about four months, and the SLA has been set for continuous data protection of 30 days, and then it slowly tapers off to 30 days of daily backups and weekly backups and so on, so forth. On the right hand side, you will see different colors. The green color is pretty much your continuous data protection window, as we call it. That lets you go back to any point in time to the granularity of seconds within those 30 days. And then the discrete checkpoints let you go back to any snapshot of the backup that is maintained there kind of stuff. In a way, you see this Time Machine is pretty much like your modern day car with self driving ability. All you need to do is set the goals, and the Time Machine will do whatever is needed to reach up to the goal kind of stuff. >> Sunil: So why don't we quickly do a snapshot? >> Bala: Yeah, sometimes you need to create a snapshot for backup purposes, so Time Machine has manual controls. All you need to do is give it a snapshot name.
And then you have the ability to actually persist this snapshot data into a third party or object store so that your durability and global data access requirements are met kind of stuff. So we kick off a snapshot operation. Let's look at what it is doing. If you see what the snapshot operation is going through, there is a step called quiescing the databases. Basically, we're using application-centric APIs, and here it's actually RMAN of Oracle. We are using the RMAN of Oracle to quiesce the database and perform application consistent storage snapshots with Nutanix technology. Basically we are fusing application-centric APIs with the Nutanix platform. Just for a data point, if you have to use traditional technology and create a backup for this kind of size, it takes over four to six hours, whereas on Nutanix it's going to be a matter of seconds. So it almost looks like the snapshot is done. This is a fully consistent backup. You can pretty much use it for database restore kind of stuff. Maybe we'll do a clone demo and see how it goes. >> John: Yeah, let's go check it out. >> Bala: So for clone, again through the simplicity of a command-Z command, all you need to do is pick the time of your choice, maybe around three o'clock in the morning today. >> John: Yeah, let's go with 3:02. >> Bala: 3:02, okay. >> John: Yeah, why not? >> Bala: You select the time, and all you need to do is click on the clone. And most of the inputs that are needed for the clone process will be defaulted intelligently by us, right? And you have to make two choices, that is: do you want this clone to be created on a brand new database server VM, or do you want to place it on your existing server? So we'll go with a brand new server, and then all you need to do is just give the password for your new clone database, and then clone it kind of stuff. >> Sunil: And this is an example of personalizing the database so a developer can do that. >> Bala: Right. So here is the clone kicking in. And what this is trying to do is actually it's creating a database VM and then registering the database, restoring the snapshot, and then recovering the logs up to three o'clock in the morning like we just saw, and then actually giving back the database to the requester kind of stuff. >> Maybe one final thing, John. Do you want to show us the provisioned database that we kicked off? >> Yeah, it looks like it just finished a few seconds ago. So you can see all the tasks that we were talking about here before, from creating the virtual infrastructure, and provisioning the database infrastructure, and configuring data protection. So I can go access this database now. >> Again, just to highlight this, guys. What we just showed you is an Oracle two node instance provisioned live in a few minutes on Nutanix. And this is something that even in a public cloud, when you go to RDS on AWS or anything like that, you still can't provision Oracle RAC by the way, right? But that's what you've seen now, and that's what the power of Nutanix Era is. Okay, all right? >> Thank you. >> Thanks. (audience clapping) >> And one final thing around, obviously, when we're building this, it's built as a PaaS service. It's not meant just for operational benefits. And so one of the core design principles has been around being API first. You want to show that a little bit? >> Absolutely, Sunil, this whole product is built on an API first architecture.
Pretty much what we have seen today and all the functionality that we've been able to show today, everything is built on REST APIs, and you can pretty much integrate with a ServiceNow architecture and deliver that devops experience for your customers. We do have a plan for a full fledged self-service portal eventually, and then to make it a proper service. >> Got it, great job, Bala. >> Thank you. >> Thanks, John. Good stuff, man. >> Thanks. >> All right. (audience clapping) So with Nutanix Era being this one-click provisioning, lifecycle management powered by APIs, I think what we're going to see is that a lot of the products that we've talked about so far, while I've talked about things like Calm, Flow, and AHV functionality that have all been released in 5.5, 5.6, a bunch of the other stuff is also coming shortly. So I would strongly encourage you guys: most of these products that we've talked about, in fact, all of the products that we've talked about, are going to be in the breakout sessions. We're going to go deep into them in the demos as well as in the pods. So spend some quality time not just on the stuff that's been shipping but also the stuff that's coming out. And so one thing to keep in mind, as a sort of takeaway, is that we're doing all of this obviously with freedom as the goal. But from the products side, it has to be driven by choice, whether the choice is based on platforms, it's based on hypervisors, whether it's based on consumption models, and eventually, even though we're starting with the management plane, we'll go to the data plane of how do I actually provide a multi-cloud choice as well. And so when we wrap things up, and we look at the five freedoms that Ben talked about, don't forget the sixth freedom, especially after six to seven p.m., where the whole goal, as a Nutanix family and extended family, is to make sure we mix it up. Okay, thank you so much, and we'll see you around. (audience clapping) >> PA Announcer: Ladies and gentlemen, this concludes our morning keynote session. Breakouts will begin in 15 minutes. ♪ To do what I want ♪
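Bala's closing point, that everything shown is built on REST APIs, suggests how a devops pipeline could consume Time Machine without touching the UI. Again, a hedged sketch: the endpoints, payloads, and identifiers below are hypothetical stand-ins, not the product's documented API.

```python
# Hypothetical sketch of driving snapshot and point-in-time clone
# operations over REST; endpoint paths and fields are assumptions.
import requests
from datetime import datetime

BASE = "https://era.example.com/api/v1"   # placeholder endpoint
AUTH = ("dba_admin", "secret")            # placeholder credentials
DB = "sales-crm"                          # made-up database identifier

# Manual snapshot, as in the demo's "manual controls".
snap = requests.post(f"{BASE}/databases/{DB}/snapshots", auth=AUTH,
                     json={"name": "pre-release-backup"})
snap.raise_for_status()

# Clone to an arbitrary point in time, say 3:02 this morning.
clone = requests.post(f"{BASE}/databases/{DB}/clones", auth=AUTH, json={
    "point_in_time": datetime(2018, 5, 9, 3, 2, 0).isoformat(),
    "target": {"new_vm": True},           # brand new database server VM
    "password": "dev-clone-password",     # set on the new clone
})
clone.raise_for_status()
print("clone task:", clone.json()["task_id"])
```

The same two calls, wired into a CI pipeline or a ServiceNow workflow, are what would turn the copy-paste analogy from the demo into a self-service capability for developers.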

Published Date : May 9 2018

Kunal Agarwal, Unravel Data | Big Data SV 2018


 

>> Announcer: Live from San Jose, it's theCube! Presenting Big Data: Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. (techno music) >> Welcome back to theCube. We are live on our first day of coverage at our event BigDataSV. I am Lisa Martin with my co-host George Gilbert. We are at this really cool venue in downtown San Jose. We invite you to come by today, tonight for our cocktail party. It's called Forager Tasting Room and Eatery. Tasty stuff, really, really good. We are down the street from the Strata Data Conference, and we're excited to welcome to theCube a first-time guest, Kunal Agarwal, the CEO of Unravel Data. Kunal, welcome to theCube. >> Thank you so much for having me. >> So, I'm a marketing girl. I love the name Unravel Data. (Kunal laughs) >> Thank you. >> Two year old company. Tell us a bit about what you guys do and why that name... What's the implication there with respect to big data? >> Yeah, we are an application performance management company. And big data applications are just very complex. And the name Unravel is all about unraveling the mysteries of big data and understanding why things are not performing well, and not really needing a PhD to do so. We're simplifying application performance management for the big data stack. >> Lisa: Excellent. >> So, so, um, you know, one of the things that a lot of people are talking about with Hadoop, originally it was this cauldron of innovation. Because we had the "let a thousand flowers bloom" in terms of all the Apache projects. But then once we tried to get it into operation, we discovered there's a... >> Kunal: There's a lot of problems. (Kunal laughs) >> There's an overhead, there's a downside to it. >> Maybe tell us, tell us why you need to know how people have done this many, many times. >> Yeah. >> How you need to learn from experience and then how you can apply that even in an environment where someone hasn't been doing it for that long. >> Right. So, if I back up a little bit. Big data is powerful, right? It's giving companies an advantage that they never had, and data's an asset to all of these different companies. Now they're running everything from BI, machine learning, artificial intelligence, IoT, streaming applications on top of it for various reasons. Maybe it is to create a new product, to understand the customers better, etc. But as you rightly pointed out, when you start to implement all of these different applications and jobs, it's very, very hard. It's because big data is very complex. With that great power comes a lot of complexity, and what we started to see is a lot of companies, while they want to create these applications and provide that differentiation to their company, they just don't have enough expertise in house to go and write good applications, maintain these applications, and even manage the underlying infrastructure and cluster that all these applications are running on. So we took it upon ourselves where we thought, Hey, if we simplify application performance management and if we simplify ongoing management challenges, then these companies would run more big data applications, they would be able to expand their use cases, and not really be fearful of, Hey, we don't know how to go and solve these problems. Can we actually rely on a system that is so complex and new?
And that's the gap that Unravel fills, which is we monitor and manage not only one component of the big data ecosystem but, like you pointed out, it's a, it's a full zoo of all of these systems. You have Hadoop, and you have Spark, and you have Kafka for data ingestion. You may have some NoSQL systems and newer MPP platforms as well. So the vision of Unravel is really to be that one place where you can come in and understand what's happening with your applications and your system overall, and be able to resolve those problems in an automatic, simple way. >> So, all right, let's start at the concrete level of what a developer might get out of >> Kunal: Right. >> something that's wrapped in Unravel, and then tell us what the administrator experiences. >> Kunal: Absolutely. So if you are a big data developer, you've gotten a business requirement that, Hey, go and make this application that understands our customers better, right? They may choose a tool of their liking, maybe Hive, maybe Spark, maybe Kafka for data ingestion. And what they'll do is they'll write an app first in dev, in their dev environment or the QA environment. And they'll say, Hey, maybe this application is failing, or maybe this application is not performing as fast as I want it to, or even worse, this application is starting to hog a lot of resources, which may slow down my other applications. Now, to understand what's causing these kinds of problems, today developers really need a PhD to go and decipher them. They have to look at tons of raw logs, metrics, configuration settings, and then try to stitch the story up in their head, trying to figure out what is the effect, what is the cause? Maybe it's this problem, maybe it's some other problem. And then do trial and error to try, you know, to solve that particular issue. Now what we've seen is big data developers come in a variety of flavors. You have the hardcore developers who truly understand Spark and Hadoop and everything, but then 80% of the people submitting these applications are data scientists or business analysts, who may understand SQL, who may know Python, but don't necessarily know what distributed computing and parallel processing and all of these things really are, and where inefficiencies and problems can really lie. So we give them this one view, which will connect all of these different data sources and then tell them in plain English, this is the problem, this is why this problem happened, and this is how you can go and resolve it, thereby getting them unstuck and making it very simple for them to go in and get the performance that they're expecting. >> So, these, these, um, they're the developers up front and you're giving them a whole new, sort of, toolchain or environment to solve the operational issues. >> Kunal: Right. >> So that, if it's DevOps, it's really dev that's much more self-sufficient. >> Yes, yes. I mean, all companies want to run fast. They don't want to be slowed down. If you have a problem today, you'll file a ticket, it'll go to the operations team, you wait a couple of days to get some more information back. That just means your business has slowed down. If things are simple enough where the application developers themselves can resolve a lot of these issues, that'll get the business unstuck and get them moving on further. Now, to the other point which you were asking, which is what about the operations and the app support people?
So, Unravel's a great tool for them too because that helps them see what's happening holistically in the cluster. How are other applications behaving with each other? It's usually a multitenant, multiapplication environment that these big data jobs are running on. So, is my apps slowing down George's apps? Am I stealing resources from your applications? More so, not just about an individual application issue itself. So Unravel will give you visibility into each app, as well as the overall cluster to help you understand cluster-wide problems. >> Love to get at, maybe peel apart your target audience a little bit. You talked about DevOps. But also the business analysts, data scientists, and we talk about big data. Data is, has such tremendous power to fuel a company and, you know, like you said use it to deliver and, create and deliver new products. Are you talking with multiple audiences within a company? Do you start at DevOps and they bring in their peers? Or do you actually start, maybe, at the Chief Data Officer level? What's that kind of entrance for Unravel? >> So the word I use to describe this is DataOps, instead of DevOps, right? So in the older world you had developers, and you had operations people. Over here you have a data team and operations people, and that data team can comprise of the developers, the data scientists, the business analysts, etc., as well. But you're right. Although we first target the operations role because they have to manage and monitor the system and make sure everything is running like a well-oiled machine, they are now spreading it out to be end-users, meaning the developers themselves saying, "Don't come to me for every problem. "Look at Unravel, try solve it here, "and if you cannot, then come to me." This is all, again, improving agility within the company, making sure that people have the necessary tools and insights to carry on with their day. >> Sounds like an enabler, >> Yeah, absolutely. >> That operations would push down to the DevOp, the developers themselves. >> And even the managers and the CDOs, for example, they want to see their ROI that they're getting from their big data investments. They want to see, they have put in these millions of dollars, have got an infrastructure and these services set up, but how are we actually moving the needle forward? Are there any applications that we're actually putting in business, and is that driving any business value? So we will be able to give them a very nice dashboard helping them understand what kind of throughput are you getting from your system, how many applications were you able to develop last week and onboard to your production environment? And what's the rate of innovation that's really happening inside your company on those big data ecosystems? >> It sort of brings up an interesting question on two prongs. One is the well-known, but inexact number about how many big data projects, >> Kunal: Yeah, yeah. >> I don't know whether they fail or didn't pay off. So there's going in and saying, "Hey, we can help you manage this "because it was too complicated." But then there's also the, all the folks who decided, "Well, we really don't want "to run it all on-prem. "We're not going to throw away everything we did there, "but we're going to also put a lot of new investment >> Kunal: Exactly, exactly. >> in the cloud. Now, Wikibon has a term for that, which true private cloud, which is when you have the operational processes that you use in the public cloud and you can apply them on-prem. >> Right. 
>> George: But there's not many products that help you do that. How can Unravel work...? >> Kunal: That's a very good question, George. We're seeing the world move more and more to a cloud environment, or I should say an on-demand environment, where you're not so bothered about the infrastructure and the services, but you want Spark as a dial tone. You want Kafka as a dial tone. You want a machine-learning platform as a dial tone. You want to come in there, you want to put in your data, and you want to just start running it. Unravel has been designed from the ground up to monitor and manage any of these environments. So, Unravel can solve problems for your applications running on-premise, and similarly for all the applications that are running in the cloud. Now, in the cloud there are other levels of problems as well. So, of course, you'd have applications that are slow, applications that are failing; we can solve those problems. But if you look at a cloud environment, a lot of these now provide you an autoscaling capability, meaning, Hey, if this app doesn't run in the amount of time we were hoping it would, let's add extra hardware and run this application. Well, if you just keep throwing machines at the problem, it's not going to solve your issue. The time it takes doesn't decrease linearly with how many servers you're actually throwing in there. So what we can help companies understand is, what is the resource requirement of a particular application? How should we be intelligently allocating resources to make sure that you're able to meet your time SLAs, your constraints of, here, I need to finish this within x number of minutes, but at the same time be intelligent about how much cost you're spending over there? Do you actually need 500 containers to go and run this app? Well, you may have needed 200. How do you know that? So, Unravel will also help you get efficient with your runs, not just faster. Can it be a good multitenant citizen? Can it use limited resources to actually run these applications as well? >> So, Kunal, some of the things I'm hearing from a customer's standpoint that are potential positive business outcomes are internal: performance boost. >> Kunal: Yeah. >> It also sounds like, sort of, productivity improvements internally. >> And then also the opportunity to have the insight to deliver new products, but even, I'm thinking of, you know, helping a retailer, for example, be able to do more targeted marketing. So >> the business outcomes and the impact that Unravel can make really seem to have pretty strong internal and external benefits. >> Kunal: Yes. >> Is there a favorite customer story, (Kunal laughs) you don't have to mention names, that you really think speaks to your capabilities? >> So, 100%. Improving performance is a very big factor of what Unravel can do. Decreasing costs by improving productivity, by limiting the amount of resources that you're using, is a very, very big factor. Now, amongst all of these companies that we work with, one key factor is improving reliability, which means, Hey, it's fine that you can speed up this application, but I know the latency that I expect from an app; maybe it's a second, maybe it's a minute, depending on the type of application. What businesses cannot tolerate is this app taking five x more time today. If it's going to finish in a minute, tell me it'll finish in a minute and make sure it finishes in a minute.
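To make that right-sizing point concrete, here is a minimal sketch of picking the smallest container count that still meets a time SLA. The scaling model, names, and numbers are all hypothetical; Unravel's actual models are learned from telemetry and are not public.

# Hypothetical sketch: choose the smallest container fleet that meets a time SLA.
# Assumes a simple Amdahl-style scaling model fit from past runs; a real system
# would learn these parameters from historical application telemetry.

def predicted_runtime_minutes(containers, serial_min=2.0, parallel_min=400.0):
    # Runtime = fixed serial portion + parallel portion split across containers.
    return serial_min + parallel_min / containers

def smallest_fleet_meeting_sla(sla_minutes, max_containers=500):
    for n in range(1, max_containers + 1):
        if predicted_runtime_minutes(n) <= sla_minutes:
            return n
    return None  # SLA unreachable: adding hardware alone will not fix it

print(smallest_fleet_meeting_sla(sla_minutes=4.0))
# -> 200: the app meets its SLA with 200 containers, where naive autoscaling
#    might have thrown 500 at it.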
And this is a big use case for all of the big data vendors, because a lot of the customers are moving from Teradata, or from Vertica, or from other relational databases, onto Hortonworks or Cloudera or Amazon EMR. Why? Because it's one tenth the cost of running these workloads. But then the customers get frustrated and say, "I don't mind paying 10x more money, because over there it used to work. Over here, there are just so many complications, and I don't have reliability with these applications." So that's a big, big factor of, you know, how we actually help these customers get value out of the Unravel product. >> Okay, so, um... A question I'm, sort of... why aren't there many other Unravels? >> Kunal: Yeah. (Kunal laughs) >> From what I understood from past conversations, >> Kunal: Yeah. >> you can only really build the models that are at the heart of your capabilities based on tons and tons of telemetry >> Kunal: Yeah. >> that cloud providers or, sort of, internet-scale service providers have accumulated, in that they all have sort of a well-known set of configurations and a well-known kind of topology. In other words, there are not a million degrees of freedom on any particular side; you have a well-scoped problem, and you have tons of data. So it's easier to build the models. So who else could do this? >> Yeah, so the difference between Unravel and other monitoring products is that Unravel is not a monitoring product. It's an intelligent performance management suite. What that means is we don't just give you graphs and metrics and say, "Here is all the raw information, you go figure it out." Instead, we take it a step further, where we are actually giving people answers. In order to develop something like that, you need full-stack information; that's number one. Meaning information from applications all the way down to infrastructure and everything in between. Why? Because problems can lie anywhere. And if you don't have that full-stack info, you're blinding yourself, or limiting the scope of the problems that you can actually search for. Secondly, like you were rightly pointing out, how do I create answers from all this raw data? So you have to think like an expert in big data would think, which is, if there is a problem, what are the kinds of checks, balances, and places that that person would look into, and how would that person establish that this is indeed the root cause of the problem today? And then, how would that person actually resolve this particular problem? So, we have a big team of scientists and researchers. In fact, my co-founder is a professor of computer science at Duke University who has been researching database optimization techniques for the last decade. We have about 80-plus publications in this area, Starfish being one of them. We have a bunch of other publications which talk about how you automate problem discovery, root cause analysis, as well as resolution, to get the best performance out of these different databases. And you're right. A lot of work has gone on the research side, but a lot of work has also gone into understanding the needs of the customers. So we worked with some of the biggest companies out there, which have some of the biggest big data clusters, to learn from them: what are some everyday, ongoing management challenges that you face? And then we took those problems to our datasets and figured out, how can we automate problem discovery? How can we proactively spot a lot of these errors?
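As a toy illustration of those expert checks, here is a minimal sketch that encodes a few symptom-to-cause rules over full-stack telemetry. The rules, metric names, and thresholds are invented for the example; Unravel's actual rule base and models are proprietary.

# Hypothetical sketch: expert checks as rules over full-stack telemetry.
# Each rule maps a symptom to a plain-English cause and a suggested fix.

RULES = [
    (lambda m: m["failed_tasks"] > 0 and m["executor_memory_gb"] < 2,
     "Tasks failed under memory pressure.",
     "Increase executor memory or reduce partition size."),
    (lambda m: m["max_task_sec"] > 10 * m["median_task_sec"],
     "Severe data skew: a few tasks dominate the runtime.",
     "Repartition on a higher-cardinality key."),
    (lambda m: m["pending_containers"] > 0,
     "Queue contention: the app is waiting on cluster resources.",
     "Lower parallelism or schedule the job off-peak."),
]

def diagnose(metrics):
    return [(cause, fix) for check, cause, fix in RULES if check(metrics)]

run = {"failed_tasks": 0, "executor_memory_gb": 8,
       "max_task_sec": 900, "median_task_sec": 30, "pending_containers": 0}
for cause, fix in diagnose(run):
    print(cause, "->", fix)  # flags the data-skew rule for this run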
I joke around and I tell people that we're big data for big data. Right? All these companies that we serve, they are gathering all of this data, and they're trying to find patterns, and they're trying to find, you know, some sort of insight in their data. Our data is system-generated data, performance data, application data, and we're doing the exact same thing, which is figuring out inefficiencies, problems, and the cause and effect of things, to be able to solve them in a more intelligent, smart way. >> Well, Kunal, thank you so much for stopping by theCUBE >> Kunal: Of course. >> and sharing how Unravel Data is helping to unravel the complexities of big data. (Kunal laughs) >> Thank you so much. Really appreciate it. >> Now you're a CUBE alumni. (Kunal laughs) >> Absolutely. Thanks so much for having me. >> Kunal, thanks. >> Yeah, and we want to thank you for watching theCUBE. I'm Lisa Martin with George Gilbert. We are live at our own event, BigData SV, in downtown San Jose, California. Stick around, George and I will be right back with our next guest. (quiet crowd noise) (techno music)

Published Date : Mar 8 2018


Hitesh Sheth, Vectra | CUBE Conversation, Feb 2018


 

(triumphant music) >> Hello and welcome to a special CUBE Conversation, exclusive content here in Palo Alto Studios. I'm John Furrier, the co-founder of SiliconANGLE Media, and cohost of theCUBE. We have exclusive news with Vectra Networks announcing new funding and a new R and D facility. I'm here with Hitesh Sheth, the president and CEO. Welcome to theCUBE Conversation, congratulations. >> Thank you, John. Glad to be here. >> So you've got some big news. >> Vectra Networks, you guys are doing some pretty cool stuff with AI and cyber. >> Correct. >> But it's not just software, it's really kind of changing the game with IT operations, the entire Cloud movement, DevOps automation, all impacting the enterprise. >> Hitesh: Yes. >> And other companies. >> Hitesh: Yes. >> Before we dig into some of the exclusive news you guys have, take a minute to talk about, what is Vectra? What is Vectra Networks? >> Maybe it'd be useful to give you context on the way we see the security industry evolving. If you think about the last 20 years, and if you were to speak to the security person in an enterprise, their primary concern would be around access management, who gets in, who gets out. The firewall industry was born to solve this problem. And you know, in many ways it's been a gift that's kept on giving. You've got companies with multi-billion dollar valuations, Palo Alto, Checkpoint, Fortinet, a piece of Cisco, etc., right? There's roughly about 40 billion dollars of market cap sitting in this industry today. Now, if you go back to the same enterprise today, and you look at the next 5-10 years and you ask them, "What is the number one issue that you care about?" Right? It's no longer who's getting in and out from an access policy standpoint, it's all about threat management and mitigation. So, the threat signal is now the most important commodity inside the enterprise, and the pervasive challenge for the enterprise customer is, "How do I get my hands on this threat signal in the most efficient way possible?" And we at Vectra are all about automating and helping our customers hunt for advanced cyber attacks using artificial intelligence. >> Where did you get the idea for AI and automation? I've always said in theCUBE, "Oh, AI's a bunch of b.s.," because real, true AI isn't there yet. But again, AI is really kind of growing out of machine learning. >> Hitesh: Right. >> Automation, and so it's kind of a loose definition, but certainly very sexy right now. People love AI. >> Hitesh: Correct. >> I mean, AI is awesome. But as a practical matter, it seems to be very important for good things, also for the enterprise. Where'd you get the idea for using AI for cyber? >> Well, you know, my journey intersected with the notion of using AI for cyber security back in about 2010, when there were major cyber events reported in the press. At that time, I was in the networking sector, and in the networking sector, we all looked at it and said, "You know, we can do something about this," and being good networking companies, we thought we would build chips that would do DPI and do packet inspection. It was, to be blunt, old school thinking, okay? Fast forward to 2012, and I was sitting with Vinod Khosla of Khosla Ventures and we were talking about the notion of security. How can you transform security dramatically? >> Mhmm. >> Hitesh: And this is when we started talking about using artificial intelligence.
It was very nascent and frankly, if you went up and down Sand Hill at that time, most of the venture companies, and they did, because we were raising money at the time, would look at us and say, "You guys are nuts. This is just not going to happen. It's very experimental, it would take forever to come to pass." But that's usually the best time to go and build a new business and take a risk, right? And we said, you know what, AI has matured enough. >> By the way, at that time, they were also poo-pooing the Cloud. >> Absolutely. >> Amazon will be nothing. >> Yeah, exactly. Generally, a good time to go and do something revolutionary. But here are the other things to know. Not only had the technology around AI and its applicability advanced enough, but two other things had happened at the same time. The cost of compute had changed dramatically. The cost of storage had changed dramatically. And ultimately, if AI is going to be efficient, not only has the software got to be good, but the compute has got to be there as well, and the storage has got to be there as well. These three things were really coming together in that timeframe. >> Well, what's interesting, let's dig into that for a second, because knowing what the scene was with networking at the time, you said "old thinking," but the state of the art, you know, in the 90's and 2000's was, hardware got advanced, so you had wire-speed capability. So you could do some cool things, like inspect data as it moves through the network. >> Hitesh: Correct. >> And you said DPI, packet inspection. That's the concept of looking at the data. >> Hitesh: That's correct. >> John: So, okay, they might have had a narrow view, so now you take it back >> Hitesh: Yes. >> with AI, am I getting it right? You're thinking of zooming out, saying, okay, >> Hitesh: A couple of things. >> you take that notion of inspection of data >> Right. >> with more storage, more compute >> But it comes down to also, you know, what data are you looking at, right? When you had wire-speed capabilities, you would apply your classic signature-based approaches. So you could deal with known attacks, right? What has really happened, from 2011-2012 onwards, is that the attack landscape has morphed dramatically. It changes so fast that the approach of just dealing with the known was never going to be enough. >> Yeah. >> So, how do you deal with the unknown? You need software that can learn. You need software that can adapt on the fly. And this is where machine learning comes into play. >> You got to assume everyone's a bad actor at that point. >> You got to assume everybody has been infiltrated in some way or fashion. >> Well, the Cloud, certainly. You guys were on the front end, and people probably thought you were crazy, along with other VC's, you mentioned that. But if you then go look at what that impact has been, you're on the right side of history, congratulations. What really happened? When was the sea change? You mentioned 2012. Was that because of the overall threat landscape change? Was that because of open source? Was that because of new state-sponsored threats? >> Hitesh: Yeah. A couple things. >> What was the key flash point? >> Hitesh: A couple of things.
We saw, at the time, that there was an emerging class of threats in the marketplace being sponsored by state actors, but we also saw that there was significant funding going into creating organized entities that were going to go and hack large enterprises. >> John: Not state-sponsored directly, state-sponsored, kind of, you know, >> On the side. >> Yeah, on the side. >> Let's call them "For Profit Entities," okay? >> Sounds like Equifax to me. (laughter) >> That's a good point. And we saw that happening. Trend two was, there were enough public, on-the-record hacks getting reported, right? Sony would be a really good example at the time. But just as fundamentally, it's not enough that there's a market. The technology has got to be sufficiently ready to be transformative, and this is the whole point around what we saw in compute and storage, and the fact that there was enough advancement in the machine learning itself that it was worth taking a risk and experimenting to see what was going to happen. And in our journey, I can tell you, it took us about 18 months, really, to kind of tune what we were doing, because we tried and we failed for 18 months before we came to an answer that was actually going to gel and work for the customers. >> And what's interesting is having a pattern orientation to look for the unknown, >> Hitesh: Yeah >> because, you know, in the old days it was, "Hey, here's a bunch of threats, look for 'em and be prepared to deploy." Here, you've got to deal with the unknown, the potential attack. But also I would say that we've observed the surface area has increased. So, you mentioned Checkpoint and these firewalls. >> Hitesh: Yes. Absolutely. >> Those are perimeter-based security models. So you've got a perimeter-based environment. >> Hitesh: Correct. >> Every day. >> Hitesh: And you've got IOT. >> IOT. So it's a hacker's dream. >> Absolutely. The way I like to think about it is, you've got an N-by-N permutational issue; you've got infinite possibilities. If you're a hacker, you're absolutely right, it's Nirvana. You've got endless opportunities to break into the enterprise today. And it's just going to get better. It's absolutely going to get better for them. >> John: Well, let's get to the hard news. You guys have an announcement. You've got new funding >> Hitesh: Yeah. >> and an R and D facility. In your words, what is the announcement? Share the data. >> We're really excited to announce that we have closed a round of 36 million dollars, Series D funding. It's being led by Atlantic Bridge, a growth fund with significant European roots, and in addition to Atlantic Bridge, we're bringing on board two new investors: Ireland's Strategic Investment Fund, effectively the sovereign fund of Ireland, and then secondly, Nissho Electronics of Japan. This is going to bring our total funding to 123 million dollars today. We're going to be using these funds for two things. One is the classic expansion of sales and marketing. I think we've had very significant success in our business. From 2016 to 2017, our business grew 181% year over year, all subscription revenue. So we're going to use this new fuel to drive business growth, but just as important, we're going to grow our R and D significantly. And as part of this new funding, we are opening up a brand new R & D center in Dublin, Ireland. This is our fourth R & D center.
We've got one here in San Jose, California. We've got one in Austin, Texas, and Cambridge, Massachusetts, and so this is number four. >> John: So, you've hired some really smart people. How many engineers do you guys have? >> So, we are about a 140-person company; roughly half the company is in R and D. >> I see a lot of engineering going on, and you need it, too. So let's talk about competitors. Darktrace is out there, a heavily funded company. >> Hitesh: Yes. >> They're a competitor. How do you compare against the competition, and why do you think you'll be winning? >> I can tell you, statistically, whether it is Darktrace, or Cisco, who we run into as well: we win in the large enterprise. We win 90% of the time. [Overlapping Conversation] >> That's actually correct. And I'll describe to you why it is that we win. We look at people like Darktrace, and there are other smaller players in the marketplace as well, and I'll tell you one thing that's fundamentally true about the competitive landscape and that differentiates us. AI is on everybody's lips nowadays, right? As you pointed out. But what is generally true for most companies doing AI, and I think this is true for our competition as well, is that it tends to be human-augmented AI. It's not really AI, right? This is sort of like the Wizard of Oz, you know, somebody behind the curtain actually doing the work, and that ultimately does not deliver the promise of AI and automation to the customer. The one thing we have been very - >> John: They're using AI to cover up essentially manual business models with people added, is that what you're saying? >> Hitesh: That's correct. Effectively, it's still a people-oriented answer for the customer, and if AI is really true, then automation has got to be at the forefront, and if automation is really going to be true, then the user experience of the software has got to be second to none. >> John: So, I know Mike Lynch is on the board of that company, Darktrace; he was charged with fraud over the HP deal, billions of dollars. So, is he involved? Is he a figurehead? How does he relate to that? >> I think you should talk to Mike. You should put him in this chair and have this conversation. I recommend it, that would be great. >> John: I don't think he'd come on. >> But my understanding is that he has a very heavy hand in the running of Darktrace. Darktrace, if you go to their website, and this is all public data, if you look at their management chain, this is all Autonomy people. What that means, with respect to how Autonomy was run and how Darktrace is being run, is for them to speak about. What I can tell you is that we meet them competitively, and we meet other competitors. >> John: I mean, if I'm a customer, I would have a lot of fear, uncertainty and doubt about working with an Autonomy-led team, because they had such a head fake with the HP deal and how they handled that software, and the software stack wasn't that great either. So, I mean, I would be concerned about that. [Overlapping Discussion] >> History may be repeating itself. >> Okay, so you won't answer the question. Okay, well, let's get back to Vectra. Some interesting, notable things I discovered: you guys have been observing what's been reported in the press with the Olympics. >> Hitesh: Correct. >> You have information and insight on what's going on with the Olympics. Apparently, they were hacked.
Obviously, it's in Korea, so it's Asia; there's no DNS that doesn't have certificates that have been hacked or whatever. So, I mean, what's going on in South Korea with the Olympics? What's the impact? What's the data? >> Hitesh: Well, I think what is really remarkable is that, despite the history of different kinds of attacks, Equifax, what have you, nation-state events, political elections getting impacted and so forth, once again, at a very public event, we have had a massive breach. They've been able to infiltrate the systems, and the remarkable thing is they- >> John: There's proof on this? >> There's proof on this. This is in the press. There's no secret data on our part; this is very much out there in the public arena. They have been sitting in the infrastructure of the Olympics, in Korea, for months, and the remarkable thing is, why were they able to get in? Well, I can tell you, I'm pretty sure that the approach to security that these people took is no different than the approach to security most enterprises take. Right? The thing that should really concern us all is that they chose to attack, they chose to infiltrate, but they actually paused before really fundamentally damaging the infrastructure. It goes to show you that they are demonstrating control: I can come in, I can do what I want, for as long as I want, and I can stop when I want. >> John: They were undetected. >> They were undetected. Absolutely. >> John: And they realized that these attacks reflected that. >> Absolutely. And given the fact that there seems to be a recent trend of going after public events, we have many other such public events coming to bear. >> How would you guys have helped? >> The way we would help them, most fundamentally, is this. Look, here's the fundamental reality: as we discussed just a second ago, there are infinite options to break into the infrastructure. But once you're in, right? For people like you and I, who are networking people, you're on our turf, and the things you can do inside the network are actually very visible. They're very visible, right? It's like somebody breaking through your door; once they get in, their footprints are everywhere, right? And if you had the ability to get your hands on those footprints, you could actually contain the attack as close to real time as possible, before any real damage is done. >> And that's where the action is, no doubt about it; you can actually roll that data up, and that's where the compute- >> And then you can apply machine learning. You can look at the network, extract the right data out of it, apply machine learning or AI, and you can get your hands on the attack well before it does any real damage. >> John: And so to your point, if I get this right, if I hear ya properly, computers are much stronger now. >> Hitesh: Correct. >> And with software and AI techniques, you can move on this data quickly. >> Hitesh: Correct. But you have got to have a fundamental mindset shift, which is, "I'm not just in the business of stopping attacks anymore. I should try, but I recognize I will be breached every single time. So I had better have the mechanisms and the means to catch the attack once it's in my environment." And that mindset shift is not pervasive. I am 1,000% sure that the people who designed the security at the Olympics said, "We can stop this stuff, don't worry about it." Had they thought differently, they would not be in this position today.
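A minimal sketch of that footprints idea, baselining each host's network behavior and flagging sharp deviations. The fields, numbers, and threshold are invented; Vectra's detection models are far richer than a simple z-score.

# Hypothetical sketch: flag hosts whose outbound traffic deviates sharply
# from their own learned baseline, the network footprints described above.

from statistics import mean, stdev

baseline = {  # bytes sent per hour, learned over prior weeks (made-up data)
    "host-a": [120, 140, 110, 150, 130],
    "host-b": [60, 70, 65, 80, 75],
}

def is_anomalous(host, observed_bytes, z_threshold=3.0):
    history = baseline[host]
    mu, sigma = mean(history), stdev(history)
    return (observed_bytes - mu) / sigma > z_threshold

print(is_anomalous("host-a", 135))   # False: within the host's normal range
print(is_anomalous("host-b", 5000))  # True: possible exfiltration behavior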
>> This is the problem. In all of society, whether it's a shooting at a school or an Olympic hack event, the role of data is super critical. That's the focus. Thanks for coming on and sharing the exclusive news. This is theCUBE, with exclusive coverage of the breaking news of the new round of funding for Vectra Networks. I'm John Furrier. Thanks for watching. >> Hitesh: Thank you, John. (triumphant music)

Published Date : Feb 21 2018


John Rydning, IDC | Western Digital the Next Decade of Big Data 2017


 

>> Announcer: Live from San Jose, California, it's theCUBE, covering Innovating to Fuel the Next Decade of Big Data. Brought to you by Western Digital. >> Hey, welcome back everybody. Jeff Frick here with theCUBE. We are at the Western Digital headquarters in San Jose, California. It's the Almaden campus, a historic campus that's had a lot of great innovation, especially in hard drives, for years and years. This event's called Innovating to Fuel the Next Decade of Big Data. And we're excited to have a big brain on. We like to get smart people who've been watching this story for a while and will give us a little bit of historical perspective. It's John Rydning. He is the Research Vice President for Hard Drives at IDC. John, welcome. >> Thank you, Jeff. >> Absolutely. So, what is your take on today's announcement? >> I think it's a very meaningful announcement, especially when you consider that the previous big technology announcement for the industry was helium, about four or five years ago. But really, the last big technology announcement prior to that was back in 2005, 2006, when the industry announced making this transition to what they called at that time "Perpendicular Magnetic Recording." And when that was announced, it was kind of a similar problem at that time in the industry that we have today, where the industry was just having a difficult time putting more data on each disc inside that drive. They had kind of hit this technology wall. They announced Perpendicular Magnetic Recording, and it really put them on a new S curve in terms of their ability to pack more data on each disc. And just to put it in some perspective: after they announced Perpendicular Magnetic Recording, the capacity per disc increased about 30% a year for about five years, and then over, really, a ten-year period, increased about an average of about 20% a year. And so, with today's announcement, I see a lot of parallels to that. You know, back when Perpendicular Magnetic Recording was announced, the capacity per platter was growing very slowly. That's where we are today. And with this announcement of MAMR technology, the direction that Western Digital's choosing really could put the industry on a new S curve in terms of putting more storage capacity on each one of those discs. >> It's interesting. It always reminds me of the old Microsoft and Intel battles. Right? Intel would come out with a new chip, and then Microsoft would make a bigger OS, and they'd go back and forth and back and forth. >> John: Yeah, that's right. >> And we're seeing that here, right? 'Cause the demands for the data are growing exponentially. I think one of the numbers that was thrown out earlier today is that the data thrown off by people and the data thrown off by machines is so exponentially larger than the data thrown off by business, which has been kind of the big driver of IT spend. And it's really changing. >> It's a huge fundamental shift. It really is. >> They had to do something, right? >> Yeah, the demand for storage capacity by these large data centers is just phenomenal, and yet at the same time, they don't want to just keep building new data center buildings and putting in more and more racks. They want to put more storage density in that footprint, inside that building. So that's what's really pushing the demand for these higher-capacity storage devices. They want to really increase the storage capacity per cubic meter. >> Right, right.
>> Inside these data centers. >> It's also just fascinating that our expectation is that they're going to somehow pull it off, right? Our expectation is that Moore's Law continues, things are going to get better, faster, cheaper, and bigger. But back in the back room, somebody's actually got to figure out how to do it. And as you said, we hit these kind of seminal moments where >> Yeah, that's right. >> you do get on a new S curve, and without that, it does flatten out over time. >> You know, what's interesting though, Jeff, is that really, about the time that Perpendicular Magnetic Recording was announced way back in 2005, 2006, the industry was already, at that time, talking about these energy-assist technologies like the MAMR that Western Digital announced today. And it's always been a little bit of a question for those folks that are either in the industry or watching the industry, like IDC, and maybe even more importantly for some of the HDD industry customers. They're kind of wondering, so what's really going to be the next technology racehorse that takes us to that next capacity point? And it's always been a bit of a horse race between HAMR and MAMR, and there's been this lack of clarity, kind of a huge question mark hanging over the industry, about which one it's going to be. And Western Digital certainly put a stake in the ground today that they see MAMR as that next technology for the future. >> (mumbles words) I just read a quote today, (rushes through name) a key alumni just took a new job, and he's got a pinned tweet at the top of his feed. And he says, "The smart man looks for ways to solve the problem, or looks at new solutions. The wise man really spends his time studying the problem." >> I like that. >> And it's really interesting here, 'cause it seems kind of obvious. Heat's never necessarily a good thing with electronics, and data centers, as you mentioned, are trying to get efficiency up. There's pressure as these things have become huge energy-consumption machines. That said, they're relatively efficient compared to other ways of doing that compute, and the demand for this compute continues to increase, increase, increase. >> Absolutely. >> So, as you kind of look forward, is there anything, any gems in the numbers, that maybe those of us at a layman level are missing on a first read, that we should really be paying attention to, that give us a little bit of a clue of what the future looks like? >> Well, there's a couple of major trends going on. One is that, at least for the hard drive industry, if you look back over the last ten years or so, a pretty significant percentage of the revenue that they've generated, and a pretty good percentage of the petabytes that they ship, have really gone into the PC market. And that's fundamentally shifting. So now it's really the data centers, so that by the time you get to 2020, 2021, about 60-plus percent of the petabytes that the industry's shipping are going into data centers, where if you look back a few years ago, 60% was going into PCs. That's a big, big change for the industry. And it's really that kind of change that's pushing the need for these higher-capacity hard drives. >> Jeff: Right. >> So that's, I think, one of the biggest shifts that has taken place. >> Well, the other thing that's interesting in that comment, because we know scale drives innovation better than anything, and clearly Intel microprocessors rode the PC boom to get the scale to drive the innovation.
And so if you're saying now that the biggest scale is happening in the data center, then that's a tremendous force for innovation there, versus Flash, which is really piggy-backing on the growth of mobile devices, because that's where it's getting its scale. So, when you look at kind of the Flash versus hard drive comparison, right? Obviously, Flash is the shiny new toy getting a lot of buzz over the last couple of years. Western Digital has a play across the portfolio, but the announcement earlier today said you're still going to have like this 10x cost differentiation. >> Yeah, that's right. >> Even through, I think it was 2025; I don't want to say what the numbers were, but over a long period of time. Do you see that kind of coexistence and conflict continuing between those two? Or is there a pretty clear stratification between what's going to go into Flash systems and what's going to go to hard drives? >> That's a great question. So, even in the very large hyperscale data centers, we definitely see where Flash and hard disk drives are very complementary. They're really addressing different challenges, different problems, and so I think one of the charts that we saw today at the briefing really is something that we agree with strongly at IDC. Today, maybe about 7% or 8% of all of the combined HDD-SSD petabytes shipped for enterprise are SSD petabytes. And then that grows to maybe ten. >> What was it? Like 7% you said? >> 6% to 7%. >> 6% to 7%, okay. >> Yeah, so we still have 92, 93%, 94% of all petabytes, that again are combined HDD-SSD petabytes for enterprise, those are still HDD petabytes. And even when you get out to 2020, 2021, again, still about 90%. We agree with what Western Digital talked about today: about 90% of the combined HDD-SSD petabytes that are shipping for enterprise continue to be HDD. So we do see the two technologies as very complementary. You talked about SSDs kind of getting their scale on PCs, and that's true. They really are going to quickly continue to become a bigger slice of the storage devices attached to new PCs. But in the data center, you really need that bulk storage capacity, low-cost capacity. And that's where we see that the two, SSDs and HDDs, are going to live together for a long time. >> Yeah, and as we said, the nature is complementary; the two different applications are very different. You need the big data to build the models, to run the algorithms, to do stuff. But at the same time, you need the fast data that's coming in. You need the real-time analytics to make modifications to the algorithms and learn from the algorithms. >> That's right, yeah. >> It's the two of those things together that are a one-plus-one-makes-three type of solution. >> Exactly, and especially to address latency. Everybody wants their data fast. When you type something into Google, you want your response right away. And that's where SSDs really come into play. But when you do deep searches, you're looking through a lot of data that has been collected over years, and a lot of that's probably sitting on hard disc drives. >> Yeah. The last piece of the puzzle I just want you to address before we sign off: an interesting point is that it's not just the technology story, but the ecosystem story. And I thought the most interesting part of the MAMR announcement was that it fits in the same form factor; there's no change to the OS, there's no kind of change in the ecosystem components in which you plug this in. >> Yeah, that's right.
It's just, you take out the smaller drive, the 10, or the 12, or the 14, I guess, is coming up, and plug in. They showed a picture of a 40 terabyte drive. >> Right. >> You know, that's the other part of the story that maybe doesn't get as much play as it should. You're playing in an ecosystem. You can't just come up with this completely independent, radical new thing, unless it's so radical that people are willing to swap out their existing infrastructure. >> I completely agree. It can be very difficult for the customer to figure out how to adopt some of these new technologies, and actually, the hard disk drive industry has thrown a couple of technologies at their customers over the past five, six years that have been a little challenging for them to adopt. So, one was when the industry went from native 512-byte sectors to 4K sectors. It seems like a pretty small change that you're making inside the drive, but it actually presented some big challenges for some of the enterprise customers. And even the shingled magnetic recording technologies. It was a way to get more data on the disc, and Western Digital certainly talked about that today, but for the customer trying to plug and play that into a system, SMR technology actually created some real challenges for them to figure out how to adopt. So, I agree that what was shown today about the MAMR technology is definitely plug and play. >> Alright, we'll give you the last word as people are driving away today from the headquarters. They've got a bumper sticker as to why this is so important. What's it say on the bumper sticker about MAMR? >> It says that we continue to get more capacity at a lower cost. >> (chuckles) Isn't that just always the goal? >> I agree. >> (chuckles) Alright, well thank you for stopping by and sharing your insight. Really appreciate it. >> Thanks, Jeff. >> Alright. Jeff Frick here at Western Digital. You're watching theCUBE! Thanks for watching. (futuristic beat)

Published Date : Oct 12 2017


Janet George , Western Digital | Western Digital the Next Decade of Big Data 2017


 

>> Announcer: Live from San Jose, California, it's theCUBE, covering Innovating to Fuel the Next Decade of Big Data, brought to you by Western Digital. >> Hey, welcome back everybody, Jeff Frick here with theCUBE. We're at Western Digital, at their global headquarters in San Jose, California; it's the Almaden campus. This campus has a long history of innovation, and we're excited to be here, and probably have the smartest person in the building, if not the county, area code and zip code. I love to embarrass her: Janet George, she is the Fellow and Chief Data Scientist for Western Digital. We saw you at Women in Data Science, you were just at Grace Hopper, you're everywhere, and we get a chance to sit down again. >> Thank you, Jeff, I appreciate it very much. >> So as a data scientist, today's announcement about MAMR, how does that make you feel? Why is this exciting? How is this going to make you more successful in your job and, more importantly, the areas in which you study? >> So today's announcement is actually a breakthrough announcement, both in the field of machine learning and AI, because we've been on this data journey, and we have been very selectively storing data on our storage devices, and the selection is actually coming from the preconstructed queries that we do with business data. Now we no longer have to preconstruct these queries. We can store the data at scale in raw form. We don't even have to worry about the format or the schema of the data. We can look at the schema dynamically as the data grows within the storage and within the applications. >> Right, 'cause there's been two things, right? Before, data was bad 'cause it was expensive to store. >> Yes. >> Now suddenly we want to store it, 'cause we know data is good, but even then it can still be expensive. We've got this concept of data lakes and data swamps and data oceans, pick your favorite metaphor, but we want the data 'cause we're not really sure what we're going to do with it. And I think what's interesting that you said earlier today is that it was schema on write, then we evolved to schema on read, which was all the rage at Hadoop Summit a couple of years ago, but you're talking about the whole next generation, which is an evolving, dynamic schema >> Exactly. >> based on whatever happens to drive that query at the time. >> Exactly, exactly. So as we go through this journey, we are now getting independent of schema, we are decoupled from schema, and what we are finding out is we can capture data in its raw form, and we can do the learning on the raw form, without human interference in terms of transformation of the data and assigning a schema to that data. We've got to understand the fidelity of the data, but we can train at scale from that data. So with massive amounts of training, the models already know how to train themselves from raw data. So now we are only talking about incremental learning, as the trained model goes out into the field in production and actually performs. Now we are talking about how the model learns, and this is where fast data plays a very big role.
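To ground that train-at-scale-then-learn-incrementally idea, here is a minimal sketch using scikit-learn's partial_fit interface (the loss name "log_loss" assumes scikit-learn 1.1 or newer). The data is synthetic; the pipelines described here run at far larger scale.

# Hypothetical sketch: big data trains the initial model; fast data then
# updates it incrementally, mirroring the batch-then-streaming pattern.

import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X_bulk = rng.normal(size=(10000, 4))                 # historical raw data at rest
y_bulk = (X_bulk[:, 0] + X_bulk[:, 1] > 0).astype(int)

model = SGDClassifier(loss="log_loss")
model.partial_fit(X_bulk, y_bulk, classes=[0, 1])    # initial training at scale

for _ in range(100):                                 # near-real-time stream chunks
    X_delta = rng.normal(size=(32, 4))
    y_delta = (X_delta[:, 0] + X_delta[:, 1] > 0).astype(int)
    model.partial_fit(X_delta, y_delta)              # learning on the delta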
>> So that's interesting, 'cause you talked about that also earlier in your part of the presentation, kind of the fast data versus big data, which kind of maps to the flash versus hard drive, and the two are not... it's not either-or, but it's really both, because within the storage of the big data, you build the base foundations of the models, and then you can adapt, learn and grow, change with the fast data, with the streaming data on the front end. >> Exactly. >> It's a whole new world. >> Exactly, so the fast data actually helps us after the training phase, right, and these are evolving architectures. This is part of your journey. As you come through the big data journey, you experience this. But for fast data, what we are seeing is, these architectures like Lambda and Kappa are evolving, and especially the Lambda architecture is very interesting, because it allows for batch processing of historical data, and then it allows for what we call a low-latency layer, or a speed layer, where this data can then be promoted up the stack for serving purposes. And then the Kappa architecture is where the data is being streamed near real time, bounded and unbounded streams of data. So this is again very important when we build machine learning and AI applications, because evolution is happening on the fly, learning is happening on the fly. Also, if you think about the learning, we are mimicking more and more how humans learn. We don't really learn with very large chunks of data all at once, right? That's important for initial model training and model learning, but on a regular basis, we are learning with small chunks of data that are streamed to us near real time. >> Right, learning on the delta. >> Learning on the delta. >> So what is the bounded versus the unbounded? Unpack that a little bit. What does that mean? >> So bounded is basically saying, hey, we are going to get certain amounts of data, so you're sizing the data, for example. Unbounded is infinite streams of data coming to you. And so if your architecture can absorb infinite streams of data, like, for example, sensors constantly transmitting data to you, at that point you're not worried about whether you can store that data, you're simply worried about the fidelity of that data. But bounded would be saying, I'm going to send the data in chunks. You could also do bounded where you basically say, I'm going to pre-process the data a little bit, just to see if the data's healthy, or if there is signal in the data. You don't want to find that out later as you're training, right? You're trying to figure that out up front. >> But it's funny, everything is ultimately bounded; it just depends on how you define the unit of time, right? 'Cause if you take it down to an infinitely small slice, everything is frozen. But I love the example of the autonomous cars. We were just at an event talking about navigation for autonomous cars. Goldman Sachs says it's going to be a seven billion dollar industry, and the great example that you used is of the two systems working well together, 'cause is it the car's sensors or is it the map? >> Janet: That's right. >> And he says, well, you know, you want to use the map, and the data from the map, as much as you can to set the stage for the car driving down the road, to give it some level of intelligence. But if today we happen to be paving lane number two on 101, and there's cones, now it's the real-time data that's going to train the system.
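The map-plus-sensors example is essentially the Lambda pattern just described: a batch-built view merged with a real-time speed layer at serving time. A minimal sketch, with all data structures invented for illustration:

# Hypothetical sketch: serve a view that merges a precomputed batch layer
# (the map) with a real-time speed layer (today's sensor observations).

batch_view = {  # built offline from historical data
    "101-lane-2": {"status": "open", "speed_limit": 65},
}
speed_layer = {}  # near-real-time updates, e.g. from car sensors

def observe(segment, update):
    speed_layer[segment] = update  # cones spotted, lane closed, etc.

def serve(segment):
    merged = dict(batch_view.get(segment, {}))
    merged.update(speed_layer.get(segment, {}))  # real-time data wins
    return merged

observe("101-lane-2", {"status": "closed", "reason": "paving"})
print(serve("101-lane-2"))
# -> {'status': 'closed', 'speed_limit': 65, 'reason': 'paving'}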
But the two have to work together, and the two are not autonomous; they really can't work independently of each other. >> Yes. >> Pretty interesting. >> It makes perfect sense, right. And why it makes perfect sense is because first the autonomous cars have to learn to drive. Then the autonomous cars have to become experienced drivers. And that experience cannot be taught; it comes on the road. So one of the things I was watching was how insurance companies were doing testing on these cars, and they had a human driving a car, and then an autonomous car. And the autonomous car, with its sensors, was predicting the behavior, every permutation and combination, of how a bicycle would react to that car. It was almost predicting what the human on the bicycle would do, like jump in front of the car, and it got it right in 80% of the cases. But a human driving a car, we're not sure how the bicycle is going to perform. We don't have peripheral vision, and we can't predict how the bicycle is going to perform, so we get it wrong. Now, we can't transmit that knowledge. If I'm a driver and I just encountered a bicycle, I can't transmit that knowledge to you. But a driverless car can learn, it can predict the behavior of the bicycle, and then it can transfer that information to a fleet of cars. So it's very powerful, how the learning can scale.
We grew up in the world in which we grew up, we saw what we saw, we went to our schools, we had our family relationships, et cetera. So everyone is locked into who they are. That's not the problem. The problem is the acceptance of bringing in some other, (chuckles) and the combination will provide better outcomes; it's a proven scientific fact. >> I very much agree with that. I also think that having the freedom, having the choice to hear another person's conditioning, another person's experiences, is very powerful, because that enriches our own experiences. Even if we are constrained, even if we are like that storage that has been structured and processed, we know that there's this other storage, and we can figure out how to get the freedom between the two points of view, right? And we have the freedom to choose. So that's very, very powerful, just having that freedom. >> So as we get ready to turn the calendar on 2017, which is hard to imagine it's true, it is. You look to 2018, what are some of your personal and professional priorities, what are you looking forward to, what are you working on, what's top of mind for Janet George? >> So right now I'm thinking about genetic algorithms, genetic machine learning algorithms. These have been around for a while, but I'll tell you where the power of genetic algorithms is, especially when you're creating a powerful new-technology memory cell. So when you start out trying to create a new-technology memory cell, you have materials, material deformations, you have process, you have hundreds of permutations and combinations, and with genetic algorithms, we can quickly assign a cost function, and we can apply survival of the fittest: all that doesn't fit we can kill, arriving at the fastest, quickest new technology node, and then from there, we can scale that in mass production. So we can use these survival-of-the-fittest mechanisms that evolution has used for a long period of time. So this is biology inspired. And using a cost function we can figure out how to get the best of every process, every technology, all the coupling effects, all the master effects of introducing a program voltage on a particular cell, reducing the program voltage on a particular cell, resetting and setting, and the neighboring effects. We can pull all that together, so the 600, 700 permutations and combinations that we've been struggling with, trying to figure out how to quickly narrow down to that perfect cell, which is the new technology node that we can then scale out into tens of millions of vehicles, right? >> Right, you're going to have to >> Getting to that spot. >> You're going to have to get me on the whiteboard on that one, Janet. That is amazing. Smart lady. >> Thank you. >> Thanks for taking a few minutes out of your time. Always great to catch up, and it was terrific to see you at Grace Hopper as well. >> Thank you, I really appreciate it, I appreciate it very much. >> All right, Janet George, I'm Jeff Frick. You are watching theCUBE. We're at Western Digital headquarters at Innovating to Fuel the Next Generation of Big Data. Thanks for watching.
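As a minimal sketch of the survival-of-the-fittest loop Janet describes: the two process knobs and the quadratic cost function below are invented for illustration, not real device parameters; in practice the cost would come from measured cell behavior across the hundreds of real permutations she mentions.

import random

def cost(cell):
    """Lower is better. Toy stand-in for measured cell behavior; the
    target voltages here are invented, not real process parameters."""
    v_program, v_reset = cell
    return (v_program - 3.1) ** 2 + (v_reset - 0.7) ** 2

def mutate(cell, scale=0.05):
    """Small random perturbation of a surviving candidate."""
    return tuple(v + random.gauss(0, scale) for v in cell)

# Random initial population of candidate cells (program voltage, reset voltage).
population = [(random.uniform(2.0, 4.0), random.uniform(0.2, 1.2))
              for _ in range(100)]

for generation in range(50):
    population.sort(key=cost)          # score every candidate
    survivors = population[:20]        # "kill" all that don't fit
    population = survivors + [mutate(random.choice(survivors))
                              for _ in range(80)]  # refill by mutation

best = min(population, key=cost)
print(f"best candidate: {best}, cost = {cost(best):.5f}")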

Published Date : Oct 11 2017


Mark Grace, Western Digital | Western Digital the Next Decade of Big Data 2017


 

>> Announcer: Live from San Jose, California, it's theCUBE, covering Innovating to Fuel the Next Decade of Big Data, brought to you by Western Digital. >> Hey welcome back everybody, Jeff Frick here with theCUBE. We're at Western Digital's headquarters in San Jose, California at the Almaden campus. Lot of innovation's been going on here, especially in storage for decades, and we're excited to be at this special press and analyst event that Western Digital put on today to announce some exciting new products. It's called Innovating to Fuel the Next Decade of Data. I'm super happy to have a long-time industry veteran, he just told me, 35 years, I don't know if I can tell (Mark laughs) that on air or not. He's Mark Grace, he's the Senior Vice President of Devices for Western Digital. Mark, great to have you on. >> Thanks Jeff, glad to be here. >> Absolutely, so you've seen this movie over and over and over, I mean that's one of the cool things about being in the Valley, is this relentless pace of innovation. So how does today's announcement stack up as you kind of look at this versus kind of where we've come from? >> Oh I think this is maybe one of the, as big as it comes, Jeff, to be honest. I think we've plotted a course now that I think was relatively uncertain for the hard drive industry and the data center, and plotted a course that I think we can speak clearly to the market, and clearly to customers about the value proposition for rotating magnetic storage for decades to come. >> Which is pretty interesting, 'cause, you know, rotating drives have been taking a hit over the last couple of years, right, flash has been kind of the sexy new kid on the block, so this is something new, >> Mark: It is. >> And a new S curve I think as John said. >> I agree, we're jumping onto a, we're extending the S curve, let's call it that. I think there's actually plenty of other S curve opportunities for us, but in this case, I think the industry, and I would say our customer base, we have been less than clear with those guys about how we see the future of rotating storage in the cloud and enterprise space, and I think today's announcement clarifies that and gives some confidence about architectural decisions relative to rotating storage going forward for a long time. >> Well I think it's pretty interesting, 'cause compared to the other technology that was highlighted, the other option, the HAMR versus the MAMR, this was a much more elegant, simpler way to add this new S curve into an existing ecosystem. >> You know, elegant's probably a good word for it, and it's always the best solution I would say. HAMR's been a push for many years. I can't remember the first time I heard about HAMR. It's still something we're going to continue to explore and invest in, but it has numerous hurdles compared to the simplicity and elegance, as you say, of MAMR, not the least of which is we're going to operate at normal ambient temperatures versus applying tremendous heat to try and energize the recording and the technologies. So any time you introduce extraordinary heat you face all kinds of ancillary engineering challenges, and this simplifies those challenges down to one critical innovation, which is the oscillator. >> Pretty interesting, 'cause it seems pretty obvious that heat's never a good thing. So it's curious that in the quest for this next S curve the HAMR path was pursued for as long as it was, it sounds like, because it sounds like that's a pretty tough thing to overcome.
Yeah, I think it initially presented perhaps the most longevity in early exploration days. I would say that HAMR has certainly received the most press as far as trying to assert it as the extending recording technology for enterprise HDDs. I would say we've invested for almost as long in MAMR, but we've been extremely quiet about it. This is kind of our nature. When we're ready to talk about something, you can kind of be sure we're ready to go with it, and ready to think about productization. So we're quite confident in what we're doing. >> But I'm curious from your perspective, having been in the business a long time, you know, we who are not directly building these magical machines just now have come to expect that Moore's Law will continue; it has zero to do with semiconductor physics anymore, it's really an attitude and this relentless pace of innovation that now is expected and taken for granted. You're on the other side, and have to face real physics and mechanical limitations of the media and the science and everything else. So is that something that gets you up every day >> Mark: Keeps me awake every night! >> Obviously keeps you awake at night and up every day. You've been doing it for 35 years, so there must be some appeal. >> Yeah. (laughs) >> But you know, it's a unique challenge, 'cause at the same time not only has it got to be better and faster and bigger, it's got to be cheaper, and it has been. So when you look at that, how does that kind of motivate you, the teams here, to deliver on that promise? >> Yeah, I mean in this case, we are a little bit defensive, in the sense of the flash expectations that you mentioned, and I think as we digest our news today, we'll be level setting a little bit more in a more balanced way the expectations for contribution from rotating magnetic storage and solid state storage, to what I think is a more accurate picture of its future going forward in the enterprise and hyperscale space. To your point about just relentless innovation, a few of us were talking the other day in advance of this announcement that this MAMR adventure feels like the early days of PMR, perpendicular, the current recording technology. It feels like we understand a certain amount of distance ahead of us, and that's about this four-terabit-per-square-inch kind of distance, but it feels like the early days where we could only see so far but the road actually goes much further, and we're going to find more and more ways to extend this technology, and keep that order of magnitude cost advantage going from a hard drive standpoint versus flash. >> I wonder how this period compares to that, just to continue, in terms of on the demand side, 'cause you know, back in the day, the demand and the applications for these magical compute machines weren't near, I would presume, as pervasive as now, or am I missing the boat? 'Cause now clearly there is no shortage of demand for storage and compute. >> Yeah, depending on where you're coming from, you could take two different views of that. The engine that drove the scale of the hard drive industry to date, a big piece of it in the long history of the industry, has been the PC space. So you see that industry converting to flash and solid state storage more aggressively, and we embrace that, you know, we're invested in flash and we have great products in that space, and we see that happening.
The opportunity in the hyperscale and cloud space is we're only at the tip of the iceberg, and therefore I think, as we think about this generation, we think about it differently than those opportunities in terms of breadth of applications. PCs and all that kind of create the foundation for the hard drive, but what we see here is the virtuous cycle of more storage: more economical storage begets more value proposition, more opportunities to integrate more data, more data collection, more storage. And that virtuous cycle seems to me that we're just getting started. So long live data, that's what we say. (both laugh) >> The other piece that I find interesting is before the PCs were the driver of scale relative to an enterprise data center, but with the hyperscale guys and the proliferation of cloud, and actually the growth of PCs slowing down dramatically, it's kind of flipped the bit. Now the data centers themselves have the scale to drive >> Absolutely. >> the scale innovation that before was really limited to either a PC or a phone or some more consumer device. >> Absolutely the case. When you take that cross-section of hard drive applications, that's a hundred percent the case, and in fact, we look at the utilization, as a vertically integrated company we look at our media facilities for the disks, we look at our wafer facilities for heads, and those facilities as we look forward are going to be busier than they've ever been. I mean the amount of data is relative to the density as well as disks and heads and how many you can employ. So we see this in terms of fundamental technology and component construction, manufacturing, busier than it's ever been. We'll make fewer units. I mean there will be fewer units as they become bigger and denser for this application space, but the fundamental consumption of magnetic recording technology and components is at all-time records. >> Right. And you haven't even talked about the software-defined piece that's dragging the utilization of that data across multiple applications. >> And it's just one more thing that comes in to help everybody there too, yeah. >> Jeff: You got another 35 more years in you? (both laugh) >> I hope so. >> All right. >> But that would be the edge of it, I think. >> All right, we're going to take Mark Grace here, only 35 more years, Lord knows what he'll be working on. Well Mark, thanks for taking a few minutes and offering your perspective. >> No that's fine, thanks a lot. >> Absolutely, Mark Grace, I'm Jeff Frick, you're watching theCUBE from Western Digital headquarters in San Jose, California. Thanks for watching. >> Mark: All right.
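Since the conversation above pegs the visible MAMR runway at about four terabits per square inch, a back-of-envelope calculation shows what that areal density could mean for a single drive. Every number below, the platter radii, the platter count, and the overhead factor, is an assumption for illustration, not a Western Digital specification.

import math

# All values are illustrative assumptions, not vendor specifications.
areal_density_tb_per_in2 = 4.0   # ~4 terabits per square inch (MAMR runway)
outer_radius_in = 1.84           # assumed usable outer radius, 3.5" platter
inner_radius_in = 0.60           # assumed unusable hub region
surfaces_per_platter = 2
platters = 8                     # assumed platter count for a helium drive
format_overhead = 0.50           # assumed fraction left after ECC/servo/etc.

area_in2 = math.pi * (outer_radius_in**2 - inner_radius_in**2)
terabits = areal_density_tb_per_in2 * area_in2 * surfaces_per_platter * platters
terabytes = terabits / 8 * format_overhead

print(f"usable area per surface: {area_in2:.1f} in^2")
print(f"rough drive capacity under these assumptions: {terabytes:.0f} TB")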

Published Date : Oct 11 2017


Dave Tang, Western Digital | Western Digital the Next Decade of Big Data 2017


 

(upbeat techno music) >> Announcer: Live from San Jose, California it's theCUBE, covering Innovating to Fuel the Next Decade of Big Data, brought to you by Western Digital. >> Hey, welcome back everybody. Jeff Frick here at theCUBE. We're at the Western Digital headquarters off Almaden down in San Jose, a really important place. Western Digital's been here for a while, their headquarters. A lot of innovation's been going on here forever. So we're excited to be here really for the next generation. The event's called Innovating to Fuel the Next Generation of Big Data, and we're joined by many-time Cuber, Dave Tang. He is the SVP of corporate marketing at Western Digital. Dave, always great to see you. >> Yeah. Always great to be here, Jeff. Thanks. >> Absolutely. So you got to MC the announcement today. >> Yes. >> So for the people that weren't there, let's give them a quick overview on what the announcement was and then we can dive in a little deeper. >> Great, so what we were announcing was a major breakthrough in technology that's going to allow us to drive the increase in capacity and density to support big data for the next decade and beyond, right? So capacities and densities have been starting to level off in terms of hard drive technology capability. So what we announced was microwave-assisted magnetic recording technology that will allow us to keep growing that areal density up and reducing the cost per terabyte. >> You know, it's fascinating 'cause everyone loves to talk about Moore's Law and have these silly architectural debates, whether Moore's Law is alive or dead, but, as anyone who's lived here knows, Moore's Law is really an attitude much more than it is the specific physics of microprocessor density growth. And it's interesting to see. As we know, the growth of data is gigantic, and the types of data, not only regular big data but now streaming data, are bigger and bigger and bigger. I think you talked about stuff coming off of people and machines being way bigger compared to business data. >> Right. >> But you guys continue to push limits and break through, and even though we expect everything to be cheaper, faster, and better, you guys actually have to execute it-- >> Dave: Right. >> Back at the factory. >> Right, well it's interesting. There's this healthy tension, right, a push and pull in the environment. So you're right, it's not just Moore's Law that's enabling a technology push, but we have this virtuous cycle, right? We've realized what the value is of data and how to extract the possibilities and value of data, so that means that we want to store more of that data and access more of that data, which drives the need for innovation to be able to support all of that in a cost effective way. But then that triggers another wave of new applications, new ways to tap into the possibilities of data. So it just feeds on itself, and fortunately we have great technologists, great means of innovation, and a great attitude and spirit of innovation to help drive that.
Give more people access to the data, give them more access to the tools, and let them try things easier and faster and fail quick, there's actually a ton of innovation that companies can unlock within their own four walls. But the data is such an important piece of it, and there's more and more and more of this. >> Dave: Right, right. >> What used to be digital exhaust now is, I think maybe you said, or maybe it was Dave, that there's a whole economy now built on data like we used to do with petroleum. I thought that was really insightful. >> Yeah, right. It's like a gold mine. So not only are the sources of data increasing, which is driving increased volume, but, as Janet was alluding to, we're starting to come up with the tools and the sophistication with machine learning and artificial intelligence to be able to put that data to new use as well as to find the pieces of data to interconnect, to drive these new capabilities and new insights. >> Yeah, but unlike petroleum it doesn't get used up. I mean that's the beauty of data. (laughing) >> Yeah, that's right. >> It's a digital resource that can be used over and over and over again. >> And a self-renewing resource. And you're right, in that sense that it's being used over and over again, the longevity of that data, the useful life, is growing exponentially along with the volume. >> Right, and Western Digital's in a unique position 'cause you have systems and you have big systems that could be used in data centers, but you also have the media that powers a whole bunch of other people's systems. So I thought one of the real important announcements today was, yes it's an interesting new breakthrough technology that uses energy assist to get more density on the drives, but it's done in such a way that the stuff's all backward compatible. It's plug and play. You've got production scheduled in a couple years I think, with tests out to the customers-- >> Dave: That's right. >> Next year. So, you know, that is such an important piece beyond the technology. What's the commercial acceptance? What are the commercial barriers? And this sounds like a pretty interesting way to skin that cow. >> Right, oftentimes the best answers aren't the most complex answers. They're the more elegant and simplistic answers. So it goes from the standpoint of a user being able to plug and play with older systems, older technologies. That's beautiful, and for us, the ability to manufacture it in high volume reliably and cost effectively is equally as important. >> And you also talked, which I think was interesting, about the relationship between hard drives and flash, because, obviously, flash is a, I want to say the sexy new toy, but it's not a sexy new toy anymore. >> Right. >> It's been around for a while, but, with that pressure on flash performance, you're still seeing the massive amounts of big data, which is growing faster than that. And there is a role for the high density hard drives in that environment, and, based on the forecast you shared, which I'm presuming came from IDC or people that do numbers for a living, still a significant portion of a whole lot of data is not going to be on flash.
And a lot of that is driven by this notion that there's fast data and big data, and, while our attention seems to shift over to maybe some fast data applications like autonomous vehicles and realtime applications, surveillance applications, there's still a need for big data because the algorithms that drive those realtime applications have to come from analysis of vast amounts of data. So big data is here to stay. It's not going away or shifting over. >> I think it's a really interesting kind of crossover, which Janet talked about too, where you need the algorithms to continue shaping the systems that are feeding on and reacting to the real data, but then that just adds more vocabulary to their learning set so they can continue to evolve over time. >> Yeah, what really helps us out in the marketplace is that because we have technologies and products across that full spectrum of flash and rotating magnetic recording, and we sell to customers who buy devices as well as platforms and systems, we see a lot of applications, a lot of uses of data, and we're able to then anticipate what those needs are going to be in the near future and in the distant future. >> Right, so we're getting towards the end of 2017, which I find hard to say, but as you look forward kind of to 2018 and this insatiable desire for more storage, 'cause of this insatiable creation of more data, what are some of your priorities for 2018 and what are you kind of looking at as, like I said, I can't believe we're going to actually flip the calendar here-- >> Dave: Right, right. >> In just a few short months. (laughing) >> Well, I think for us, it's the realization that all these applications that are coming at us are more and more diverse, and their needs are very specialized. So it's not just the storage, although we're thought of as a storage company, it's not just about the storage of that data, but you have to contrive complete environments to capture and preserve and access and transform that data, which means we have to go well beyond storage and think about how that data is accessed, technical interfaces to our memory products as well as storage products, and then where compute sits. Does it still sit in a centralized place or do you move compute out closer to where the data sits? So, all this innovation and changing the way that we think about how we can mine that data is top of mind for us for the next year and beyond. >> It's only job security for you, Dave. (laughing) >> Dave: Funny to think about. >> Alright. He's Dave Tang. Thanks for inviting us and again congratulations on the presentation. >> Always a pleasure. >> Alright, Dave Tang, I'm Jeff Frick. You're watching theCUBE from Western Digital headquarters in San Jose, California. Thanks for watching. (upbeat techno music)
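As a minimal sketch of the "move compute out closer to where the data sits" idea Dave closes on: the filter below runs at a hypothetical edge device and ships only the anomalous readings to the central store. The readings, threshold, and sizes are all invented for illustration.

import random

def edge_readings(n=1000):
    """Hypothetical sensor readings generated at the edge device."""
    return [random.gauss(0.0, 1.0) for _ in range(n)]

def edge_filter(readings, threshold=2.5):
    """Compute near the data: keep only anomalous readings worth shipping."""
    return [r for r in readings if abs(r) > threshold]

readings = edge_readings()
shipped = edge_filter(readings)
print(f"generated {len(readings)} readings at the edge, "
      f"shipped {len(shipped)} to the central store "
      f"({100 * len(shipped) / len(readings):.1f}%)")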

Published Date : Oct 11 2017
