SEAGATE AI FINAL
>> Seagate Technology is focused on data, where we have long believed that data is in our DNA. We help maximize humanity's potential by delivering world-class, precision-engineered data solutions developed through sustainable and profitable partnerships. Included in our offerings are hard disk drives. As I'm sure many of you know, a hard drive consists of a slider, also known as a drive head or transducer, attached to a head gimbal assembly; a head stack assembly made up of multiple head gimbal assemblies; and a drive enclosure with one or more platters that the head stack assembly slots into. And while the concept hasn't changed, hard drive technology has progressed well beyond the initial five megabyte, five-and-a-quarter-inch drives that Seagate first produced in, I think, 1983. We have just announced an 18 terabyte 3.5-inch drive with nine platters on a single head stack assembly, with dual head stack assemblies coming this calendar year. The complexity of these drives furthers the need to incorporate edge analytics at operation sites. W. Edwards Deming established the concept of continual improvement in everything that we do, especially in product development and operations. At the end of World War Two, he embarked on a mission, with support from the US government, to help Japan recover from its wartime losses. He taught the concepts of continual improvement and statistical process control to the leaders of prominent organizations within Japan. Because of this, he was honored by the Japanese emperor with the Second Order of the Sacred Treasure for his teachings, the only non-Japanese to receive this honor in hundreds of years. Japan's quality control is now world famous, as many of you may know, and based on my own experience in product development, it is clear that he made a major impact on Japan's recovery after the war. At Seagate, continual improvement has been our mantra in the work that we've been doing to adopt new technologies. As part of this effort, we embarked on the adoption of new technologies in our global operations, which includes establishing machine learning and artificial intelligence at the edge, and in doing so, we continue to advance our technical capabilities within data science and data engineering.

>> So, I'm a principal engineer and member of the Operations and Technology Advanced Analytics Group. We are a service organization for those organizations who need to make sense of the data that they have and, in doing so, perhaps introduce a different way to create and analyze new data. Making sense of the data that organizations have is a key aspect of the work that data scientists and engineers do. I'm the project manager for an initiative adopting artificial intelligence methodologies for Seagate manufacturing, which is the reason why I'm talking to you today. I thought I'd start by first talking about what we do at Seagate and follow that with a brief on artificial intelligence and its role in manufacturing. I'd then like to discuss how AI and machine learning are being used at Seagate in developing edge analytics, where Docker Enterprise and Kubernetes automate deployment, scaling, and management of containerized applications. Finally, I'd like to discuss where we are headed with this initiative and where Mirantis has a major role. In case some of you are not conversant in machine learning and artificial intelligence and the difference between them, let's start with some definitions.
To cite one source, machine learning is the scientific study of algorithms and statistical models that computer systems use to effectively perform a specific task without using explicit instructions, relying on patterns and inference instead. It is thus seen as a subset of narrow artificial intelligence, where analytics and decision making take place. The intent of machine learning is to use basic algorithms to perform different functions, such as classify images by type, classify emails into spam and not spam, and predict weather. The idea, and this is where the concept of narrow artificial intelligence comes in, is to make decisions of a preset type; basically, let a machine learn from itself. The types of machine learning include supervised learning, unsupervised learning, and reinforcement learning. In supervised learning, the system learns from previous examples that are provided, such as images of dogs that are labeled by type. In unsupervised learning, the algorithms are left to themselves to find answers; for example, a series of images of dogs can be used to group them into categories by association: color, length of coat, length of snout, and so on. In the last slide, I mentioned narrow AI a few times, and to explain it, it is common to describe AI in terms of two categories: general and narrow (or weak). Many of us were first exposed to general AI in popular science fiction movies like 2001: A Space Odyssey and Terminator. General AI is AI that can successfully perform any intellectual task that a human can, and if you ask Elon Musk or Stephen Hawking, this is how they view the future with general AI if we're not careful about how it is implemented. Most of us hope that it is more like this: friendly and helpful, like WALL-E. The reality is that machines today are only capable of weak or narrow AI, AI that is focused on a narrow, specific task like understanding speech or finding objects in images. Alexa and Google Home are becoming very popular, and they can be found in many homes. Their narrow task is to recognize human speech and answer limited questions or perform simple tasks like raising the temperature in your home or ordering a pizza, as long as you have already defined the order. Narrow AI is also very useful for recognizing objects in images and even counting people as they go in and out of stores, as you can see in this example. So artificial intelligence applies machine learning, analytics, inference, and other techniques which can be used to solve actual problems. The two examples here, particle detection and image anomaly detection, have the potential to adopt edge analytics during the manufacturing process. A common problem in clean rooms is spikes in particle count from particle detectors. With this application, we can provide context to particle events by monitoring the area around the machine and detecting when foreign objects like gloves enter areas where they should not. Image anomaly detection historically has been accomplished at Seagate by operators in clean rooms viewing each image, one at a time, for anomalies. Creating models of various anomalies through machine learning methodologies can be used to run comparative analyses in a production environment, where outliers can be detected through inference in an automated, real-time analytics scenario. Anomaly detection is also frequently used in machine learning to find patterns or unusual events in our data. How do you know what you don't know?
It's really what you ask, and the first step in anomaly detection is to use an algorithm to find patterns or relationships in your data. In this case, we're looking at hundreds of variables and finding relationships between them. We can then look at a subset of variables and determine how they are behaving in relation to each other. We use this baseline to define normal behavior and generate a model of it. In this case, we're building a model with three variables. We can then run this model against new data. Observations that do not fit the model are defined as anomalies, and anomalies can be good or bad. It takes a subject matter expert to determine how to classify the anomalies; a classification could be scrap or okay to use, for example. The subject matter expert is assisting the machine to learn the rules. We then update the model with the classified anomalies and start running it again, and you can see that there are a few steps involved in generating these models. Now, Seagate factories generate hundreds of thousands of images every day. Many of these require a human to look at them and make a decision. This is dull and mistake-prone work that is ideal for artificial intelligence. The initiative that I am project managing is intended to offer a solution that matches the continual increase in complexity of the products we manufacture and that minimizes the need for manual inspection. The EdgeRX smart manufacturing reference architecture is the initiative that both Hamid and I are working on, and I'm sorry to say that Hamid isn't here today. But as you may have guessed, our goal is to introduce early defect detection in every stage of our manufacturing process through machine learning and real-time analytics through inference. In doing so, we will improve overall product quality, enjoy higher yields with fewer defects, and produce higher margins. Because this was entirely new, we established partnerships with HPE, with NVIDIA, and with Docker and Mirantis two years ago to develop the capability that we now have as we deploy EdgeRX to our operation sites on four continents. From a hardware sense, HPE and NVIDIA have been able partners in helping us develop an architecture that we have standardized on, and on the software stack side, Docker has been instrumental in helping us manage a very complex project with a steep learning curve for all concerned. To further clarify our efforts to enable more AI and ML in factories: the objective was to determine an economical edge compute that would access the latest AI and ML technology using a standardized platform across all factories. This objective included providing an upgrade path that scales while minimizing disruption to existing factory systems and the burden on factory information systems resources. The two parts to the compute solution are shown in the diagram. The gateway device connects to Seagate's existing factory information systems architecture and does inference calculations. The second part is a training device for creating and updating models. All factories will need the gateway device and the compute cluster on site, and to this day it remains to be seen if the training device is needed in other locations. But we do know that one device is capable of supporting multiple factories simultaneously, and there are also options for training on cloud-based resources. The stream scoring appliance consists of a Kubernetes cluster with GPU and CPU worker nodes, as well as master nodes and Docker Trusted Registries.
The GPU nodes are hardware based, using HPE EL4000 Edgelines; the balance are virtual machines. For machine learning, we've standardized on both the HPE Apollo 6500 and the NVIDIA DGX-1, each with eight NVIDIA V100 GPUs. And, incidentally, the same technology enables augmented and virtual reality. Hardware is only one part of the equation. Our software stack consists of Docker Enterprise and Kubernetes. As I mentioned previously, we've deployed these clusters at all of our operations sites, with specific use cases planned for each site. Mirantis has had a major impact on our ability to develop this capability by offering a stable platform in Universal Control Plane, which provides us with the necessary metrics to determine the health of the Kubernetes cluster, and the use of Docker Trusted Registry to maintain a secure repository for containers. They have been an exceptional partner in our efforts to deploy clusters at multiple sites. At this point in our deployment efforts, we are on-prem, but we are exploring cloud service options that include Mirantis' next-generation Docker Enterprise offering, which includes StackLight in conjunction with multi-cluster management. To me, the concept of federation or multi-cluster management is a requirement in our case because of the global nature of our business, where our operation sites are on four continents. StackLight provides the hook into each cluster that makes multi-cluster management an effective solution. Open source has been a major part of Project Athena, and there was a debate about using Docker CE versus Docker Enterprise. That decision was actually easy, given the advantages that Docker Enterprise would offer, especially during an early phase of development. Kubernetes was a natural addition to the software stack and has been widely accepted. But we have also been at work to adopt such open source as RabbitMQ messaging, TensorFlow, and TensorRT, to name three, GitLab for development, and a number of others, as you see here as well. Most of our programming has been in Python. The results of our efforts so far have been excellent. We are seeing a six-month return on investment from just one of seven clusters, where the hardware and software costs approached $1 million. The performance on this cluster is now over three million images processed per day. Further adoption has been growing, but the biggest challenge we've seen has been handling a steep learning curve: installing and maintaining complex Kubernetes clusters in data centers that are not used to managing the unique aspects of clusters like this. Because of this, we have been considering adopting a control plane in the cloud, with Kubernetes as a service supported by Mirantis. Even without considering Kubernetes as a service, the concept of federation or multi-cluster management has to be on our roadmap, especially considering the global nature of our company. Thank you.
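To make the anomaly-detection workflow described in this talk more concrete (fit a model of normal behavior on a handful of variables, score new observations, and route the outliers to a subject matter expert for labeling), here is a minimal, hypothetical Python sketch using scikit-learn's IsolationForest. This is one of many possible algorithms; the talk does not say which methods Seagate actually uses, and the column names, file paths, and contamination rate below are illustrative assumptions, not Seagate's pipeline.

```python
# Hypothetical sketch of the anomaly-detection loop described in the talk:
# learn "normal" behavior from baseline data, flag observations that do not
# fit the model, and queue them for subject matter expert review.
import pandas as pd
from sklearn.ensemble import IsolationForest

# Assumed column names; the real pipeline looks at hundreds of variables.
FEATURES = ["spindle_current", "fly_height", "particle_count"]

baseline = pd.read_csv("baseline_readings.csv")   # data that defines normal behavior
new_data = pd.read_csv("latest_readings.csv")     # fresh observations to score

model = IsolationForest(contamination=0.01, random_state=0)
model.fit(baseline[FEATURES])

# predict() returns -1 for observations that do not fit the learned model.
new_data["is_anomaly"] = model.predict(new_data[FEATURES]) == -1

# A subject matter expert labels the flagged rows (e.g. "scrap" or "okay to use");
# those labels feed the next model update, and the loop starts again.
new_data[new_data["is_anomaly"]].to_csv("for_sme_review.csv", index=False)
```

The same fit, score, review, and refit cycle is what the talk describes running continuously against production data, with inference pushed out to the gateway devices.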
SUMMARY :
at the end of World War Two, he embarked on a mission with support from the US government to help and the first step in anomaly detection is to use an algorithm to find patterns
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Seagate | ORGANIZATION | 0.99+ |
hundreds of years | QUANTITY | 0.99+ |
two parts | QUANTITY | 0.99+ |
python | TITLE | 0.99+ |
six month | QUANTITY | 0.99+ |
World War Two | EVENT | 0.99+ |
C Gate | ORGANIZATION | 0.99+ |
one | QUANTITY | 0.99+ |
Stephen Hawking | PERSON | 0.99+ |
Sea Gate | ORGANIZATION | 0.99+ |
Japan | LOCATION | 0.99+ |
Lawn Musk | PERSON | 0.99+ |
Terminator | TITLE | 0.99+ |
1983 | DATE | 0.99+ |
one part | QUANTITY | 0.99+ |
two examples | QUANTITY | 0.99+ |
A Space Odyssey | TITLE | 0.99+ |
five megabytes | QUANTITY | 0.99+ |
3.5 inch | QUANTITY | 0.99+ |
second part | QUANTITY | 0.99+ |
18 terabytes | QUANTITY | 0.99+ |
first step | QUANTITY | 0.99+ |
hundreds | QUANTITY | 0.99+ |
both | QUANTITY | 0.98+ |
NVIDIA | ORGANIZATION | 0.98+ |
over three million images | QUANTITY | 0.98+ |
first | QUANTITY | 0.98+ |
each site | QUANTITY | 0.98+ |
H B E. Apollo 6500 | COMMERCIAL_ITEM | 0.98+ |
each cluster | QUANTITY | 0.98+ |
each image | QUANTITY | 0.98+ |
one source | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
G X one | COMMERCIAL_ITEM | 0.98+ |
Cooper | PERSON | 0.98+ |
second order | QUANTITY | 0.98+ |
Japan | ORGANIZATION | 0.98+ |
Hamid | PERSON | 0.97+ |
Dr Enterprise | ORGANIZATION | 0.97+ |
Cooper Netease | ORGANIZATION | 0.97+ |
each | QUANTITY | 0.97+ |
One | TITLE | 0.97+ |
Theobald | PERSON | 0.97+ |
nine flatters | QUANTITY | 0.97+ |
one devices | QUANTITY | 0.96+ |
Siris | TITLE | 0.96+ |
hundreds of thousands of images | QUANTITY | 0.96+ |
Docker Enterprise | ORGANIZATION | 0.95+ |
Docker | ORGANIZATION | 0.95+ |
seven clusters | QUANTITY | 0.95+ |
two years ago | DATE | 0.95+ |
US government | ORGANIZATION | 0.95+ |
Mirant | ORGANIZATION | 0.95+ |
Operations and Technology Advanced Analytics Group | ORGANIZATION | 0.94+ |
four time losses | QUANTITY | 0.94+ |
Wally | PERSON | 0.94+ |
Japanese | OTHER | 0.93+ |
two categories | QUANTITY | 0.93+ |
H B E l 4000 | COMMERCIAL_ITEM | 0.9+ |
H B | ORGANIZATION | 0.9+ |
three variables | QUANTITY | 0.9+ |
General Ai | TITLE | 0.87+ |
G Edward | PERSON | 0.87+ |
Google Home | COMMERCIAL_ITEM | 0.87+ |
$1 million | QUANTITY | 0.85+ |
Miranda | ORGANIZATION | 0.85+ |
Sea Gate | LOCATION | 0.85+ |
Alexa | TITLE | 0.85+ |
500 quarter inch drives | QUANTITY | 0.84+ |
Kubernetes | TITLE | 0.83+ |
single head | QUANTITY | 0.83+ |
eight | QUANTITY | 0.83+ |
Dr | TITLE | 0.82+ |
variables | QUANTITY | 0.81+ |
this calendar year | DATE | 0.78+ |
H P. E. | ORGANIZATION | 0.78+ |
2000 | DATE | 0.73+ |
Project Athena | ORGANIZATION | 0.72+ |
Rx Smart | COMMERCIAL_ITEM | 0.69+ |
dual | QUANTITY | 0.68+ |
V 100 | COMMERCIAL_ITEM | 0.65+ |
close | QUANTITY | 0.65+ |
four continents | QUANTITY | 0.64+ |
GP | QUANTITY | 0.62+ |
Breaking Down Your Data
>> From the Cube Studios in Palo Alto and Boston, it's the Cube, covering Empowering the Autonomous Enterprise, brought to you by Oracle Consulting. Welcome back, everybody, to this special digital event coverage. The Cube is looking into the rebirth of Oracle Consulting. Janet George is here. She's Group VP, Autonomous and Advanced Analytics with machine learning and artificial intelligence at Oracle, and she's joined by Grant Gibson, VP of growth and strategy. Folks, welcome to the Cube. Thanks so much for coming on. I want to start with you because you've got strategy in your title. Start big picture: what is the strategy with Oracle, specifically as it relates to autonomous and also consulting?

>> Sure. So I think, you know, Oracle has a deep legacy of strength in data, and over the company's successful history it's evolved what that is in steps along the way. If you look at the modern enterprise Oracle client, I think there's no denying that we've entered the age of AI, that everyone knows that artificial intelligence and machine learning are key to their success in the business marketplace going forward. And while generally it's acknowledged that it's a transformative technology and people know that they need to take advantage of it, it's the how that's really tricky, in that most enterprises, in order to really get an enterprise-level ROI on an AI investment, need to engage in projects of significant scope. And going from realizing there's an opportunity, or realizing there's a threat, to mobilizing yourself to capitalize on it is a daunting task. Certainly anybody that's got any sort of legacy of success has built-in processes, built-in systems, built-in skill sets, and making that leap to be an autonomous enterprise is challenging for companies to wrap their heads around. So as part of the rebirth of Oracle Consulting, we've developed a practice around how to manage both the technology needs for that transformation as well as the human needs, as well as the data science needs.

>> So there are about five or six things that I want to follow up with you on there, so this is a good conversation. Ever since I've been in the industry, we've been talking about AI in a sort of start-stop, start-stop fashion, with the AI winters, and now it seems to be here. I almost feel like the technology never lived up to its promise: you didn't have the horsepower, the compute power, maybe the data. So here we are today. It feels like we are entering a new era. Why is that? And how will the technology perform this time?

>> So for AI to perform, it is very reliant on the data. We entered the age of AI without having the right data for AI. So you can imagine that we just launched into AI without our data being ready to be training sets for AI. We started with big data. We started with data that was already historically transformed, formatted, with logical structures and physical structures. This data was sort of trapped in many different tools. And then suddenly AI comes along and we say, take this data, our historical data that we haven't tested to see if it has labels in it, if it has learning capability in it, and just trust the data to AI. And that's why we saw the initial wave of AI sort of failing, because the data was not fully AI-ready for this generation of AI, if you will.

>> And part of, I think, the leap that clients are finding success with now is getting novel data types, and you're moving from the zeros and ones of structured data to
images and language: written language, spoken language. You're capturing different data sets in ways that prior tools never could. So the classifications that come out of it, the insights that come out of it, the business process transformation that comes out of it are different from what we would have understood under the structured data formats. So I think it's that combination of really being able to push massive amounts of data through cloud product processes at scale that takes it to the next plateau.

>> For sure. The language that we use today, I feel like it's going to change, and you just started to touch on some of it: sensing, our senses and visualization, and the auditory. So it's sort of this new experience that customers are seeing, with a lot of this machine intelligence behind it.

>> I call it the autonomous enterprise, right, the journey to be the autonomous enterprise, and when you're on this journey to be the autonomous enterprise, you really need the platform that can help you; cloud is that platform which can help you get to the autonomous journey. But the autonomous journey does not end with the cloud. It doesn't end with the data lake. These are just infrastructures that are basic necessities for being on that autonomous journey. At the end, it's about how you train and scale; very large-scale training needs to happen on this platform for AI to be successful. And if you are an autonomous enterprise, then you have really figured out how to tap into AI and machine learning in a way that nobody else has, to derive business value, if you will. So you've got the platform, you've got the data, and now you're actually tapping into the autonomous components, AI and machine learning, to derive business intelligence and business value.

>> So I want to get into a little bit of Oracle's role, but to do that, I want to talk a little bit more about the industry. If you think about the way that the industry seems to be restructuring around data: historically, industries had their own stack and value chain, and if you were in the finance industry, you were there for life.

>> So when you think about banking, for example, or agriculture, these are highly regulated industries, and it was very difficult to disrupt these industries. But now you look at an Amazon, right? And what does an Amazon, or other tech giants like Apple, have? They have incredible amounts of data. They understand how people shop, how they want to do banking. And so they've come up with Apple Cash or Amazon Pay, and these things are starting to eat into the market, right? So you would have never thought an Amazon could be competition to the banking industry, just because of the regulations. But they're not hindered by the regulations, because they're starting at a different level. And so they become an instant threat, an instant disruptor, to these highly regulated industries. That's what data does, right? When you use data as the DNA for your business, and you are sort of born in data, or you figure out how to be autonomous, if you will, and capture value from that data in a very significant manner, then you can get into industries that are not traditionally your own industry. It can be the food industry, the cloud industry, the book industry, you know, different industries. So, you know, that's what I see happening with the tech giants.
>> So Grant, there's a really interesting point that Janet is making that you mentioned. You started off with a couple of industries that are highly regulated, harder to disrupt. Music got disrupted. Publishing got disrupted. But you've got these regulated businesses, defense, automotive, that actually haven't been truly disrupted yet. Tesla, maybe, is a harbinger. And so you've got this spectrum of disruption. But is anybody safe from disruption?

>> I don't think anyone's ever safe from it. It's changing, evolution, right? Whether it's, you know, swapping horses for cars, or TV for movies, or Netflix, or any sort of evolution of a business, I wouldn't coast on any of it. And I think, to your earlier question around the value that we can help bring to Oracle customers, you know, we have a rich stack of applications, and I find that the space between the applications, the data that spans more than one of them, is a ripe playground for innovations, where the data already exists inside a company but it's trapped, from both a technology and a business perspective. And that's where I think really any company can take advantage of knowing its data better and changing itself to take advantage of what's already there.

>> Yeah, powerful. People always throw out the bromide that data is the new oil. And we've said, no, data is far more valuable, because oil you can only use once and it follows the laws of scarcity, whereas data, if you can unlock it, you can use in a lot of different places. And so a lot of the incumbents have built a business around, whatever, a factory, or process and people, whereas a lot of the trillion-dollar startups, you know the ones I'm talking about, have data at the core; they're data companies. So it seems like a big challenge for your incumbent customers and clients is to put data at the core and be able to break down those silos. How do they do that?

>> Breaking down silos is really super critical for any business. It used to be okay to operate in a silo. For example, you would think that, oh, you know, I could just do payroll and expenses, and it wouldn't matter, or that vendor performance management or purchasing could operate as a silo. But now we are finding that there are tremendous insights, because vendor performance management, expenses, these things are all connected, so you can't afford to have your data sit in silos. So breaking down that silo actually gives the business very good performance insights, right, insights that they didn't have before. So that's one way to go. But another phenomenon happens when you start to break down the silos: you start to recognize what data you don't have to take your business to the next level. That awareness will not happen when you're working with existing data, so that awareness comes into form when you break down the silos and you start to figure out that you need to go after a different set of data to get you to a new product creation. What would that look like? New test insights, or new CapEx avoidance? That data, you just have to go through the iteration to be able to figure that out.
Ai has been companies the successful companies that have done so. 90% of my investments are going towards state. We know that most of it going towards ai this data out there about this, right? And so we look at what are these? 90 90% of the companies investments where he's going and whose doing this right who's not doing this right? One of the things we're seeing as results is that the companies that are doing it right have brought data into the business strategy. They've changed their business model, right? So it's not like making a better taxi, but coming up with global, right? So it's not like saying Okay, I'm going to have all these. I'm going to be the drug manufacturing company. I'm gonna put drugs out there in the market this is I'm going to do connected help, right? And so how does data serves the business model of being connected? Help rather than being a drug company selling drugs to my customers, right? It's a completely different way of looking at it. And so now you guys informing drug discovery is not helping you just put more drugs to the market. Rather, it's helping you come up with new drugs that would help the process of connected games. There's a >>lot of discussion in the press about, you know, the ethics of a and how far should we take a far. Can we take it from a technology standpoint, Long room there? But how far should we take it? Do you feel as though public policy will take care of that? A lot of that narrative is just kind of journalists looking for, You know, the negative story. Well, that's sort itself out. How much time do you spend with your customers talking about that >>we in Oracle, we're building our data science platform with an explicit feature called Explained Ability. Off the model on how the model came up with the features what features they picked. We can rearrange the features that the model picked. Citing Explain ability is very important for ordinary people. Trust ai because we can't trust even even they decided this contrast right to a large extent. So for us to get to that level where we can really trust what AI is picking in terms of a modern, we need to have explain ability. And I think a lot of the companies right now are starting to make that as part of their platform. >>We're definitely entering a new era the age of of AI of the autonomous enterprise folks. Thanks very much for great segment. Really appreciate it. >>Yeah. Pleasure. Thank you for having us. >>All right. And thank you and keep it right there. We'll be back with our next guest right after this short break. You're watching the Cube's coverage of the rebirth of Oracle consulting right back. Yeah, yeah, yeah, yeah, yeah, yeah
SUMMARY :
empowering the autonomous enterprise brought to you by Oracle Consulting. So as part of the rebirth of Oracle Consulting, So there's about five or six things that I want to follow up with you there, so this is a good conversation. So you can imagine that we just launched into Ai without our So the classifications that come out of it, the insights that come out of it, the business process transformation comes And you just started to touch on some of I call it the autonomous and price right, the journey to be the autonomous enterprise, the finance industry, you were there for life. It can be like the food industry can be the cloud industry, the book industry, you know, different industries. So great, there's a really interesting point that the Gina is making that you mentioned. the value that we can help bring the Oracle customers is that you know, we have a rich stack the laws of scarcity data, if you can unlock it. the silos, you start to recognize what data you don't have to take your business to the I'm interested in how you see customers taking that beyond the technology And so now you guys informing drug discovery is lot of discussion in the press about, you know, the ethics of a and how far should we take a far. Off the model on how the model came up with the features what features they picked. We're definitely entering a new era the age of of AI of the autonomous enterprise Thank you for having us. And thank you and keep it right there.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Apple | ORGANIZATION | 0.99+ |
Janet George | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Grant Gibson | PERSON | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
Boston | LOCATION | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
Obama | PERSON | 0.99+ |
Oracle Consulting | ORGANIZATION | 0.99+ |
Tesla | ORGANIZATION | 0.99+ |
90% | QUANTITY | 0.99+ |
Netflix | ORGANIZATION | 0.99+ |
six things | QUANTITY | 0.98+ |
both | QUANTITY | 0.98+ |
One | QUANTITY | 0.98+ |
today | DATE | 0.97+ |
more than one | QUANTITY | 0.95+ |
Gina | PERSON | 0.93+ |
Cube | COMMERCIAL_ITEM | 0.89+ |
Cube Studios | ORGANIZATION | 0.88+ |
Catholics | ORGANIZATION | 0.87+ |
one way | QUANTITY | 0.87+ |
Advanced Analytics | ORGANIZATION | 0.87+ |
90 90% | QUANTITY | 0.83+ |
Cube | ORGANIZATION | 0.83+ |
about five | QUANTITY | 0.81+ |
Cube | PERSON | 0.77+ |
Stakes | PERSON | 0.67+ |
once | QUANTITY | 0.63+ |
Data Lake | COMMERCIAL_ITEM | 0.57+ |
zeros | QUANTITY | 0.55+ |
Thomas | PERSON | 0.39+ |
UNLIST TILL 4/2 - Vertica Big Data Conference Keynote
>> Joy: Welcome to the Virtual Big Data Conference. Vertica is so excited to host this event. I'm Joy King, and I'll be your host for today's Big Data Conference Keynote Session. It's my honor and my genuine pleasure to lead Vertica's product and go-to-market strategy. And I'm so lucky to have a passionate and committed team who turned our Vertica BDC event, into a virtual event in a very short amount of time. I want to thank the thousands of people, and yes, that's our true number who have registered to attend this virtual event. We were determined to balance your health, safety and your peace of mind with the excitement of the Vertica BDC. This is a very unique event. Because as I hope you all know, we focus on engineering and architecture, best practice sharing and customer stories that will educate and inspire everyone. I also want to thank our top sponsors for the virtual BDC, Arrow, and Pure Storage. Our partnerships are so important to us and to everyone in the audience. Because together, we get things done faster and better. Now for today's keynote, you'll hear from three very important and energizing speakers. First, Colin Mahony, our SVP and General Manager for Vertica, will talk about the market trends that Vertica is betting on to win for our customers. And he'll share the exciting news about our Vertica 10 announcement and how this will benefit our customers. Then you'll hear from Amy Fowler, VP of strategy and solutions for FlashBlade at Pure Storage. Our partnership with Pure Storage is truly unique in the industry, because together modern infrastructure from Pure powers modern analytics from Vertica. And then you'll hear from John Yovanovich, Director of IT at AT&T, who will tell you about the Pure Vertica Symphony that plays live every day at AT&T. Here we go, Colin, over to you. >> Colin: Well, thanks a lot joy. And, I want to echo Joy's thanks to our sponsors, and so many of you who have helped make this happen. This is not an easy time for anyone. We were certainly looking forward to getting together in person in Boston during the Vertica Big Data Conference and Winning with Data. But I think all of you and our team have done a great job, scrambling and putting together a terrific virtual event. So really appreciate your time. I also want to remind people that we will make both the slides and the full recording available after this. So for any of those who weren't able to join live, that is still going to be available. Well, things have been pretty exciting here. And in the analytic space in general, certainly for Vertica, there's a lot happening. There are a lot of problems to solve, a lot of opportunities to make things better, and a lot of data that can really make every business stronger, more efficient, and frankly, more differentiated. For Vertica, though, we know that focusing on the challenges that we can directly address with our platform, and our people, and where we can actually make the biggest difference is where we ought to be putting our energy and our resources. I think one of the things that has made Vertica so strong over the years is our ability to focus on those areas where we can make a great difference. So for us as we look at the market, and we look at where we play, there are really three recent and some not so recent, but certainly picking up a lot of the market trends that have become critical for every industry that wants to Win Big With Data. We've heard this loud and clear from our customers and from the analysts that cover the market. 
If I were to summarize these three areas, this really is the core focus for us right now. We know that there's massive data growth. And if we can unify the data silos so that people can really take advantage of that data, we can make a huge difference. We know that public clouds offer tremendous advantages, but we also know that balance and flexibility are critical. And we all need the benefits that machine learning, everything from the basic algorithms up through full data science, can bring to every single use case, but only if it can really be operationalized at scale, accurately and in real time. And the power of Vertica is, of course, how we're able to bring so many of these things together. Let me talk a little bit more about some of these trends. So one of the first industry trends that we've all been following, probably now for over the last decade, is Hadoop and specifically HDFS. So many companies have invested time, money, and more importantly, people in leveraging the opportunity that HDFS brought to the market. HDFS is really part of a much broader storage disruption that we'll talk a little bit more about, more broadly than HDFS. But HDFS itself was really designed for petabytes of data, leveraging low cost commodity hardware and the ability to capture a wide variety of data formats, from a wide variety of data sources and applications. And I think what people really wanted was to store that data before having to define exactly what structures they should go into. So over the last decade or so, the focus for most organizations is figuring out how to capture, store and frankly manage that data. And as a platform to do that, I think, Hadoop was pretty good. It certainly changed the way that a lot of enterprises think about their data and where it's locked up. In parallel with Hadoop, particularly over the last five years, Cloud Object Storage has also given every organization another option for collecting, storing and managing even more data. That has led to a huge growth in data storage, obviously, up on public clouds like Amazon and their S3, Google Cloud Storage and Azure Blob Storage, just to name a few. And then when you consider regional and local object storage offered by cloud vendors all over the world, the explosion of that data, leveraging this type of object storage, is very real. And I think, as I mentioned, it's just part of this broader storage disruption that's been going on. But with all this growth in the data, in all these new places to put this data, every organization we talk to is facing even more challenges now around the data silos. Sure, the data silos are certainly getting bigger. And hopefully they're getting cheaper per bit. But as I said, the focus has really been on collecting, storing and managing the data. But between the new data lakes and many different cloud object storage options, combined with all sorts of data types and the complexity of managing all this, getting that business value has been very limited. This actually takes me to big bet number one for Team Vertica, which is to unify the data. Our goal, and some of the announcements we have made today plus roadmap announcements I'll share with you throughout this presentation, our goal is to ensure that all the time, money and effort that has gone into storing that data, all the data, turns into business value. So how are we going to do that?
With a unified analytics platform that analyzes the data wherever it is: HDFS, Cloud Object Storage, External tables in any format, ORC, Parquet, JSON, and of course, our own native Vertica ROS format. Analyze the data in the right place in the right format, using a single unified tool. This is something that Vertica has always been committed to, and you'll see in some of our announcements today, we're just doubling down on that commitment. Let's talk a little bit more about the public cloud. This is certainly the second trend. It's the second wave, maybe, of data disruption with object storage. And there's a lot of advantages when it comes to public cloud. There's no question that the public clouds give rapid access to compute and storage, with the added benefit of eliminating the data center maintenance that so many companies want to get out of themselves. But maybe the biggest advantage that I see is the architectural innovation. The public clouds have introduced so many methodologies around how to provision quickly, separating compute and storage and really dialing in the exact needs on demand, as you change workloads. When public clouds began, it made a lot of sense for the cloud providers and their customers to charge and pay for compute and storage in the ratio that each use case demanded. And I think you're seeing that trend proliferate all over the place, not just up in public cloud. That architecture itself is really becoming the next generation architecture for on-premise data centers as well. But there are a lot of concerns. I think we're all aware of them. They're out there; many times, for different workloads, there are higher costs. Especially for some of the workloads that are being run through analytics, which tend to run all the time. Just like some of the silo challenges that companies are facing with HDFS, data lakes and cloud storage, the public clouds have similar types of siloed challenges as well. Initially, there was a belief that they were cheaper than data centers, and when you added in all the costs, it looked that way. And again, for certain elastic workloads, that is the case. I don't think that's true across the board overall. Even to the point where a lot of the cloud vendors aren't just charging lower costs anymore. We hear from a lot of customers that they don't really want to tether themselves to any one cloud because of some of those uncertainties. Of course, security and privacy are a concern. We hear a lot of concerns with regards to cloud, and even some SaaS vendors, around shared data catalogs across all the customers and not enough separation. But security concerns are out there, you can read about them. I'm not going to jump into that bandwagon. But we hear about them. And then, of course, I think one of the things we hear the most from our customers is that each cloud stack is starting to feel even a lot more locked in than the traditional data warehouse appliance. And as everybody knows, the industry has been running away from appliances as fast as it can. And so they're not eager to get locked into another, quote unquote, virtual appliance, if you will, up in the cloud. They really want to make sure they have flexibility in which clouds they're going to today, tomorrow and in the future. And frankly, we hear from a lot of our customers that they're very interested in eventually mixing and matching compute from one cloud with, say, storage from another cloud, which I think is something that we'll hear a lot more about.
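Looping back to the unified-analytics idea at the top of this passage, here is a rough, hypothetical sketch of what "analyze the data in the right place, in the right format, with a single tool" can look like: an external table defined over Parquet files in object storage, joined against a native Vertica table, driven from Python with the vertica_python client. The table names, columns, S3 path, and connection details are assumptions, and the exact external-table options can vary by Vertica version, so treat this as an outline rather than a recipe.

```python
# Hypothetical sketch: query external Parquet data alongside a native Vertica table.
import vertica_python

conn_info = {"host": "vertica.example.com", "port": 5433,
             "user": "dbadmin", "password": "***", "database": "analytics"}

# Define an external table over Parquet files left in object storage.
# (Assumes S3 credentials/endpoint are already configured on the cluster.)
DDL = """
CREATE EXTERNAL TABLE web_clicks (
    user_id  INT,
    event_ts TIMESTAMP,
    url      VARCHAR(1024)
) AS COPY FROM 's3://example-bucket/clicks/*.parquet' PARQUET;
"""

# Join the external data with a native (ROS) table in a single query.
QUERY = """
SELECT c.user_id,
       COUNT(*)           AS clicks,
       MAX(o.order_total) AS biggest_order
FROM web_clicks AS c
JOIN orders     AS o ON o.user_id = c.user_id
GROUP BY c.user_id
ORDER BY clicks DESC
LIMIT 10;
"""

with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    cur.execute(DDL)
    cur.execute(QUERY)
    for row in cur.fetchall():
        print(row)
```

The point of the pattern is that nothing moves: the Parquet files stay where they landed, the native table stays in Vertica, and one engine plans the join across both.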
And so for us, that's why we've got our big bet number two. We love the cloud. We love the public cloud. We love the private clouds on-premise, and other hosting providers. But our passion and commitment is for Vertica to be able to run in any of the clouds that our customers choose, and make it portable across those clouds. We have supported on-premises and all public clouds for years. And today, we have announced even more support for Vertica in Eon Mode, the deployment option that leverages the separation of compute from storage, with even more deployment choices, which I'm going to also touch more on as we go. So super excited about our big bet number two. And finally, as I mentioned, for all the hype that there is around machine learning, I actually think that most importantly, this third trend that team Vertica is determined to address is the need to bring business critical analytics, machine learning, data science projects into production. For so many years, there just wasn't enough data available to justify the investment in machine learning. Also, processing power was expensive, and storage was prohibitively expensive. But to train and score and evaluate all the different models to unlock the full power of predictive analytics was tough. Today you have those massive data volumes. You have the relatively cheap processing power and storage to make that dream a reality. And if you think about this, I mean, with all the data that's available to every company, the real need is to operationalize the speed and the scale of machine learning so that these organizations can actually take advantage of it where they need to. I mean, we've seen this for years with Vertica, going back to some of the most advanced gaming companies in the early days; they were incorporating this with live data directly into their gaming experiences. Well, every organization wants to do that now. And the accuracy for clickability and real time actions are all key to separating the leaders from the rest of the pack in every industry when it comes to machine learning. But if you look at a lot of these projects, the reality is that there's a ton of buzz, there's a ton of hype spanning every acronym that you can imagine. But most companies are struggling, due to the separate teams, different tools, silos and the limitations that many platforms are facing, driving down-sampling to get a small subset of the data, to try to create a model that then doesn't apply, or compromising accuracy and making it virtually impossible to replicate models and understand decisions. And if there's one thing that we've learned when it comes to data, prescriptive data at the atomic level, being able to show "n of one" as we refer to it, meaning individually tailored data, no matter what it is, healthcare, entertainment experiences like gaming or others, being able to get at the granular data and make these decisions, make that scoring, applies to machine learning just as much as it applies to giving somebody a next-best-offer. But the opportunity has never been greater. The need to integrate this end-to-end workflow and support the right tools without compromising on that accuracy. Think about it as no downsampling, using all the data; it really is key to machine learning success. Which should be no surprise, then, why the third big bet from Vertica is one that we've actually been working on for years. And we're so proud to be where we are today, helping the data disruptors across the world operationalize machine learning.
This big bet has the potential to truly unlock, really the potential of machine learning. And today, we're announcing some very important new capabilities specifically focused on unifying the work being done by the data science community, with their preferred tools and platforms, and the volume of data and performance at scale, available in Vertica. Our strategy has been very consistent over the last several years. As I said in the beginning, we haven't deviated from our strategy. Of course, there's always things that we add. Most of the time, it's customer driven, it's based on what our customers are asking us to do. But I think we've also done a great job, not trying to be all things to all people. Especially as these hype cycles flare up around us, we absolutely love participating in these different areas without getting completely distracted. I mean, there's a variety of query tools and data warehouses and analytics platforms in the market. We all know that. There are tools and platforms that are offered by the public cloud vendors, by other vendors that support one or two specific clouds. There are appliance vendors, who I was referring to earlier who can deliver package data warehouse offerings for private data centers. And there's a ton of popular machine learning tools, languages and other kits. But Vertica is the only advanced analytic platform that can do all this, that can bring it together. We can analyze the data wherever it is, in HDFS, S3 Object Storage, or Vertica itself. Natively we support multiple clouds on-premise deployments, And maybe most importantly, we offer that choice of deployment modes to allow our customers to choose the architecture that works for them right now. It still also gives them the option to change move, evolve over time. And Vertica is the only analytics database with end-to-end machine learning that can truly operationalize ML at scale. And I know it's a mouthful. But it is not easy to do all these things. It is one of the things that highly differentiates Vertica from the rest of the pack. It is also why our customers, all of you continue to bet on us and see the value that we are delivering and we will continue to deliver. Here's a couple of examples of some of our customers who are powered by Vertica. It's the scale of data. It's the millisecond response times. Performance and scale have always been a huge part of what we have been about, not the only thing. I think the functionality all the capabilities that we add to the platform, the ease of use, the flexibility, obviously with the deployment. But if you look at some of the numbers they are under these customers on this slide. And I've shared a lot of different stories about these customers. Which, by the way, it still amaze me every time I talk to one and I get the updates, you can see the power and the difference that Vertica is making. Equally important, if you look at a lot of these customers, they are the epitome of being able to deploy Vertica in a lot of different environments. Many of the customers on this slide are not using Vertica just on-premise or just in the cloud. They're using it in a hybrid way. They're using it in multiple different clouds. And again, we've been with them on that journey throughout, which is what has made this product and frankly, our roadmap and our vision exactly what it is. It's been quite a journey. And that journey continues now with the Vertica 10 release. The Vertica 10 release is obviously a massive release for us. 
But if you look back, you can see that building on that native columnar architecture that started a long time ago, obviously, with the C-Store paper, we built it to leverage that commodity hardware, because it was an architecture that was never tightly integrated with any specific underlying infrastructure. I still remember hearing the initial pitch from Mike Stonebraker about the vision of Vertica as a software-only solution and the importance of separating the company from hardware innovation. And at the time, Mike basically said to me, "there's so much R&D and innovation that's going to happen in hardware, we shouldn't bake hardware into our solution. We should do it in software, and we'll be able to take advantage of that hardware." And that is exactly what has happened. But one of the most recent innovations that we embraced with hardware is certainly that separation of compute and storage. As I said previously, the public cloud providers offered this next generation architecture, really to ensure that they can provide the customers exactly what they needed, more compute or more storage, and charge for each, respectively. The separation of compute from storage is a major milestone in data center architectures. If you think about it, it's really not only a public cloud innovation, though. It fundamentally redefines the next generation data architecture for on-premise and for pretty much every way people are thinking about computing today. And that goes for software too. Object storage is an example of a cost effective means for storing data. And even more importantly, separating compute from storage for analytic workloads has a lot of advantages, including the opportunity to manage much more dynamic, flexible workloads, and more importantly, truly isolate those workloads from others. And by the way, once you start having something that can truly isolate workloads, then you can have the conversations around autonomic computing, around setting up some nodes, some compute resources, on the data that won't affect any of the other data, to do some things on their own, maybe some self analytics by the system, etc. A lot of things that many of you know we've already been exploring in terms of our own system data in the product. But it was May 2018, believe it or not, it seems like a long time ago, when we first announced Eon Mode, and I want to make something very clear, actually, about Eon Mode. It's a mode, it's a deployment option for Vertica customers. And I think this is another huge benefit that we don't talk about enough. But unlike a lot of vendors in the market who will ding you and charge you for every single add-on, you name it, you get this with the Vertica product. If you continue to pay support and maintenance, this comes with the upgrade. This comes as part of the new release. So any customer who owns or buys Vertica has the ability to set up either Enterprise Mode or Eon Mode, which is a question I know that comes up sometimes. Our first announcement of Eon was obviously for AWS customers, including The Trade Desk and AT&T, most of whom will be speaking here later at the Virtual Big Data Conference. They saw a huge opportunity. Eon Mode not only allowed Vertica to scale elastically with that specific compute and storage that was needed, but it really dramatically simplified database operations, including things like workload balancing, node recovery, compute provisioning, etc.
So one of the most popular functions is that ability to isolate the workloads and really allocate those resources without negatively affecting others. And even though traditional data warehouses, including Vertica Enterprise Mode, have been able to do lots of different workload isolation, it's never been as strong as Eon Mode. Well, it certainly didn't take long for our customers to see that value across the board with Eon Mode, and not just up in the cloud. In partnership with one of our most valued partners, and a platinum sponsor here, as Joy mentioned at the beginning, we announced Vertica Eon Mode for Pure Storage FlashBlade in September 2019. And again, just to be clear, this is not a new product; it's one Vertica with yet more deployment options. With Pure Storage, Vertica in Eon Mode is not limited in any way by variable cloud network latency. The performance is actually amazing when you take the benefits of separating compute from storage and you run it with a Pure environment on-premise. Vertica in Eon Mode has a super smart cache layer that we call the depot. It's a big part of our secret sauce around Eon Mode. And combined with the power and performance of Pure's FlashBlade, Vertica became the industry's first advanced analytics platform that actually separates compute and storage for on-premises data centers. Something that a lot of our customers are already benefiting from, and we're super excited about it. But as I said, this is a journey. We don't stop, we're not going to stop. Our customers need the flexibility of multiple public clouds. So today with Vertica 10, we're super proud and excited to announce support for Vertica in Eon Mode on Google Cloud. This gives our customers the ability to use their Vertica licenses on Amazon AWS, on-premise with Pure Storage, and on Google Cloud. Now, we were talking about HDFS, and a lot of our customers who have invested quite a bit in HDFS as a place, especially to store data, have been pushing us to support Eon Mode with HDFS. So as part of Vertica 10, we are also announcing support for Vertica in Eon Mode using HDFS as the communal storage. Vertica's own ROS format data can be stored in HDFS, and actually the full functionality of Vertica, its complete analytics, geospatial, pattern matching, time series, machine learning, everything that we have in there, can be applied to this data. And on the same HDFS nodes, Vertica can actually also analyze data in ORC or Parquet format, using External tables. We can also execute joins between the ROS data and the data the External tables hold, which powers a much more comprehensive view. So again, it's that flexibility to be able to support our customers wherever they need us to support them, on whatever platform they have. Vertica 10 gives us a lot more ways that we can deploy Eon Mode in various environments for our customers. It allows them to take advantage of Vertica in Eon Mode and the power that it brings with that separation, with that workload isolation, to whichever platform they are most comfortable with. Now, there's a lot that has come in Vertica 10. I'm definitely not going to be able to cover everything. But we also introduced complex types, as an example. And complex data types fit very well into Eon as well, in this separation. They significantly reduce the data pipeline and the cost of moving data between systems, provide much better support for unstructured data, which a lot of our customers have mixed with structured data, of course, and they leverage a lot of the columnar execution that Vertica provides.
So you get complex data types in Vertica now, a lot more data, stronger performance. It goes great with the announcement that we made with the broader Eon Mode. Let's talk a little bit more about machine learning. We've been actually doing work in and around machine learning, with various regressions and a whole bunch of other algorithms, for several years. We saw the huge advantage that MPP offered, not just as a SQL engine, as a database, but for ML as well. It didn't take us long to realize that there's a lot more to operationalizing machine learning than just those algorithms. It's data preparation, it's the model training. It's the scoring, the shaping, the evaluation. That is so much of what machine learning and, frankly, data science is about. You know, everybody always wants to jump to the sexy algorithm, and we handle those tasks very, very well. It makes Vertica a terrific platform to do that. A lot of work in data science and machine learning is done in other tools. I had mentioned that there's just so many tools out there. We want people to be able to take advantage of all that. We never believed we were going to be the best algorithm company or come up with the best models for people to use. So with Vertica 10, we support PMML. We can now import and export PMML models. It's a huge step for us around operationalizing machine learning projects for our customers, allowing the models to get built outside of Vertica, yet be imported in and then applied to that full scale of data with all the performance that you would expect from Vertica. We are also more tightly integrating with Python. As many of you know, we've been doing a lot of open source projects with the community, driven by many of our customers, like Uber. And so now with Python we've integrated with TensorFlow, allowing data scientists to build models in their preferred language, to take advantage of TensorFlow, but again, to store and deploy those models at scale with Vertica. I think both these announcements are proof of our big bet number three, and really our commitment to supporting innovation throughout the community by operationalizing ML with that accuracy, performance and scale of Vertica for our customers. Again, there are a lot of steps when it comes to the workflow of machine learning. These are some of them that you can see on the slide, and it's definitely not linear either. We see this as a circle. And companies that do it well just continue to learn, they continue to rescore, they continue to redeploy, and they want to operationalize all that within a single platform that can take advantage of all those capabilities. And that is the platform, with a very robust ecosystem, that Vertica has always been committed to as an organization and will continue to be. This graphic, many of you have seen it evolve over the years. Frankly, if we put everything and everyone on here, it wouldn't fit on a slide. But it will absolutely continue to evolve and grow as we support our customers where they need the support most. So, again, being able to deploy everywhere, being able to take advantage of Vertica, not just as a business analyst or a business user, but as a data scientist or as an operational or BI person. We want Vertica to be leveraged and used by the broader organization. So I think it's fair to say, and I encourage everybody to learn more about Vertica 10, because I'm just highlighting some of the bigger aspects of it. But we talked about those three market trends.
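As a brief aside before recapping those trends, here is a hypothetical sketch of the "train outside, score inside" workflow that the PMML support described above is meant to enable: a model trained elsewhere is imported into Vertica and then applied to data in place. The IMPORT_MODELS and PREDICT_PMML calls reflect my reading of the Vertica 10 machine learning functions, and the model path, table, and columns are assumptions; check the release documentation for the exact interface.

```python
# Hypothetical sketch: register an externally trained PMML model in Vertica
# and score an entire table in-database, with no data movement.
import vertica_python

conn_info = {"host": "vertica.example.com", "port": 5433,
             "user": "dbadmin", "password": "***", "database": "analytics"}

IMPORT_SQL = """
SELECT IMPORT_MODELS('/models/churn_model.pmml'
                     USING PARAMETERS category = 'PMML');
"""

SCORE_SQL = """
SELECT customer_id,
       PREDICT_PMML(tenure, monthly_spend, support_calls
                    USING PARAMETERS model_name = 'churn_model') AS churn_score
FROM customers;
"""

with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    cur.execute(IMPORT_SQL)   # register the model built outside Vertica
    cur.execute(SCORE_SQL)    # apply it at full scale, close to the data
    print(cur.fetchmany(5))
```

The same pattern runs in reverse for models trained in-database and exported as PMML for use elsewhere.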
The need to unify the silos, the need for hybrid multiple cloud deployment options, the need to operationalize business critical machine learning projects. Vertica 10 has absolutely delivered on those. But again, we are not going to stop. It is our job not to, and this is how Team Vertica thrives. I always joke that the next release is the best release. And, of course, even after Vertica 10, that is also true, although Vertica 10 is pretty awesome. But, you know, from the first line of code, we've always been focused on performance and scale, right. And like any really strong data platform, the execution engine, the optimizer and the execution engine are the two core pieces of that. Beyond Vertica 10, some of the big things that we're already working on, next generation execution engine. We're already actually seeing incredible early performance from this. And this is just one example, of how important it is for an organization like Vertica to constantly go back and re-innovate. Every single release, we do the sit ups and crunches, our performance and scale. How do we improve? And there's so many parts of the core server, there's so many parts of our broader ecosystem. We are constantly looking at coverages of how we can go back to all the code lines that we have, and make them better in the current environment. And it's not an easy thing to do when you're doing that, and you're also expanding in the environment that we are expanding into to take advantage of the different deployments, which is a great segue to this slide. Because if you think about today, we're obviously already available with Eon Mode and Amazon, AWS and Pure and actually MinIO as well. As I talked about in Vertica 10 we're adding Google and HDFS. And coming next, obviously, Microsoft Azure, Alibaba cloud. So being able to expand into more of these environments is really important for the Vertica team and how we go forward. And it's not just running in these clouds, for us, we want it to be a SaaS like experience in all these clouds. We want you to be able to deploy Vertica in 15 minutes or less on these clouds. You can also consume Vertica, in a lot of different ways, on these clouds. As an example, in Amazon Vertica by the Hour. So for us, it's not just about running, it's about taking advantage of the ecosystems that all these cloud providers offer, and really optimizing the Vertica experience as part of them. Optimization, around automation, around self service capabilities, extending our management console, we now have products that like the Vertica Advisor Tool that our Customer Success Team has created to actually use our own smarts in Vertica. To take data from customers that give it to us and help them tune automatically their environment. You can imagine that we're taking that to the next level, in a lot of different endeavors that we're doing around how Vertica as a product can actually be smarter because we all know that simplicity is key. There just aren't enough people in the world who are good at managing data and taking it to the next level. And of course, other things that we all hear about, whether it's Kubernetes and containerization. You can imagine that that probably works very well with the Eon Mode and separating compute and storage. But innovation happens everywhere. We innovate around our community documentation. Many of you have taken advantage of the Vertica Academy. The numbers there are through the roof in terms of the number of people coming in and certifying on it. 
So there's a lot of things that are within the core products. There's a lot of activity and action beyond the core products that we're taking advantage of. And let's not forget why we're here, right? It's easy to talk about a platform, a data platform, it's easy to jump into all the functionality, the analytics, the flexibility, how we can offer it. But at the end of the day, somebody, a person, she's got to take advantage of this data, she's got to be able to take this data and use this information to make a critical business decision. And that doesn't happen unless we explore lots of different and frankly, new ways to get that predictive analytics UI and interface beyond just the standard BI tools in front of her at the right time. And so there's a lot of activity, I'll tease you with that going on in this organization right now about how we can do that and deliver that for our customers. We're in a great position to be able to see exactly how this data is consumed and used and start with this core platform that we have to go out. Look, I know, the plan wasn't to do this as a virtual BDC. But I really appreciate you tuning in. Really appreciate your support. I think if there's any silver lining to us, maybe not being able to do this in person, it's the fact that the reach has actually gone significantly higher than what we would have been able to do in person in Boston. We're certainly looking forward to doing a Big Data Conference in the future. But if I could leave you with anything, know this, since that first release for Vertica, and our very first customers, we have been very consistent. We respect all the innovation around us, whether it's open source or not. We understand the market trends. We embrace those new ideas and technologies and for us true north, and the most important thing is what does our customer need to do? What problem are they trying to solve? And how do we use the advantages that we have without disrupting our customers? But knowing that you depend on us to deliver that unified analytics strategy, it will deliver that performance of scale, not only today, but tomorrow and for years to come. We've added a lot of great features to Vertica. I think we've said no to a lot of things, frankly, that we just knew we wouldn't be the best company to deliver. When we say we're going to do things we do them. Vertica 10 is a perfect example of so many of those things that we from you, our customers have heard loud and clear, and we have delivered. I am incredibly proud of this team across the board. I think the culture of Vertica, a customer first culture, jumping in to help our customers win no matter what is also something that sets us massively apart. I hear horror stories about support experiences with other organizations. And people always seem to be amazed at Team Vertica's willingness to jump in or their aptitude for certain technical capabilities or understanding the business. And I think sometimes we take that for granted. But that is the team that we have as Team Vertica. We are incredibly excited about Vertica 10. I think you're going to love the Virtual Big Data Conference this year. I encourage you to tune in. Maybe one other benefit is I know some people were worried about not being able to see different sessions because they were going to overlap with each other well now, even if you can't do it live, you'll be able to do those sessions on demand. Please enjoy the Vertica Big Data Conference here in 2020. 
Please you and your families and your co-workers be safe during these times. I know we will get through it. And analytics is probably going to help with a lot of that and we already know it is helping in many different ways. So believe in the data, believe in data's ability to change the world for the better. And thank you for your time. And with that, I am delighted to now introduce Micro Focus CEO Stephen Murdoch to the Vertica Big Data Virtual Conference. Thank you Stephen. >> Stephen: Hi, everyone, my name is Stephen Murdoch. I have the pleasure and privilege of being the Chief Executive Officer here at Micro Focus. Please let me add my welcome to the Big Data Conference. And also my thanks for your support, as we've had to pivot to this being virtual rather than a physical conference. Its amazing how quickly we all reset to a new normal. I certainly didn't expect to be addressing you from my study. Vertica is an incredibly important part of Micro Focus family. Is key to our goal of trying to enable and help customers become much more data driven across all of their IT operations. Vertica 10 is a huge step forward, we believe. It allows for multi-cloud innovation, genuinely hybrid deployments, begin to leverage machine learning properly in the enterprise, and also allows the opportunity to unify currently siloed lakes of information. We operate in a very noisy, very competitive market, and there are people, who are in that market who can do some of those things. The reason we are so excited about Vertica is we genuinely believe that we are the best at doing all of those things. And that's why we've announced publicly, you're under executing internally, incremental investment into Vertica. That investments targeted at accelerating the roadmaps that already exist. And getting that innovation into your hands faster. This idea is speed is key. It's not a question of if companies have to become data driven organizations, it's a question of when. So that speed now is really important. And that's why we believe that the Big Data Conference gives a great opportunity for you to accelerate your own plans. You will have the opportunity to talk to some of our best architects, some of the best development brains that we have. But more importantly, you'll also get to hear from some of our phenomenal Roth Data customers. You'll hear from Uber, from the Trade Desk, from Philips, and from AT&T, as well as many many others. And just hearing how those customers are using the power of Vertica to accelerate their own, I think is the highlight. And I encourage you to use this opportunity to its full. Let me close by, again saying thank you, we genuinely hope that you get as much from this virtual conference as you could have from a physical conference. And we look forward to your engagement, and we look forward to hearing your feedback. With that, thank you very much. >> Joy: Thank you so much, Stephen, for joining us for the Vertica Big Data Conference. Your support and enthusiasm for Vertica is so clear, and it makes a big difference. Now, I'm delighted to introduce Amy Fowler, the VP of Strategy and Solutions for FlashBlade at Pure Storage, who was one of our BDC Platinum Sponsors, and one of our most valued partners. It was a proud moment for me, when we announced Vertica in Eon mode for Pure Storage FlashBlade and we became the first analytics data warehouse that separates compute from storage for on-premise data centers. Thank you so much, Amy, for joining us. Let's get started. 
>> Amy: Well, thank you, Joy so much for having us. And thank you all for joining us today, virtually, as we may all be. So, as we just heard from Colin Mahony, there are some really interesting trends that are happening right now in the big data analytics market. From the end of the Hadoop hype cycle, to the new cloud reality, and even the opportunity to help the many data science and machine learning projects move from labs to production. So let's talk about these trends in the context of infrastructure. And in particular, look at why a modern storage platform is relevant as organizations take on the challenges and opportunities associated with these trends. The answer is the Hadoop hype cycles left a lot of data in HDFS data lakes, or reservoirs or swamps depending upon the level of the data hygiene. But without the ability to get the value that was promised from Hadoop as a platform rather than a distributed file store. And when we combine that data with the massive volume of data in Cloud Object Storage, we find ourselves with a lot of data and a lot of silos, but without a way to unify that data and find value in it. Now when you look at the infrastructure data lakes are traditionally built on, it is often direct attached storage or data. The approach that Hadoop took when it entered the market was primarily bound by the limits of networking and storage technologies. One gig ethernet and slower spinning disk. But today, those barriers do not exist. And all FlashStorage has fundamentally transformed how data is accessed, managed and leveraged. The need for local data storage for significant volumes of data has been largely mitigated by the performance increases afforded by all Flash. At the same time, organizations can achieve superior economies of scale with that segregation of compute and storage. With compute and storage, you don't always scale in lockstep. Would you want to add an engine to the train every time you add another boxcar? Probably not. But from a Pure Storage perspective, FlashBlade is uniquely architected to allow customers to achieve better resource utilization for compute and storage, while at the same time, reducing complexity that has arisen from the siloed nature of the original big data solutions. The second and equally important recent trend we see is something I'll call cloud reality. The public clouds made a lot of promises and some of those promises were delivered. But cloud economics, especially usage based and elastic scaling, without the control that many companies need to manage the financial impact is causing a lot of issues. In addition, the risk of vendor lock-in from data egress, charges, to integrated software stacks that can't be moved or deployed on-premise is causing a lot of organizations to back off the all the way non-cloud strategy, and move toward hybrid deployments. Which is kind of funny in a way because it wasn't that long ago that there was a lot of talk about no more data centers. And for example, one large retailer, I won't name them, but I'll admit they are my favorites. They several years ago told us they were completely done with on-prem storage infrastructure, because they were going 100% to the cloud. But they just deployed FlashBlade for their data pipelines, because they need predictable performance at scale. And the all cloud TCO just didn't add up. Now, that being said, well, there are certainly challenges with the public cloud. It has also brought some things to the table that we see most organizations wanting. 
First of all, in a lot of cases applications have been built to leverage object storage platforms like S3. So they need that object protocol, but they may also need it to be fast. And the said object may be oxymoron only a few years ago, and this is an area of the market where Pure and FlashBlade have really taken a leadership position. Second, regardless of where the data is physically stored, organizations want the best elements of a cloud experience. And for us, that means two main things. Number one is simplicity and ease of use. If you need a bunch of storage experts to run the system, that should be considered a bug. The other big one is the consumption model. The ability to pay for what you need when you need it, and seamlessly grow your environment over time totally nondestructively. This is actually pretty huge and something that a lot of vendors try to solve for with finance programs. But no finance program can address the pain of a forklift upgrade, when you need to move to next gen hardware. To scale nondestructively over long periods of time, five to 10 years plus is a crucial architectural decisions need to be made at the outset. Plus, you need the ability to pay as you use it. And we offer something for FlashBlade called Pure as a Service, which delivers exactly that. The third cloud characteristic that many organizations want is the option for hybrid. Even if that is just a DR site in the cloud. In our case, that means supporting appplication of S3, at the AWS. And the final trend, which to me represents the biggest opportunity for all of us, is the need to help the many data science and machine learning projects move from labs to production. This means bringing all the machine learning functions and model training to the data, rather than moving samples or segments of data to separate platforms. As we all know, machine learning needs a ton of data for accuracy. And there is just too much data to retrieve from the cloud for every training job. At the same time, predictive analytics without accuracy is not going to deliver the business advantage that everyone is seeking. You can kind of visualize data analytics as it is traditionally deployed as being on a continuum. With that thing, we've been doing the longest, data warehousing on one end, and AI on the other end. But the way this manifests in most environments is a series of silos that get built up. So data is duplicated across all kinds of bespoke analytics and AI, environments and infrastructure. This creates an expensive and complex environment. So historically, there was no other way to do it because some level of performance is always table stakes. And each of these parts of the data pipeline has a different workload profile. A single platform to deliver on the multi dimensional performances, diverse set of applications required, that didn't exist three years ago. And that's why the application vendors pointed you towards bespoke things like DAS environments that we talked about earlier. And the fact that better options exists today is why we're seeing them move towards supporting this disaggregation of compute and storage. And when it comes to a platform that is a better option, one with a modern architecture that can address the diverse performance requirements of this continuum, and allow organizations to bring a model to the data instead of creating separate silos. That's exactly what FlashBlade is built for. Small files, large files, high throughput, low latency and scale to petabytes in a single namespace. 
And this is importantly a single rapid space is what we're focused on delivering for our customers. At Pure, we talk about it in the context of modern data experience because at the end of the day, that's what it's really all about. The experience for your teams in your organization. And together Pure Storage and Vertica have delivered that experience to a wide range of customers. From a SaaS analytics company, which uses Vertica on FlashBlade to authenticate the quality of digital media in real time, to a multinational car company, which uses Vertica on FlashBlade to make thousands of decisions per second for autonomous cars, or a healthcare organization, which uses Vertica on FlashBlade to enable healthcare providers to make real time decisions that impact lives. And I'm sure you're all looking forward to hearing from John Yavanovich from AT&T. To hear how he's been doing this with Vertica and FlashBlade as well. He's coming up soon. We have been really excited to build this partnership with Vertica. And we're proud to provide the only on-premise storage platform validated with Vertica Eon Mode. And deliver this modern data experience to our customers together. Thank you all so much for joining us today. >> Joy: Amy, thank you so much for your time and your insights. Modern infrastructure is key to modern analytics, especially as organizations leverage next generation data center architectures, and object storage for their on-premise data centers. Now, I'm delighted to introduce our last speaker in our Vertica Big Data Conference Keynote, John Yovanovich, Director of IT for AT&T. Vertica is so proud to serve AT&T, and especially proud of the harmonious impact we are having in partnership with Pure Storage. John, welcome to the Virtual Vertica BDC. >> John: Thank you joy. It's a pleasure to be here. And I'm excited to go through this presentation today. And in a unique fashion today 'cause as I was thinking through how I wanted to present the partnership that we have formed together between Pure Storage, Vertica and AT&T, I want to emphasize how well we all work together and how these three components have really driven home, my desire for a harmonious to use your word relationship. So, I'm going to move forward here and with. So here, what I'm going to do the theme of today's presentation is the Pure Vertica Symphony live at AT&T. And if anybody is a Westworld fan, you can appreciate the sheet music on the right hand side. What we're going to what I'm going to highlight here is in a musical fashion, is how we at AT&T leverage these technologies to save money to deliver a more efficient platform, and to actually just to make our customers happier overall. So as we look back, and back as early as just maybe a few years ago here at AT&T, I realized that we had many musicians to help the company. Or maybe you might want to call them data scientists, or data analysts. For the theme we'll stay with musicians. None of them were singing or playing from the same hymn book or sheet music. And so what we had was many organizations chasing a similar dream, but not exactly the same dream. And, best way to describe that is and I think with a lot of people this might resonate in your organizations. How many organizations are chasing a customer 360 view in your company? Well, I can tell you that I have at least four in my company. And I'm sure there are many that I don't know of. That is our problem because what we see is a repetitive sourcing of data. We see a repetitive copying of data. 
And there's just so much money to be spent. This is where I asked Pure Storage and Vertica to help me solve that problem with their technologies. What I also noticed was that there was no coordination between these departments. In fact, if you look here, nobody really wants to play with finance. Sales, marketing and care, sure that you all copied each other's data. But they actually didn't communicate with each other as they were copying the data. So the data became replicated and out of sync. This is a challenge throughout, not just my company, but all companies across the world. And that is, the more we replicate the data, the more problems we have at chasing or conquering the goal of single version of truth. In fact, I kid that I think that AT&T, we actually have adopted the multiple versions of truth, techno theory, which is not where we want to be, but this is where we are. But we are conquering that with the synergies between Pure Storage and Vertica. This is what it leaves us with. And this is where we are challenged and that if each one of our siloed business units had their own stories, their own dedicated stories, and some of them had more money than others so they bought more storage. Some of them anticipating storing more data, and then they really did. Others are running out of space, but can't put anymore because their bodies aren't been replenished. So if you look at it from this side view here, we have a limited amount of compute or fixed compute dedicated to each one of these silos. And that's because of the, wanting to own your own. And the other part is that you are limited or wasting space, depending on where you are in the organization. So there were the synergies aren't just about the data, but actually the compute and the storage. And I wanted to tackle that challenge as well. So I was tackling the data. I was tackling the storage, and I was tackling the compute all at the same time. So my ask across the company was can we just please play together okay. And to do that, I knew that I wasn't going to tackle this by getting everybody in the same room and getting them to agree that we needed one account table, because they will argue about whose account table is the best account table. But I knew that if I brought the account tables together, they would soon see that they had so much redundancy that I can now start retiring data sources. I also knew that if I brought all the compute together, that they would all be happy. But I didn't want them to tackle across tackle each other. And in fact that was one of the things that all business units really enjoy. Is they enjoy the silo of having their own compute, and more or less being able to control their own destiny. Well, Vertica's subclustering allows just that. And this is exactly what I was hoping for, and I'm glad they've brought through. And finally, how did I solve the problem of the single account table? Well when you don't have dedicated storage, and you can separate compute and storage as Vertica in Eon Mode does. And we store the data on FlashBlades, which you see on the left and right hand side, of our container, which I can describe in a moment. Okay, so what we have here, is we have a container full of compute with all the Vertica nodes sitting in the middle. Two loader, we'll call them loader subclusters, sitting on the sides, which are dedicated to just putting data onto the FlashBlades, which is sitting on both ends of the container. 
Now today, I have two dedicated storage or common dedicated might not be the right word, but two storage racks one on the left one on the right. And I treat them as separate storage racks. They could be one, but i created them separately for disaster recovery purposes, lashing work in case that rack were to go down. But that being said, there's no reason why I'm probably going to add a couple of them here in the future. So I can just have a, say five to 10, petabyte storage, setup, and I'll have my DR in another 'cause the DR shouldn't be in the same container. Okay, but I'll DR outside of this container. So I got them all together, I leveraged subclustering, I leveraged separate and compute. I was able to convince many of my clients that they didn't need their own account table, that they were better off having one. I eliminated, I reduced latency, I reduced our ticketing I reduce our data quality issues AKA ticketing okay. I was able to expand. What is this? As work. I was able to leverage elasticity within this cluster. As you can see, there are racks and racks of compute. We set up what we'll call the fixed capacity that each of the business units needed. And then I'm able to ramp up and release the compute that's necessary for each one of my clients based on their workloads throughout the day. And so while they compute to the right before you see that the instruments have already like, more or less, dedicated themselves towards all those are free for anybody to use. So in essence, what I have, is I have a concert hall with a lot of seats available. So if I want to run a 10 chair Symphony or 80, chairs, Symphony, I'm able to do that. And all the while, I can also do the same with my loader nodes. I can expand my loader nodes, to actually have their own Symphony or write all to themselves and not compete with any other workloads of the other clusters. What does that change for our organization? Well, it really changes the way our database administrators actually do their jobs. This has been a big transformation for them. They have actually become data conductors. Maybe you might even call them composers, which is interesting, because what I've asked them to do is morph into less technology and more workload analysis. And in doing so we're able to write auto-detect scripts, that watch the queues, watch the workloads so that we can help ramp up and trim down the cluster and subclusters as necessary. There has been an exciting transformation for our DBAs, who I need to now classify as something maybe like DCAs. I don't know, I have to work with HR on that. But I think it's an exciting future for their careers. And if we bring it all together, If we bring it all together, and then our clusters, start looking like this. Where everything is moving in harmonious, we have lots of seats open for extra musicians. And we are able to emulate a cloud experience on-prem. And so, I want you to sit back and enjoy the Pure Vertica Symphony live at AT&T. (soft music) >> Joy: Thank you so much, John, for an informative and very creative look at the benefits that AT&T is getting from its Pure Vertica symphony. I do really like the idea of engaging HR to change the title to Data Conductor. That's fantastic. I've always believed that music brings people together. And now it's clear that analytics at AT&T is part of that musical advantage. So, now it's time for a short break. And we'll be back for our breakout sessions, beginning at 12 pm Eastern Daylight Time. 
We have some really exciting sessions planned later today. And then again, as you can see on Wednesday. Now because all of you are already logged in and listening to this keynote, you already know the steps to continue to participate in the sessions that are listed here and on the previous slide. In addition, everyone received an email yesterday, today, and you'll get another one tomorrow, outlining the simple steps to register, login and choose your session. If you have any questions, check out the emails or go to www.vertica.com/bdc2020 for the logistics information. There are a lot of choices and that's always a good thing. Don't worry if you want to attend one or more or can't listen to these live sessions due to your timezone. All the sessions, including the Q&A sections will be available on demand and everyone will have access to the recordings as well as even more pre-recorded sessions that we'll post to the BDC website. Now I do want to leave you with two other important sites. First, our Vertica Academy. Vertica Academy is available to everyone. And there's a variety of very technical, self-paced, on-demand training, virtual instructor-led workshops, and Vertica Essentials Certification. And it's all free. Because we believe that Vertica expertise, helps everyone accelerate their Vertica projects and the advantage that those projects deliver. Now, if you have questions or want to engage with our Vertica engineering team now, we're waiting for you on the Vertica forum. We'll answer any questions or discuss any ideas that you might have. Thank you again for joining the Vertica Big Data Conference Keynote Session. Enjoy the rest of the BDC because there's a lot more to come
SUMMARY :
And he'll share the exciting news And that is the platform, with a very robust ecosystem some of the best development brains that we have. the VP of Strategy and Solutions is causing a lot of organizations to back off the and especially proud of the harmonious impact And that is, the more we replicate the data, Enjoy the rest of the BDC because there's a lot more to come
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Stephen | PERSON | 0.99+ |
Amy Fowler | PERSON | 0.99+ |
Mike | PERSON | 0.99+ |
John Yavanovich | PERSON | 0.99+ |
Amy | PERSON | 0.99+ |
Colin Mahony | PERSON | 0.99+ |
AT&T | ORGANIZATION | 0.99+ |
Boston | LOCATION | 0.99+ |
John Yovanovich | PERSON | 0.99+ |
Vertica | ORGANIZATION | 0.99+ |
Joy King | PERSON | 0.99+ |
Mike Stonebreaker | PERSON | 0.99+ |
John | PERSON | 0.99+ |
May 2018 | DATE | 0.99+ |
100% | QUANTITY | 0.99+ |
Wednesday | DATE | 0.99+ |
Colin | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Vertica Academy | ORGANIZATION | 0.99+ |
five | QUANTITY | 0.99+ |
Joy | PERSON | 0.99+ |
2020 | DATE | 0.99+ |
two | QUANTITY | 0.99+ |
Uber | ORGANIZATION | 0.99+ |
Stephen Murdoch | PERSON | 0.99+ |
Vertica 10 | TITLE | 0.99+ |
Pure Storage | ORGANIZATION | 0.99+ |
one | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
Philips | ORGANIZATION | 0.99+ |
tomorrow | DATE | 0.99+ |
AT&T. | ORGANIZATION | 0.99+ |
September 2019 | DATE | 0.99+ |
Python | TITLE | 0.99+ |
www.vertica.com/bdc2020 | OTHER | 0.99+ |
One gig | QUANTITY | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Second | QUANTITY | 0.99+ |
First | QUANTITY | 0.99+ |
15 minutes | QUANTITY | 0.99+ |
yesterday | DATE | 0.99+ |
TK Keanini, Cisco | Cisco Live EU 2019
>> Live from Barcelona, Spain. It's the cue covering Sisqo. Live Europe. Brought to you by Cisco and its ecosystem partners. >> Welcome back to sunny Barcelona. Everybody watching the Cube, the leader and live tech coverage. We go out to the events, we extract the signal from the noise we hear There's our third day of coverage that Sisqo live. Barcelona David Lot. John Furrier. This here stew Minutemen all week. John, we've been covering this show. Walter Wall like a canon ae is here is a distinguished engineer and product line. CTO for Cisco Analytics. Welcome to the Cube. You see you again. Welcome back to the Cube. I should say thank you very much. So tell us about your role. You're focused right now on malware encryption. We want to get into that, but but set it up with your roll >> first. Well, I'm trying to raise the cost to the bad guy's hiding in your network. I mean, basically it's it. It it's an economics thing because one there's a lot of places for them to hide. And and they they are innovating just as much as we are. And so if I can make it more expensive for them to hide and operate. Then I'm doing my job. And and that means not only using techniques of the past but developing new techniques. You know, Like I said, it's It's really unlike a regular job. I'm not waiting for the hard drive to fail or a power supply to fail. I have an active adversary that's smart and well funded. So if I if I shipped some innovation, I forced them to innovate and vice versa. >> So you're trying to reduce their our ally and incentives. >> I want to make it too expensive for them to do business. >> So what's the strategy there? Because it's an arms race. Obviously wanted one one. You know, Whitehead over a black hat, kind of continue to do that. Is it decentralized to create more segments? What is the current strategies that you see to make it more complex or less economically viable to just throw resource at a port or whatever? >> There's sort of two dimensions that are driving change one. You know they're trying to make a buck. Okay? And and, you know, we saw the ransomware stuff we saw, you know, things that they did to extract money from a victim. Their latest thing now is they've They've realized that Ransomware wasn't a recurring revenue stream for them. Right? And so what's called crypto jacking is so they essentially have taking the cost structure out of doing crypto mining. You know, when you do crypto mining, you'll make a nickel, maybe ten cents, maybe even twenty cents a day. Just doing this. Mathematical mining, solving these puzzles. And if you had to do that on your own computer, you'd suck up all this electricity and thing. You'd have some cost structure, right and less of a margin. But if you go on, you know, breach a thousand computers, maybe ten thousand, maybe one hundred thousand. Guess what, right you? Not one you're hiding. So guess what? Today you make a nickel tomorrow, you make another nickel. So, you know, if you if you go to the threat wall here, you'd be surprised this crypto mining activity taking place here and nobody knows about it. We have it up on the threat wall because we can detect its behavior. We can't see the actual payload because all encrypted. But we have techniques now. Advanced Analytics by which we can now call out its unique behaviour very distinctly. >> Okay, so you're attacking this problem with with data and analytics. Is that right? What? One of the ingredients of your defense? >> Yeah. I mean, they're sort of Ah, three layer cake There. 
You first. You have? You know, I always say all telemetry is data, but not all data. Is telemetry. All right? So when you when you go about looking at an observation or domain, you know, Inhumans, we have sight. We have hearing these air just like the network or the endpoint. And there's there's telemetry coming out of that, hopefully from the network itself. Okay, because it's the most pervasive. And so you have this dilemma tree telling you something about the good guys and the bad guys and you, you perform synthesis and analytics, and then you have an analytical outcome. So that's sort of the three layer cake is telemetry, analytics, analytical outcome. And what matters to you and me is really the outcome, right? In this case, detecting malicious activity without doing decryption. >> You mentioned observation. Love this. We've been talking to Cuba in the past about observation space. Having an observation base is critical because you know, people don't write bomb on a manifest and ship it. They they hide it's it's hidden in the network, even their high, but also the meta data. You have to kind extract that out. That's kind of where you get into the analytics. How does that observation space gets set up? Happened? Someone creating observation special? They sharing the space with a public private? This becomes kind of almost Internet infrastructure. Sound familiar? Network opportunity? >> Yeah. You know, there's just three other. The other driver of change is just infrastructure is changing. Okay. You mean the past? Go back. Go back twenty years, you had to rent some real estate. You gotto put up some rocks, some air conditioning, and you were running on raw iron. Then the hyper visors came. Okay, well, I need another observation. A ll. You know, I meet eyes and ears on this hyper visor you got urbanity is now you've got hybrid Cloud. You have even serve Ellis computing, right? These are all things I need eyes and ears. Now, there that traditional methods don't don't get me there so again, being able to respect the fact that there are multiple environments that my digital business thrives on. And it's not just the traditional stuff, you know, there's there's the new stuff that we need to invent ways by which to get the dilemma tree and get the analytical >> talkabout this dynamic because we're seeing this. I think we're just both talking before we came on camera way all got our kind of CS degrees in the eighties. But if you look at the decomposition of building blocks with a P, I's and clouds, it's now a lot of moving to spare it parts for good reasons, but also now, to your point, about having eyes and ears on these components. They're all from different vendors, different clouds. Multi cloud creates Mohr opportunities. But yet more complexity. Software abstractions will help manage that. Now you have almost like an operating system concept around it. How are you guys looking at this? I'll see the intent based networking and hyper flex anywhere. You seeing that vision of data being critical, observation space, etcetera. But if you think about holistically, the network is the computer. Scott McNealy once said. Yeah, I mean, last week, when we are this is actually happening. So it's not just cloud a or cloud be anon premise and EJ, it's the totality of the system. This is what's happening >> ways. It's it's absolutely a reality. And and and the sooner you embrace that, the better. Because when the bad guys embrace it verse, You have problems, right? 
And and you look at even how they you know how they scale techniques. They use their cloud first, okay, that, you know their innovative buns. And when you look at a cloud, you know, we mentioned the eyes and ears right in the past. You had eyes and ears on a body you own. You're trying to put eyes in here on a body you don't own anymore. This's public cloud, right? So again, the reality is somebody you know. These businesses are somewhere on the journey, right? And the journey goes traditional hyper visor. You have then ultimately hybrid multi clouds. >> So the cost issue comes back. The play of everything sass and cloud. It's just You start a company in the cloud versus standing up here on the check, we see the start of wave from a state sponsored terrorist organization. It's easy for me to start a threat. So this lowers the cost actually threat. So that lowers the IQ you needed to be a hacker. So making it harder also helps that this is kind of where you're going. Explain this dynamic because it's easy to start threats, throw, throw some code at something. I could be in a bedroom anywhere in the world. Or I could be a group that gets free, open source tools sent to me by a state and act on behalf of China. Russia, >> Of course, of course, you know, software, software, infrastructures, infrastructure, right? It's It's the same for the bad guys, the good guys. That's sort of the good news and the bad news. And you look at the way they scale, you know, techniques. They used to stay private saying, You know, all of these things are are valid, no matter what side of the line you sit on, right? Math is still math. And again, you know, I just have Ah, maybe a fascination for how quickly they innovate, How quickly they ship code, how quickly they scale. You know, these botnets are massive, right? If you could get about that, you're looking at a very cloud infrastructure system that expands and contracts. >> So let's let's talk a little more about scale. You got way more good guys on the network than bad guys get you. First of all, most trying to do good and you need more good guys to fight the bad guys up, do things. Those things like infrastructure is code dev ops. Does that help the good guys scale? And and how so? >> You know it does. There's a air. You familiar with the concept called The Loop Joe? It was It was invented by a gentleman, Colonel John Boyd, and he was a jet fighter pilot. Need taught other jet fighter pilots tactics, and he invented this thing called Guadalupe and it's it's o d a observe orient decide. And at all right. And the quicker you can spin your doodle ooh, the more disoriented your adversary ISS. And so speed speed matters. Okay. And so if you can observe Orient, decide, act faster, then your adversary, you created almost a knowledge margin by which they're disoriented. And and the speed of Dev ops has really brought this two defenders. They can essentially push code and reorient themselves in a cycle that's frankly too small of a window for the adversary to even get their bearings right. And so speed doesn't matter. And this >> changing the conditions of the test, if you will. How far the environment, of course, on a rabbit is a strategy whether it's segmenting networks, making things harder to get at. So in a way, complexity is better for security because it's more complex. It costs more to penetrate complex to whom to the adversary of the machine, trying very central data base. Second, just hack in, get all the jewels >> leave. That's right, >> that's right. 
And and again. You know, I think that all of this new technology and and as you mentioned new processes around these technologies, I think it's it's really changing the game. The things that are very deterministic, very static, very slow moving those things. They're just become easy targets. Low cost targets. If you will >> talk about the innovation that you guys are doing around the encryption detecting malware over encrypted traffic. Yeah, the average person Oh, encrypted traffic is totally secure. But you guys have a method to figure out Mel, where behavior over encrypted, which means the payload can't be penetrated or it's not penetrated. So you write full. We don't know what's in there but through and network trav explain what you're working on. >> Yeah. The paradox begins with the fact that everybody's using networks now. Everything, even your thermostat. You're probably your tea kettle is crossing a network somewhere. And and in that reality, that transmission should be secure. So the good news is, I no longer have to complain as much about looking at somebody's business and saying, Why would you operate in the clear? Okay, now I say, Oh, my God, you're business is about ninety percent dot Okay, when I talked about technology working well for everyone, it works just as well for the bad guys. So I'm not going to tell this this business start operating in the clear anymore, so I can expect for malicious activity. No, we have to now in for malicious activity from behavior. Because the inspection, the direct inspection is no longer available. So that we came up with a technique called encrypted Traffic analytics. And again, we could have done it just in a product. But what we did that was clever was we went to the Enterprise networking group and said, if I could get of new telemetry, I can give you this analytical outcome. Okay? That'll allow us to detect malicious activity without doing decryption. And so the network as a sensor, the routers and switches, all of those things are sending me this. Richard, it's Tellem aji, by which I can infer this malicious activity without doing any secret. >> So payload and network are too separate things contractually because you don't need look at the payload network. >> Yeah. I mean, if you want to think about it this way, all encrypted traffic starts out unencrypted. Okay, It's a very small percentage, but everything in that start up is visible. So we have the routers and switches are sending us that metadata. Then we do something clever. I call it Instead of having direct observation, I need an observational derivative. Okay, I need to see its shape and size over time. So at minute five minute, fifteen minute thirty, I can see it's timing, and I can model on that timing. And this is where machine learning comes in because it's It's a science. That's just it's day has come for behavioral science, so I could train on all this data and say, If this malware looks like this at minute, five minute, ten minute fifteen, then if I see that exact behavior mathematically precise behaviour on your network, I can infer that's the same Mallory >> Okay, And your ability you mentioned just you don't have to decrypt that's that gives you more protection. Obviously, you're not exposed, but also presumably better performance. Is that right, or is that not affected? >> A lot? A lot better performance. The cryptographic protocols themselves are becoming more and more opaque. T L s, which is one of the protocols used to encrypt all of the Web traffic. 
For instance, they just went through a massive revision from one dot two two version one not three. It is faster, It is stronger. It's just better. But there's less visible fields now in the hitter. So you know things that there's a term being thrown around called Dark Data, and it's getting darker for everyone. >> So, looking at the envelope, looking at the network of fact, this is the key thing. Value. The network is now more important than ever explain why? Well, >> it connects everything right, and there's more things getting connected. And so, as you build, you know you can reach more customers. You can You can operate more efficiently, efficiently. You can. You can bring down your operational costs. There's so many so many benefit. >> FBI's also add more connection points as well. Integration. It's Metcalfe's law within a third dimension That dimension data value >> conductivity. I mean, the message itself is growing exponentially. Right? So that's just incredibly exciting. >> Super awesome topic. Looking forward to continuing this conversation. Great. Great. Come. Super important, cool and relevant and more impactful. A lot more action happening. Okay, Thanks for sharing that. Great. It's so great to have you on a keeper. Right, everybody, we'll be back to wrap Day three. Francisco live Barcelona. You're watching the Cube. Stay right there.
SUMMARY :
Brought to you by Cisco and its ecosystem partners. You see you again. the hard drive to fail or a power supply to fail. What is the current strategies that you see to make it more complex or less And if you had to do that on your own computer, One of the ingredients of your defense? And so you have this dilemma tree telling you something about the good guys and the bad guys That's kind of where you get into the analytics. And it's not just the traditional stuff, you know, there's there's the new stuff that we need to invent But if you look at the decomposition of building blocks with a P, And and you look at even how they you So that lowers the IQ you needed to be a And you look at the way they scale, you know, techniques. First of all, most trying to do good and you need more good guys to fight And so if you changing the conditions of the test, if you will. That's right, and as you mentioned new processes around these technologies, I think it's it's really talk about the innovation that you guys are doing around the encryption detecting malware over So the good news is, I no longer have to complain as much about So payload and network are too separate things contractually because you don't I can infer that's the same Mallory Okay, And your ability you mentioned just you don't have to decrypt that's that gives you more protection. So you know things that there's a term being thrown around called Dark So, looking at the envelope, looking at the network of fact, this is the key thing. as you build, you know you can reach more customers. It's Metcalfe's law within a I mean, the message itself is growing exponentially. It's so great to have you on a keeper.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Richard | PERSON | 0.99+ |
Cisco | ORGANIZATION | 0.99+ |
ten thousand | QUANTITY | 0.99+ |
John | PERSON | 0.99+ |
Walter Wall | PERSON | 0.99+ |
ten cents | QUANTITY | 0.99+ |
Scott McNealy | PERSON | 0.99+ |
one hundred thousand | QUANTITY | 0.99+ |
TK Keanini | PERSON | 0.99+ |
John Furrier | PERSON | 0.99+ |
Today | DATE | 0.99+ |
ten minute | QUANTITY | 0.99+ |
tomorrow | DATE | 0.99+ |
FBI | ORGANIZATION | 0.99+ |
last week | DATE | 0.99+ |
David Lot | PERSON | 0.99+ |
five minute | QUANTITY | 0.99+ |
two defenders | QUANTITY | 0.99+ |
third day | QUANTITY | 0.99+ |
Colonel | PERSON | 0.99+ |
Barcelona, Spain | LOCATION | 0.99+ |
Second | QUANTITY | 0.99+ |
two dimensions | QUANTITY | 0.98+ |
one | QUANTITY | 0.98+ |
Cuba | LOCATION | 0.98+ |
Day three | QUANTITY | 0.98+ |
both | QUANTITY | 0.98+ |
One | QUANTITY | 0.98+ |
twenty cents a day | QUANTITY | 0.97+ |
three | QUANTITY | 0.97+ |
Europe | LOCATION | 0.97+ |
Barcelona | LOCATION | 0.97+ |
Metcalfe | PERSON | 0.97+ |
first | QUANTITY | 0.97+ |
eighties | DATE | 0.96+ |
about ninety percent | QUANTITY | 0.96+ |
Cisco Analytics | ORGANIZATION | 0.95+ |
a thousand computers | QUANTITY | 0.94+ |
twenty years | QUANTITY | 0.93+ |
fifteen | QUANTITY | 0.92+ |
First | QUANTITY | 0.88+ |
Cisco | EVENT | 0.88+ |
Cube | TITLE | 0.85+ |
Ellis | ORGANIZATION | 0.85+ |
Sisqo | TITLE | 0.83+ |
third dimension | QUANTITY | 0.8+ |
Whitehead | ORGANIZATION | 0.8+ |
Advanced Analytics | TITLE | 0.79+ |
fifteen minute thirty | QUANTITY | 0.76+ |
three layer | QUANTITY | 0.73+ |
John Boyd | PERSON | 0.71+ |
two | QUANTITY | 0.71+ |
Tellem aji | PERSON | 0.71+ |
ransomware | ORGANIZATION | 0.69+ |
Russia | ORGANIZATION | 0.67+ |
two version | QUANTITY | 0.67+ |
Guadalupe | PERSON | 0.66+ |
Sisqo | PERSON | 0.65+ |
China | ORGANIZATION | 0.64+ |
canon | ORGANIZATION | 0.63+ |
Ransomware | ORGANIZATION | 0.54+ |
2019 | DATE | 0.54+ |
Loop | TITLE | 0.49+ |
EU | EVENT | 0.47+ |
Joe | OTHER | 0.43+ |
ingredients | QUANTITY | 0.43+ |
Cube | COMMERCIAL_ITEM | 0.38+ |
Francisco | TITLE | 0.28+ |
Neil Mendelson, Oracle – CUBEConversation - #theCUBE
(dynamic music) >> Hi, I'm Peter Burris, welcome to The Cube. We're having a conversation with Oracle about how to create business value out of data. This is the second segment that we're going to be looking at, the first segment focused on what is data value and how does a business think about generating value with data. This section is going to focus more on what path do you follow, what journey do you take to actually achieve the outcomes, that you want, using data to generate overall better business working with customers, with operators, whatever else it might be. Now, to have that conversation, we've got Neil Mendelson with us today >> Thank you. >> Neil is the Vice President of Big Data and Advanced Analytics at Oracle, welcome to The Cube. >> Thank you, good to be here. >> So Neil, in the first segment that we talked about the idea of what is data value, how do we think about data capital, how we think about how business uses data capital to generate business. Now we're going to get practical and talk about the journey. Once the business is thinking about using data differently, to differentiate itself it then has to undertake a journey to do something. So, what's the first step that a business has to think about as it starts to conceive of the role that data's going to play in their business differently. >> Well, I think you correctly tagged it as being a journey and starting with the business. I think part of where sometimes this goes awry is when we start with it technology first, right? It really begins with the business, right? So, we're starting really with business analysts, people within the line of business. Now, what we're looking for is things that we can actually measure, right, things that we can measure and quantify that drive a real value to the business. >> But, those things are specific outcomes that have a consequence to the business, right? >> They are specific, right? So, it's not like, "Oh, we're interested in "improving our overall business." That's not specific enough, right? You can't give a data scientist the charter to go build an algorithm for improving the overall business, right? It's got to be much more specific. So, let's say we're going to pick something like churn, right, even down to churn to a particular segment, right? So, you want a specific measurable outcome and then you want to be able to understand which executive in the business actually owns that outcome. Because if you can't find the executive that owns the outcome then it may never really matter, right? >> Now that's all the business analyst's job is to try to make sure that the question is being framed properly and that the right people are participating in the process of answering that question. Have I got that right? >> Correct and that the outcome that you hope to achieve is material enough to make a difference in the business and that th key executive that's responsible for that cares and knows about the endeavor that you're embarking upon. >> So, a material outcome that's not so abstract like the business, but also not so pedantic as change the air filter on time that is then, has clear measures associated with it where you can test whether or not you have achieved the outcome and an executive identified that ultimately has responsibility for improving those measure in the business. >> Exactly. >> Okay. What's the next step? >> So, the next step on the journey is to look at how you can pull together the data necessary to begin to answer that question. 
So, that brings in the data engineer, we used to call them data wranglers. And, you're beginning to look at what kind of data, right, can I obtain from the inside of the business or outside that is material to answering that question. Now, sometimes what happens is that you end up finding out that we're not capturing that key information. And, you've got to go back to the business analyst again and say, "Hey, we can begin a process to being "capturing that information, but is there something else? "What's priority number two? "What's the next thing on the list "in an agile type method that we could go to? "Let's see if that data is readily available." Because, one of the things that you want to do, obviously, is create as much success early in the process as possible. So, things that will elongate this whole process, right? Like, now I have to invent a whole way to collect data in order to actually examine it. Maybe we ought to move on to the next material measurable outcome to the business and then go examine that. >> So, we're really trying to develop habits here. And, habits don't form if the process of getting even started is just too difficult and there is no success. So, identify the outcome, but then the data engineer is responsible for what's the data and can we economically gather it and acquire it. >> Right, and not just economically gather it. But, can be legally gather it because just because we have it doesn't necessarily mean that the intended use that we looked to put forward is one either would pass regulatory control or policy of the company. So, that's important as well. You don't want to get too far down the line only to find out that what you're pursuing is something that your company is not comfortable yet doing. Even if there's an adjacent company that's doing exactly the same thing. >> Right, so we've got the outcome, we've got the measures, we've got the executive support. We've also got the data and we've determined that we can economically and legally and ethically acquire it. What's next? >> So, next the business analyst is going to collaborate together with both the data wrangler, we've got the data, and now the data scientist or the mathematician, right, gets involved. And, what you're beginning to do is to begin to look at the data that's been derived and for the business analysts looking at it it's more of a visual metaphor and for the data scientist looking at it is more from a quantitative point of view. And, you want to spend enough time to understand that you're now looking at the data and some of your original assumptions about the data and about your business are actually holding true. Because, it's possible at that point that you find out that your original assumption that you're working toward, toward changing this outcome needs to actually shift a little bit because what you thought was happening was actually different, right? We were working with a Japanese financial services company and they thought that a lot of their business was essentially coming from younger people that are comfortable using computers. And, it turned out that there was a much older demographic that was actually using their systems than they thought. So, sometimes you have to rejigger and you have to be open to being agile, not to be so fixed on that particular outcome. The data itself and being able to initially examine the data might shift you a little bit left or a little bit right. You got to be open to that. 
>> So, this process has allowed us to start putting in place some of the habits to be empirical, iterative, opportunistic around data. We've now got the data scientists building out the data models. We've even started the process of training those models, getting them up to creating some value and improving and refining them over time. But, where the industry sometimes falls down is now you get a bunch of technology people involved who say, "Oh, I want to do this without anybody else knowing about it. I'm going to download a bunch of open source software. I'll go secure some stuff over here, some capacity, maybe in the cloud, or maybe I'll just borrow some cycles somewhere." And, we end up in this 12-, 15-month-long process of trying to implement the technology. Let's now talk about how we take the habits that are being formed, the outcomes we want to achieve, this working group that's actually making progress, and then turn that into a practical solution in the business. >> So, just as you said, what we're starting with is trying to become specific in terms of our outcomes, to be able to make sure that it's measurable, and to be agile in our process, where time, right, is an important factor, cost is an important factor, and so is risk, right? And, when it comes to building the technology platform necessary to enable all this, time, cost, and risk are still factors, right? So, starting off with trying to build everything yourself from a technological point of view doesn't make a lot of sense anymore, right? The value that you're going to get from the business is not by assembling computers into racks, right? People have done this stuff for you, right? It's not about taking every kind of software and integrating it together yourself; to the extent that you can get higher-level components and begin working with those, that will give you the ability to turn that data into actual monetary value faster, right? So, don't necessarily take all the extra time to assemble the stuff. See if you can already get it in a prepackaged form. >> So, time to value becomes a primary criterion overall. 'Cause in many respects, certainly what our research has shown is that costs go up as you take longer, and risk goes up, at least in these complex kinds of initiatives, as you take longer, because more people get involved and there's all kinds of crazy things happening. So, the ability to stay agile and make things happen in a valuable way is crucial. Another thing we've seen here, and I want to ask you about this, is we've talked to a number of CIOs who were making the observation that while there are a lot of ways they could procure stuff, their shop itself has to go through some transformation. And, they are looking at how they can buy options on some of these changes right now and deliver value while setting themselves up for the future. What's the right way of thinking about that process? >> So, it's easy for us sitting here in Silicon Valley to immediately jump to the conclusion that everybody just ought to move to the public cloud, right? And, we're very much huge proponents of that ourselves, right? In fact, we've transformed our business to weight essentially entirely toward the cloud, right? And, there are obviously real benefits in doing that, right? When you're getting infrastructure in the cloud it's immediately available to you. You don't have to pay for it all up front. You can scale it over time.
It has all those obvious benefits, right? But, there are times when, either because of a governmental regulation, or because of a policy, your company's policy, or because of just latency issues, it's not really possible to go to the public cloud. In which case you need to do that work behind your firewall. >> You need to bring the cloud to the data. >> Exactly right. And, as you said, even when that option is available, and in fact Oracle does have that option now available to customers with this notion of Cloud at Customer, where we're literally taking a piece of the Oracle cloud and putting it behind your firewall. But, for some companies, that in and of itself may be a leap too far. So, being able to consolidate systems together, being able to move to a simpler option that still gives you that open ability to move to the cloud, either on premises or in the public cloud, over time is important to people. So, what we find is companies are looking for different paths, right? They may be looking to go directly to the public cloud if they're comfortable doing so and if the kind of use case that they're working on is capable of doing that. Or, they may need to stay behind their firewall and entertain the notion of Cloud at Customer. Or, depending upon where they are in terms of their organizational readiness, they also may find that they'd rather move toward an engineered system or an appliance model, which gives them the ability to move to the cloud when they're ready but doesn't force that sea change on an organization that may not yet be ready for it. >> Right, so we're looking at a couple of different options predicated on the characteristics of the problem that we're trying to solve, the maturity of the shop that's trying to solve it, or the combination of the shop and the business, and then obviously, where we want to put our time and energy. Do we want to put it into the infrastructure or do we want to put it into solving the problem? And increasingly, people want to solve the problem. >> Well, in the end that's what we're expected to do as a business. And, that's one of the key differences or shifts that's happening in IT, or the technology segment. Today, we have to be focused, from a technologist's point of view, on understanding how we can help the business solve the problem, and technology is a means to that end, not a thing unto itself. >> So Neil, as you think from your perspective in big data and analytics, as you think about what the world's going to look like in three years, what are the one or two things that you would focus your attention on if you were a CIO about to undertake this journey of finding new and better ways of turning data into value within the business? >> I think we mentioned a few, right? One, we want to make sure that we're driving it from a business perspective. We want to make sure that we have tangible outcomes that we've identified. We want to make sure that the data is more readily available for those use cases that we want to pursue. And, we want to make sure that the infrastructure that's put into play is appropriate, not only from a regulatory and policy point of view, but is a good fit for where the organization happens to be at that time. >> And, doesn't cut us off from options. >> Exactly, it's important not to invest in something that will become a dead end, right?
We're really working hard to ensure that at whatever place customers are in this journey, right, we can onboard them, right, in a place that they're comfortable with, but still allow them to move through the different stages as they see fit. >> Right, so overall we've talked about the value of data. We've talked about some of the practical things that an IT shop, together with the business, can do to get value out of data. It doesn't diminish the role that the cloud is going to play. It positions it in the context of the nature of the problem, the nature of the shop. Neil, this has been a great discussion. >> I've enjoyed it, thank you. >> So, once again, Peter Burris from The Cube, talking about the journey to creating value, business value, out of data, with the appropriate combination of agile data methods and an infrastructure approach that allows the business to stay focused on the problem and not the infrastructure. Once again, thank you for joining us from The Cube and we hope to see you again soon. (dynamic music)