Nagaraj Sastry, HCL Technologies | Snowflake Summit 2022


 

>>Welcome back to theCUBE's continuing coverage of day one of Snowflake Summit 22, live from Caesars Forum in Las Vegas. I'm Lisa Martin. My co-host for the week is Dave Vellante. Dave and I are pleased to welcome Nagaraj Sastry to the program, the vice president of data and analytics at HCL Technologies. Welcome. Great to have you.
>>Same here. Thank you for inviting me.
>>Isn't it great to be back in person?
>>Oh, love it.
>>The keynote this morning, I don't know if you had a chance to see it, was standing room only; there were overflow rooms. People are ready for this, and it was a jam-packed morning of announcements.
>>Absolutely.
>>Talk to us a little bit about the HCL-Snowflake partnership. But for anybody in the audience who may not be familiar with HCL, give us a little bit of background, vision, mission, differentiation, and then that Snowflake duo.
>>Sure, sure. Let me first start off by talking about HCL. We are an $11.5 billion organization, and we have three modes of working. Mode one is everything to do with our infrastructure business and application services and maintenance. Mode two is anything that we do in the cutting-edge ecosystem, whether it is cloud, whether it is application modernization, ERPs; all of those put together is mode two, and data and analytics is part of our mode two business. The whole ecosystem is called the digital services business, and within digital services one of the arms is data and analytics. We are about a billion dollars in terms of revenues from a data and analytics perspective, out of the 11 billion that I was talking to you about. And mode three is everything to do with our software products. We have got our own software products, and that's a third of our business. So that's HCL. As for the HCL and Snowflake relationship, we are an Elite partner with Snowflake and one of the fastest-growing partners. We achieved the Elite level within 18 months of signing up as a Snowflake partner. We're close to about 50-plus implementations worldwide, with about 800 people who are Snowflake professionals within the HCL ecosystem, serving large customers.
>>And how long have you been partners?
>>About 18 to 20 months now.
>>Okay. So, during the last couple of tumultuous years, why Snowflake? What was it about their vision, their strategy, their leadership that really spoke to HCL as "this is a partner for us"?
>>One of the biggest things that we realized, probably about four years ago, was that you had all the application databases, the RDBMSs, the MPPs, the Hadoop ecosystems, which were getting expensive, not in terms of the cost, but in terms of the processing times and the way the queries were getting created. And we knew that something new was going to come.
>>Yeah.
>>And we knew that there would be a hyperscaler that would come. Of course, Azure was already there, AWS was there, Google was just picking it up. And at that point in time we realized that there would be a cloud data warehouse, because we had started reading about Snowflake at that point in time. Fast forward a couple of years after that, and we realized that if we are to be in this business, the right way of doing it is by partnering with the right tooling company. And Snowflake brings that to the table.
>>We all know that now. And with what the keynote speakers were also saying, right, from a 150-member team at the conference about five years ago to about 12,000 people now, you know that this is the right thing to do and this is the right place to be. So we devised a methodology: let's get into the partnership, let's get our resources trained and certified on the Snowflake ecosystem, and let's take a point of view to our customers on how data migrations and transformations have to be done in the Snowflake arena.
>>When you think about your modes, you talked about modes one, two, and three. I feel like Snowflake touches on each of those, maybe not so much the infrastructure and the apps, although maybe going forward it does, increasingly. So that's my question: where do you see Snowflake vectoring into your modes?
>>It does, in both of the first two modes, and in mode three also, and I'll give you the reasons why. Mode one, predominantly because you can do application development on the cloud, on the data cloud now, which basically means that I can have an application run on Snowflake. Eventually, that's the goal. Second is mode two: because it is a cloud data warehouse, it fits in exactly, because the application data is in Snowflake and I've got my regular data sets within Snowflake. Both are talking to each other, and there is zero lag from a user perspective.
>>It's direct.
>>Yes. And then mode three: the reason why I said mode three was because for software as a service, or software products, I can be powered by Snowflake and implement that. So that's why it cuts across our entire ecosystem.
>>The whole thing is called your digital business, correct? Is that right? So this is the next wave of digital business that we're seeing here, because digital is data, right? That's really what it's about. It's about putting that data to work.
>>The president of our digital business, who did a session this afternoon, says the D in digital is data.
>>Right.
>>And that's what we are seeing with our customers, in the large implementations that we do in this ecosystem. There is one other thing that we are focusing on very heavily: industrialized solutions, or industry-led solutions, whether it is for healthcare, whether it's for retail or financial services; name a vertical. We have got our own capabilities around industrialized solutions that fit certain use cases.
>>So, thinking about the D in digital really being data: if you think about the operating model for data, it's obviously evolved. You mentioned Hadoop; it went to the cloud and all the data went to the cloud, but today you've got an application development model, you've got the database, which is sort of hardened, and then you've got your data pipeline and your data stack, and that's kind of the operating model. They're siloed to a great degree. How is that operating model changing as a result of data?
>>I'll answer it in two parts. If you look back over the years, what used to happen is you had a CIO in an organization, and then you had enterprise architecture teams, application development teams, support teams, and so on and so forth.
In the last 36 months, there has been the emergence of a new role, the CDAO, the chief data and analytics officer. The data and analytics officer is a role that has been created, and the purpose of creating that role is to ensure that organizations pull out, or call out, resources within the CIO organization, the enterprise architects, data architects, application architects, and security architects, and bring them into the ecosystem of the data office from an operating model perspective, so that innovation can be driven. Data-driven enterprises can be created and innovation can come through there. The other part of that is that the use cases get prioritized when you start innovating, and then it is a factory model in terms of how those use cases get built, which is a no-brainer in my mind, at least. That is how the operating model is coming together from a people perspective. From a technology perspective, there is also an operating model emerging. If you look at all the hyperscalers that are there today, and Snowflake with its latest and greatest announcements, you see the way the industry is going: everything will be housed in one ecosystem, and that is the beauty of the entire thing. Because if I'm in a multi-cloud kind of environment and I'm on Snowflake, I don't care, because Snowflake can work across the multiple clouds. So my data is in one place, effectively.
>>Yeah. It's interesting what you were saying about the chief data officer. That role emerged out of the ashes, like a phoenix, of compliance and data quality, in healthcare and financial services and government, the highly regulated industries. And then it took a while, but it increasingly became, wow, this is really a front-and-center, board-level role, if you will. And now you're seeing it integrated with digital.
>>Absolutely. And there is one other point. If you think about it, the emergence of the chief data officer came about because there were issues associated with data quality, there were issues associated with data cataloging, how data is cataloged, and there were issues in terms of trustability of the data. Now, the trustability of the data can be in two places. One is data quality: bad data, garbage in, garbage out. But the other aspect of trustability is, can I apply the seven Cs of data quality and say that I can hallmark this data platinum or gold or silver or bronze, or leave it un-hallmarked? And with Snowflake, the advantage is that if you have a hallmarked data set, say a platinum or a gold, then, thanks to the virtual warehouse, the same data set gets propagated across the enterprise. That's the beauty of it. And then of course there is the metadata aspect of it: bringing the technical metadata and the business metadata together for the purpose of creating the data catalogs is another key thing, enabled again by Snowflake.
>>When you're in customer conversations, what are some of the myths or misconceptions that customers have historically had when it comes to creating a data strategy? And what is your recommendation for those folks, since every company these days has to be a data company to be competitive?
>>Yeah.
So around data structures, the whole thought process has to change. In the past, we used to go from source applications: we would gather requirements, then we would figure out what sources were there, do a profiling of the data, and then say, okay, the target data model should be this.
>>Too slow.
>>Too slow, right. Now, fast forward to digital transformation. There are producers of data; the applications that are being modernized today are producers of data. They're actually telling you: I'm producing this kind of data, these are the kinds of events that I'm producing, and this is my structure. Now the whole deal is that I don't need to figure out what the requirements are. I know what use case the application is going to help me with, and therefore the entire data model is supported. At the same time, the newer-generation applications being created are not only built for the customer experience, which of course is very critical, but they're also taking into account aspects around metadata: the technical metadata associated with an application, and the data quality rules or business rules that are implemented within an application. All of that is getting documented. As a result, the whole timeline from source to profile to model, which used to be X number of days in the past, is X minus at least 20% now, or 30% actually. So that is how the data structures are coming into play. The futuristic thought process would be: there will be producers of data and there will be consumers of data. Where is ETL then, or ELT? There is not going to be any ETL or ELT, because a producer is going to say, "I'm producing the data for this," a consumer says, "Okay, I want to consume the data for this purpose," and they meet through an API layer. So ETL is eventually going to go away.
>>Well, and those consumers: if you think about the way it works today, the data operating model, if you will, the transaction systems and other systems throw off a bunch of exhaust that gets thrown over the fence to the analytics systems. The data pipelines, the data systems, are not operationalized in the way that they need to be. And obviously Snowflake's trying to change that.
>>So data...
>>That's a big change, please.
>>Yeah, sorry. Didn't mean to cut you off. My apologies.
>>No, no. So data operations is a very, very critical aspect. If you think about it holistically, we used to have ETL pipelines and ELT pipelines, and then we used to have queries being written on top of MPPs and Hadoop and all of that, and reporting tools with any number of reports, plus certain self-service BI reports in the ecosystem. Now, when you think in terms of a cloud data warehouse, what is happening? The way you architect your solution today, the data pipelines are self-manageable or self-healing and do not need the same number of people. In the past there was no documentation of what ETL pipelines were written on certain ETL tools, or why something was failing; nobody knew why something was failing, because it was age-old code. But take it forward to today. What happens is that organizations are migrating from on-prem to the cloud and to the cloud data warehouse, and the overall cost of ownership is decreasing.
The reason is the way we are implementing the data pipelines and the way the data operations are being done. Even before a pipeline is kicked off, there is a check process to say whether the source application is ready or not. Such small things, which are part and parcel of the entire data operations lifecycle, are taking center stage. As a result, self-healing mechanisms are coming in, and because of those self-healing mechanisms, metrics are being captured, so you know exactly where to focus and where not to focus. As a result, the number of resources needed for support gets reduced, and the cost of the service
>>is lower, with much higher trust, self-service infrastructure, and data context in the hands of business users. Data is now more discoverable, it's governed, so you can create data products more quickly. So speed and scale become extremely important.
>>Absolutely. And in fact, one of the things that is changing is the way search is getting implemented. In the past, you created an index and then the data was searchable, but now it is contextual search. Can I contextualize the entire search? Can I create a machine learning algorithm that will say, okay, Nagaraj as a persona was looking for this kind of data, and then Nagaraj comes back again and looks for some different kind of data; can the machine learning algorithm go and figure out what is going on in Nagaraj's mind, what he is trying to look at, and then improve the learnability of the entire algorithm? That's how search is also going to change.
>>Excellent, Nagaraj. Thank you so much for joining us, talking about data modernization at speed and scale at HCL, what you're doing with Snowflake, and the incredible power it sounds like you're enabling. And we're only just scratching the surface; I have a feeling there's a lot more under there that you are going to uncover.
>>Sure. So we have a tool, or an accelerator; we call it an accelerator in HCL parlance, but it's actually a tool. When you think about data modernization onto Snowflake, it is predominantly about migrating the data set from your existing ecosystem onto Snowflake. That is one aspect of it. The second aspect is the modernization of the ETL or ELT pipelines. The third aspect, associated with the data that is there within these ecosystems, is the reconciliation: the older legacy platform gives me result X; does Snowflake give me result X? That kind of reconciliation has to be done, data reconciliation and testing. And then the fourth layer is reporting and visualization. These four layers are part and parcel of something that we call Advantage Migrate. Advantage Migrate will convert your Teradata data model into a Snowflake-understandable data model automatically, whether it's Teradata, whether it is Oracle, Exadata, Greenplum, <inaudible>, you name the ecosystem.
>>We have the mechanism to convert a data model from whatever it is into a Snowflake-readable, understandable data model. The second aspect is the ETL and ELT pipelines. Whether you want to go from Informatica to dbt, or Informatica to something else, or DataStage to something else, it doesn't matter.
There is an algorithm, a tool, for the ETL pipelines; we call it the gateway suite. It actually converts the code: it reads the code on the left-hand side, which is the legacy code, reverse engineers it and understands the logic, and then uses that logic that has been pulled out to generate Spark code or dbt or any other tool of the customer's choice. That's the second layer. The third layer I talked about is data testing, automated data testing and data reconciliation. And last but not least is reporting, because the older ways of reporting and visualization differ from current-day reporting and visualization, which is more persona-based; the art of visualization is something different in this respect. Come over to our booth at 2114 and you'll see Advantage Migrate in action.
>>Advantage Migrate. There you go. Nagaraj, thank you so much for joining us on the program and unpacking HCL, giving us that technical dissection of what you're doing together with Snowflake. We appreciate your time.
>>Thank you. My pleasure.
>>For our guest and Dave Vellante, this is Lisa Martin, live from the show floor of Snowflake Summit 22. Dave and I will be right back with our final guest of day one in just a minute.

Published Date : Jun 15 2022



Programmable Quantum Simulators: Theory and Practice


 

>>Hello. My name is Isaac Chuang, and I am on the faculty at MIT in electrical engineering and computer science, and in physics. It is a pleasure for me to be presenting at today's NTT Research Symposium of 2020, to share a little bit with you about programmable quantum simulators, theory and practice. The simulation of physical systems as described by their Hamiltonian is a fundamental problem, which Richard Feynman identified early on as one of the most promising applications of a hypothetical quantum computer. The real world around us, especially at the molecular level, is described by Hamiltonians which capture the interaction of electrons and nuclei. What we desire to understand from Hamiltonian simulation is properties of complex molecules, such as this iron molybdenum cofactor, an important catalyst. We desire their ground states, reaction rates, reaction dynamics, and other chemical properties, among many things. For a molecule of N atoms, a classical simulation must scale exponentially with N, but for a quantum simulation there is the potential to scale polynomially instead.
>>And this would be a significant advantage if realizable. So where are we today in realizing such a quantum advantage? I would like to share with you a story about two things in this quest. First, a theoretically optimal quantum simulation algorithm, which achieves the best possible runtime for a generic Hamiltonian. Second, let me share with you experimental results from a quantum simulation implemented using available quantum computing hardware today, with a hardware-efficient model that goes beyond what is utilized by today's algorithms. I will begin with the theoretically optimal quantum simulation algorithm. In principle, the goal of quantum simulation is to take a time-independent Hamiltonian H and solve Schrödinger's equation, as given here. This problem is as hard as the hardest quantum computation; it is known as being BQP-complete. A simplification which is physically reasonable, and important in practice, is to assume that the Hamiltonian is a sum over terms which are local.
>>For example, due to a lattice structure, these local terms typically do not commute, but their locality means that each term is reasonably small. Therefore, as was first shown by Seth Lloyd in 1996, one way to compute the time evolution, that is, the exponentiation of H with time, is to use the Lie product formula, which involves successive approximation by repeated small time steps. The cost of this Trotterization procedure is a number of elementary steps which scales quadratically with the time desired and inversely with the error desired for the simulation output. Here m is the number of local terms in the Hamiltonian, T is the desired simulation time, and epsilon is the desired simulation error. Today we know that for special systems, and with higher-order expansions of this formula, a better result can be obtained, such as scaling as N squared but asymptotically linearly in time. This, however, is for a special case, lattice Hamiltonians, and it would be desirable to scale generally with time T for an order-T-time simulation.
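As a compact sketch of the product formula just described, writing the Hamiltonian as a sum of m local terms, H = sum_j H_j, each bounded in norm by some constant Lambda (the symbols m, r, and Lambda are used here purely for illustration):

$$
e^{-iHt} \;\approx\; \Big( \prod_{j=1}^{m} e^{-iH_j t/r} \Big)^{r},
\qquad
\Big\| \, e^{-iHt} - \Big( \prod_{j=1}^{m} e^{-iH_j t/r} \Big)^{r} \Big\| \;=\; O\!\left( \frac{(m \Lambda t)^2}{r} \right),
$$

so reaching error epsilon requires on the order of r = (m Lambda t)^2 / epsilon repetitions of the m small steps, a cost that grows quadratically with the simulation time and inversely with the allowed error, which is the scaling quoted above.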
>>So how could such an optimal quantum simulation be constructed? An important ingredient is to transform the quantum simulation into a quantum walk. This was done over 12 years ago by Andrew Childs, showing that for sparse Hamiltonians, with around d non-zero entries per row, such as shown in this graphic here, one can do a quantum walk very much like a classical walk, but in a superposition of right and left, as shown here in this quantum circuit, where the H stands for a Hadamard gate. In this particular circuit, the Hadamard turns the zero into a superposition of zero and one, which then activates the left and the right walk in superposition. The graph of the walk is defined by the Hamiltonian H, and in doing so, Childs and collaborators were able to show that the walk produces a unitary transform which goes as e to the minus i arccos of H, times time.
>>So this comes close, but it still has this transcendental function of H, instead of just simply H. This can be fixed with some effort, which results in an algorithm that scales approximately as tau times log of one over epsilon, with tau proportional to the sparsity of the Hamiltonian and the simulation time. But again, the scaling here is a multiplicative product rather than an additive one. An interesting insight into the dynamics of a qubit, the simplest component of a quantum computer, provides a way to improve upon this. Single qubits evolve as rotations on a sphere. For example, here is shown a rotation operator which rotates around the axis phi in the X-Y plane by angle theta. If one measures the result of this rotation as a projection along the Z axis, the result is a cosine-squared function that is well known as a Rabi oscillation. On the other hand, if a qubit is rotated around multiple angles in the X-Y plane, say around the phi equals zero, phi equals 1.5, and phi equals zero axes again, then the resulting response function looks like a flat top.
>>And in fact, generalizing this to five or more pulses gives not just flat tops but arbitrary functions, such as the Chebyshev polynomial shown here, which yields transforms like Boolean OR and majority functions. Remarkably, if one does rotations by angle theta about d different angles in the X-Y plane, the result is a response function which is a polynomial of degree d in the cosine. Furthermore, as captured by this theorem, given a nearly arbitrary degree polynomial, there exist angles phi such that one can achieve the desired polynomial. This is the result that derives from the Remez exchange algorithm used in classical discrete-time signal processing. So how does this relate to quantum simulation? Well, recall that a quantum walk essentially embeds a Hamiltonian inside the unitary transform of a quantum circuit. This embedding, sometimes called qubitization, involves the use of a qubit acting as a projector to control the application of H. If we generalize the quantum walk to include a rotation about axis phi in the X-Y plane, it turns out that one obtains a polynomial transform of H itself.
>>And this is the same as the polynomial in the quantum signal processing theorem. This is a remarkable result, known as the quantum singular value transformation theorem, from András Gilyén, Nathan Wiebe, and collaborators, published last year. This provides a quantum simulation algorithm using quantum signal processing.
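In symbols, a sketch of the walk relation above, and of the cost that the signal-processing construction achieves (here lambda denotes an eigenvalue of the suitably normalized Hamiltonian, d its sparsity, and the norm is the largest matrix entry; these normalization conventions are assumptions of this sketch rather than the talk's exact ones):

$$
W \,|\lambda_{\pm}\rangle \;=\; e^{\mp i \arccos \lambda}\,|\lambda_{\pm}\rangle ,
\qquad
N_{\text{queries}} \;=\; O\!\big( \tau + \log(1/\epsilon) \big), \quad \tau = d\,\|H\|_{\max}\, t ,
$$

the first expression being the spectral meaning of "e to the minus i arccos of H," and the second being the additive cost (up to log-log factors) obtained once quantum signal processing undoes the arccosine, as described next.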
>>For example, one can start with the quantum walk result and then apply quantum signal processing to undo the arccosine transformation, and therefore obtain the ideal expected Hamiltonian evolution, e to the minus iHt. The resulting algorithm costs a number of elementary steps which scales as just the sum of the evolution time and the log of one over the error desired. This saturates the known lower bound, and thus is the optimal quantum simulation algorithm. This table, from a recent review article, summarizes a comparison of the query complexities of the known major quantum simulation algorithms, showing that the qubitization and quantum signal processing algorithm is indeed optimal.
>>Of course, this optimality is a theoretical result. What does one do in practice? Let me now share with you the story of a hardware-efficient realization of a quantum simulation on actual hardware. The promise of quantum computation traditionally rests on a circuit model, such as the one we just used, with quantum circuits acting on qubits. In contrast, consider a real physical problem from quantum chemistry: finding the structure of a molecule. The starting point is the Born-Oppenheimer separation of the electronic and vibrational states. For example, two connected nuclei share a vibrational mode; the potential energy of this nonlinear spring may be modeled as a harmonic oscillator, since the spring's energy is determined by the electronic structure. When the molecule becomes electronically excited, this vibrational mode changes: one obtains a different frequency and different equilibrium positions for the nuclei. This corresponds to a change in the spring constant as well as a displacement of the nuclear positions.
>>And we may write down a full Hamiltonian for this system. The interesting quantum chemistry question is known as the Franck-Condon problem: what is the probability of transition between the original ground state and a given vibrational state in the excited-state spectrum of the molecule? The Franck-Condon factor, which gives this transition probability, is foundational to quantum chemistry and a very hard and generic question to answer, which may be amenable to solution on a quantum computer. In particular, a natural quantum computer to use might be one which already has harmonic oscillators, rather than one which has just qubits. This is provided by bosonic quantum processors, such as the superconducting qubit system shown here. This processor has both qubits, as embodied by the Josephson junctions shown here, and a harmonic oscillator, as embodied by the resonant mode of the transmission cavity given here. Moreover, the output of this planar superconducting circuit can be connected to three-dimensional cavities.
>>Instead of using qubit gates, one may perform direct transformations on the bosonic state using, for example, beam splitters, phase shifters, and displacement and squeezing operators, and the harmonic oscillator may be initialized and manipulated directly. The availability of the qubit allows photon-number-resolved counting. For simulating a triatomic, two-mode Franck-Condon factor problem, this superconducting qubit system with 3D cavities was used with two resonators: cavity A and cavity B represent the breathing and wiggling modes of a triatomic molecule, as depicted here.
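As a compact statement of the quantity being computed, written here for the two-mode case, with the vibrational-mode transformation expressed as a combination of displacement, squeezing, and beam-splitter operators (this particular decomposition is a standard one assumed here for illustration, not necessarily the talk's exact construction):

$$
F_{0 \to (n_1, n_2)} \;=\; \big| \langle n_1, n_2 |\, \hat{U}_{\mathrm{vib}} \,| 0, 0 \rangle \big|^2 ,
\qquad
\hat{U}_{\mathrm{vib}} \;\sim\; \hat{B}(\theta)\, \hat{S}_1(\zeta_1) \hat{S}_2(\zeta_2)\, \hat{D}_1(\alpha_1) \hat{D}_2(\alpha_2) ,
$$

i.e. the Franck-Condon factor is the squared overlap between the initial vibrational ground state and a final two-mode Fock state, with the mode transformation built from exactly the beam-splitter, squeezing, and displacement elements mentioned above; photon-number-resolved readout of (n1, n2) then samples this spectrum directly.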
The coupling of these modes was mediated by a superconducting qubit, and readout was accomplished by two additional superconducting qubits, coupled to each one of the cavities. Due to the superconducting resonators used, each of the cavities had a long coherence time, while resonator states could be prepared and measured using the strong coupling of qubits to the cavity, and bosonic quantum operations could be realized by modulating the coupling qubit in between the two cavities. The cavities are holes drilled into pure aluminum, kept superconducting at millikelvin-scale temperatures.
>>Microfabricated chips with superconducting qubits are inserted into ports to couple via an antenna to the microwave cavities. Each of the cavities has a quality factor so high that the coherence times can reach milliseconds. A coupling qubit chip is inserted into the port in between the cavities, and the readout and preparation qubit chips are inserted into ports on the sides. For the sake of brevity, I will skip the experimental details and present just the results. Shown here is the vibronic spectrum obtained for a water molecule using this bosonic superconducting processor. This is a typical Franck-Condon spectrum, giving the intensity of lines versus frequency in wavenumbers, where the solid line depicts the theoretically expected result and the purple and red dots show two sets of experimental data, one taken quickly and another taken with exhaustive statistics. In both cases, the experimental results are in good agreement with the theoretical expectations.
>>The programmability of this system is demonstrated by showing how it can easily calculate the Franck-Condon spectrum for a wide variety of molecules. Here's another one, the ozone anion. Again, we see that the experimental data, shown as points, agrees well with the theoretical expectation, shown as a solid line. Let me emphasize that this quantum simulation result was obtained not by using a quantum computer with qubits, but rather one with resonators, one resonator representing each of the modes of vibration in this triatomic molecule. This approach represents a far more efficient utilization of hardware resources compared with the standard qubit model, because of the natural match of the resonators with the physical system being simulated. In comparison, if qubit gates had been utilized to perform the same simulation, on the order of a thousand qubit gates would have been required, compared with the order of 10 operations which were performed for this bosonic realization.
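One hedged way to see the resource gap just quoted: to simulate a bosonic mode with qubit gates, each oscillator must be truncated at some maximum photon number n_max, so the number of qubits needed per mode is roughly

$$
q_{\text{per mode}} \;\approx\; \lceil \log_2 (n_{\max} + 1) \rceil ,
$$

and the creation and annihilation operators of the mode then become multi-qubit operators whose Trotterized exponentials each cost many two-qubit gates, the kind of accounting that can inflate a ten-operation bosonic circuit into a qubit circuit of order a thousand gates. The specific numbers here are illustrative assumptions, not the talk's own gate count.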
This is a huge challenge for electronic structure calculation in molecules, real physical systems also have symmetries, but current quantum simulation algorithms are largely governed by a theorem, which says that the number of times steps required is proportional to the simulation time. Desired. Finally, real physical systems are not purely quantum or purely classical, but rather have many messy quantum classical boundaries. In fact, perhaps the most important systems to simulate are really open quantum systems. And these dynamics are described by a mixture of quantum and classical evolution and the desired results are often thermal and statistical properties. >>I hope this presentation of the theory and practice of quantum simulation has been interesting and worthwhile. Thank you.

Published Date : Sep 24 2020



Physics Successfully Implements Lagrange Multiplier Optimization


 

>> Hello everybody. My title is Physics Implements Lagrange Multiplier Optimization, and let me be very specific about what I mean by this. In physics, there are a series of principles that are optimization principles, and we are just beginning to take advantage of them. For example, most famous in physics is the principle of least action. Of equal importance is the principle of least entropy generation; that's to say, a dissipative circuit will try to adjust itself so as to dissipate as little as possible. There are other concepts: first-to-gain-threshold, the variational principle, the adiabatic method, simulated annealing, and also actual physical annealing. So let's look at some of these that I'm sure you probably know about. There is the principle of least time, illustrated by a lifeguard who is trying to save a swimmer and runs as fast as possible along the sand before finally jumping in the water. It's like the refraction of light: the lifeguard is trying to get to the swimmer as quickly as possible and follows the path that takes the least amount of time. This of course occurs in optics, classical mechanics, and so forth; it's the principle of least action. Let me show you another one: the principle of minimum power dissipation. Imagine you had a circuit like this, where the current was dividing unequally. Well, that would make you feel very uncomfortable. The circuit will automatically adjust itself so that the two branches, which are equal, actually draw equal amounts of current. If they are unequal, it will dissipate excess energy. So we talk about least power dissipation; a more sophisticated way of saying the same thing is least entropy production. This is actually the most common one of all. Here's one that's kind of interesting, and people have made a lot of hay about it: you have lasers and you try to reach threshold. You have different modes on the horizontal axis, and one mode happens to have the lowest loss, so all the energy goes into that mode. This is first-to-gain-threshold. This is also a type of minimization principle, because physics finds the mode with the lowest gain threshold. Now, what I'll show about this is that it's not as good as it seems, because even after you reach the gain threshold there continues to be evolution among the modes, so it's not quite as clear-cut as it might appear. Here's the famous one, the variational principle. It says you have a trial wave function, the red one, and it's no good because it has too much energy. The true wave function is illustrated in green, and that one physics finds automatically: it finds the wave function with the lowest energy. Here's one that is just physical annealing, which you could also think of as simulated annealing. In simulated annealing, you add noise, or you raise the temperature, or do something else to jump out of local minima. You do tend to get stuck in all of these methods; you tend to get stuck in local minima and you have to find a strategy to jump out of them. But physical annealing actually promises to give you a global optimum, so we've got to keep that one in mind.
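A small worked version of that two-branch example, with the constraint handled exactly the way the talk's title suggests (I1, I2, R1, R2 are just illustrative labels for the branch currents and resistances, and I_tot is the fixed total current):

$$
\min_{I_1, I_2} \; P = I_1^2 R_1 + I_2^2 R_2
\quad \text{subject to} \quad I_1 + I_2 = I_{\mathrm{tot}} ,
$$
$$
\nabla P = \lambda \,\nabla (I_1 + I_2) \;\;\Rightarrow\;\; 2 I_1 R_1 = 2 I_2 R_2 = \lambda
\;\;\Rightarrow\;\; \frac{I_1}{I_2} = \frac{R_2}{R_1} ,
$$

which is just the familiar current-divider rule; for equal resistances the currents are equal, exactly the state the circuit relaxes to on its own. That is the sense in which the dissipative hardware performs constrained minimization for free.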
And then there's the adiabatic method. In the adiabatic method, you have modes. Now, I am one who believes that we could do this even classically, just with LC circuits. We have avoided crossings, and the avoided crossings are such that you start from a solvable problem, then you go to a problem that is very difficult to solve, and yet you stay in the ground state; I'm sure you all know this. This is the adiabatic method. Some people think of it as quantum mechanical, and it could be, but it's also classical, and what you're adjusting is one of the inductances in a complicated LC circuit. This is another illustration of the same thing, a slightly more complicated graph: you go from a simple Hamiltonian to a hard Hamiltonian, and you find a solution that way. So these are all minimization principles. Now, one of the preferred attributes is to have a digital answer, which we can get with bistable elements, and physics is loaded with bistable elements, starting with the flip-flop. You can imagine somehow coupling them together; I show you here just resistors. But it's very important that you don't have a pure analog machine. You want a machine that provides digital answers, and the flip-flop is actually an analog machine, but it locks into a digital state. So we want bistable elements that will give us binary answers. Okay, so having quickly gone through these, which of them is the best? Which physics principle might be the best for doing optimization? One of the nice problems that we like to solve is the Ising problem, and there's a way to set that up with circuits: you can have LC circuits and try to mimic the ferromagnetic case, where the two circuits are in phase, and you try to lock them into either positive or negative phase. You can do that with parametric gain; you have classical parametric gain with a two-omega modulation on a capacitor, and it's bistable. If you have crossing couplings, the phases tend to be opposite, and so you get anti-ferromagnetic coupling. So you can mimic the Ising problem with these circuits, but there are many ways to mimic it, and we'll see some more examples. Now, one of the main points I'm going to make today is that it's very easy to set up a physical system that not only does optimization but also includes constraints, and the constraints we normally take into account with Lagrange multipliers. This is sort of an explanation of Lagrange multipliers: you're trying to go toward the absolute optimum here, but you run into the red constraint, so you get stopped right there, and the gradient of the constraint is opposite to, and cancels, the gradient of the merit function. This is standard stuff in college, Lagrange multiplier calculus. So if physics does this, how does it do it? Well, it does it by steepest descent; we just follow it. Physics, for example, will try to go to the state of lowest power dissipation, so it minimizes the dissipation in blue while also trying to satisfy the constraint, and then finally we find the optimum point in some multi-dimensional configuration space. Another way of saying it is that we go from some initial state to some final state, and physics does this for you for free, because it is always trying to reduce the entropy production, the power dissipation. And so I'm going to show you now five different schemes (actually I have about eight different schemes), and they all use the principle of minimum entropy generation, but not all of them recognize it.
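Putting that Lagrange-multiplier picture into one line, with the Ising energy as the merit function and the "digital answer" requirement written as a constraint (the multipliers lambda_i and couplings J_ij are generic symbols for this sketch, not quantities from any particular scheme below):

$$
\min_{\{s_i\}} \; E(\mathbf{s}) = -\sum_{i<j} J_{ij}\, s_i s_j
\quad \text{subject to} \quad s_i^2 = 1 \;\;\text{for all } i ,
$$
$$
\mathcal{L}(\mathbf{s}, \boldsymbol{\lambda}) = -\sum_{i<j} J_{ij}\, s_i s_j + \sum_i \lambda_i \big( s_i^2 - 1 \big) ,
\qquad
\frac{\partial \mathcal{L}}{\partial s_i} = 0 \;\Rightarrow\; \sum_{j \neq i} J_{ij}\, s_j = 2 \lambda_i s_i .
$$

Steepest descent on the dissipation plays the role of minimizing the merit function, while the bistable (flip-flop-like or parametrically pumped) element supplies the s_i^2 = 1 constraint, which is why the machines described next settle into binary answers.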
So here's some work from my colleague Roychowdhury, here in my department. He has these very amplitude-stable oscillators, but they tend to lock into a phase, and in this way the system lends itself to solving the Ising problem. If you analyze it in detail, and I'll show you the link to the arXiv where we've shown this, this one is trying to satisfy the principle of minimum entropy generation, and it includes constraints. The most important constraint for us is that we want a digital answer, so we want either a plus or a minus as the answer, and the parametric oscillator permits that. He's not using a parametric oscillator; he's using something a little different, but somewhat similar: second-harmonic locking, which is similar to the parametric oscillator. Here's another approach from England, Cambridge University; I have the symbol of the university here, and they got very excited. They have polaritons, exciton-polaritons, and they were very excited about that. But to us they're really just coupled electromagnetic modes created by optical excitation, and they lock into definite phases. No big surprise: the system also tends to lock in in such a way that it minimizes the power dissipation, and it is very easy to include the digital constraint there. So that's yet another example. Of course, all the examples I'm going to show you from the literature follow the principle of minimum entropy generation; this is not always acknowledged by the authors. This is the Yamamoto Stanford approach (thank you very much for inviting me). I've analyzed this one, and we think we know what's going on here. I think the quantum mechanical version could possibly be very interesting, but the versions that are out there right now are dissipative: there's dissipation in the optical fiber that is overcome by the parametric gain. The net conclusion is that the different optical parametric oscillator pulses are trying to organize themselves in such a way as to minimize the power dissipation. So it's based upon minimum entropy generation, which for our purposes is synonymous with minimizing the power dissipation. And of course it's very beautifully done; it is a very beautiful system, because it's time-multiplexed and it locks into digital answers. So that's very nice. Here's something different, not the Ising problem, from MIT. It is an optimizer, an optimizer for artificial intelligence. It uses silicon photonics and does unitary operations. We've gone through this very carefully. I'm sure the people at MIT think they have something very unusual, but to us this is usual: this is an example of minimizing the power dissipation. As you go around over and over again through the silicon photonics, you end up minimizing the power dissipation. It's kind of surprising; the principle of minimum entropy generation again. Okay. And this is from my own group, where we try to mimic the coherent Ising machine, except it's all electrical. What we get here is an anti-ferromagnetic configuration; if the resistors were connected differently, it would be a ferromagnetic configuration, and we can arrange that. So I've just shown five of these; I think I could have done a few more, but we're running out of time.
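A hedged sketch of the mapping those coupled-oscillator schemes share (the cosine form is the standard phase-model approximation, and the exact Lyapunov function differs from scheme to scheme, so treat this as illustrative):

$$
P(\{\phi_i\}) \;\propto\; -\sum_{i<j} J_{ij} \cos(\phi_i - \phi_j) ,
\qquad
\phi_i \in \{0, \pi\} \;\Rightarrow\; \cos(\phi_i - \phi_j) = s_i s_j, \quad s_i = \pm 1 ,
$$

so once the two-omega parametric pumping or second-harmonic locking forces each oscillator's phase to 0 or pi, minimizing the dissipated power is the same as minimizing the Ising energy, with straight couplings giving one sign of J_ij and cross couplings the opposite sign.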
But all of these optimization approaches are similar in that they're based upon minimum entropy generation, which, I don't want to say is a law of physics, but it's accepted by many physicists, and you have different examples, including, in particular, MIT's optimizer for artificial intelligence. They all seem to take advantage of this type of physics. So they're all versions of minimum entropy generation: the physics hardware implements steepest descent physically, and because of the constraint, it produces a binary output, which is digital in the same sense that a flip-flop is digital. What's the promise? The promise is that physics-based hardware will perform the same function at far greater speed and with far less power dissipation. Now, the challenge of global optimization remains unsolved; I don't think anybody has a solution to the problem of global optimization. We can try to do better, we can get a little closer. But even setting that aside, there are all these terrific applications in deep learning, neural network back-propagation, artificial intelligence, and control theory. There are many applications: operations research, biology, et cetera. But there are a couple of action items needed to go further. I believe that the electronic implementation is perhaps a little easier to scale, and so we need to design some chips. We need a chip with an array of oscillators; if you had a thousand LC oscillators on the chip, I think that would already be very interesting. But you need to interconnect them, and this would require a resistive network with about a million resistors. I think that can also be done on a chip. Minimizing the power dissipation is the whole point, but there is an accuracy problem: the resistors have to be very precise. The good news is that resistors can be programmed very accurately, and I'll be happy to take questions on that. A later step, though, once we have the chips, is that we need compiler software to convert a given problem into the resistance values that will fit within these oscillator chips. So let me pause there for questions, and thank you very much for your attention.

Published Date : Sep 24 2020



Joy King, Vertica | Virtual Vertica BDC 2020


 

>>It's theCUBE, covering the Virtual Vertica Big Data Conference 2020, brought to you by Vertica.
There's so much potential in the technology, but somehow it has been stuck, for the most part, in science projects and data science labs, and the time is now to operationalize it. Those are the three big trends that Vertica is focusing on right now.

>> That's great. I wonder if I could ask you a couple questions about that. I, like you, have a soft spot in my heart for Hadoop, and the thing about Hadoop that was, I think, profound was that it got people thinking about bringing compute to the data and leaving data in place, and it really got people thinking about data-driven cultures. It didn't solve all the problems, but it collected a lot of data that we can now take, your third trend, and apply machine intelligence on top of. And then the cloud is really the ability to scale, and it gives you that agility. And it's not just the cloud itself, it's bringing the cloud experience to wherever the data lives. I think that's what I'm hearing from you: those are the three big superpowers of innovation today.

>> That's exactly right. So, you know, I have to say, I think we all know that data analytics and machine learning deliver no real value unless the volume of data is there to truly predict and influence the future. So the last seven to 10 years have been, correctly, about collecting the data and getting the data into a common location, and HDFS was well designed for that. But we live in a capitalist world, and some companies stepped in and tried to make HDFS and the broader Hadoop ecosystem be the single solution to big data. It's not true. So now the key is, how do we take advantage of all of that data? And that's exactly what Vertica is focusing on. As you know, we began our journey with Vertica back in the day, in 2007, with our first release, and we saw the growth of Hadoop. So we announced, many years ago, Vertica SQL on Hadoop, the idea being to deploy Vertica on Hadoop nodes and query the data in Hadoop. We wanted to help. Now, with Vertica 10, we are also introducing Vertica in Eon Mode for HDFS, and we can talk more about that. Vertica in Eon Mode for HDFS is a way to apply an ANSI SQL database management platform to HDFS infrastructure and to data in HDFS file storage. And that is a great way to leverage the investment that so many companies have made in HDFS. And I think it's fair to the elephant to treat her well.

>> Okay, let's get into the hard news around 10.0. You've got a mature stack, but what are some of the highlights of 10.0? And then we can drill into some of the technologies.

>> Absolutely. So, in 2018, Vertica announced Vertica in Eon Mode, which is the separation of compute from storage. Now, this is a great example of Vertica embracing innovation. Vertica was designed for on-premises data centers and bare-metal servers with tightly coupled storage: DL380s from Hewlett Packard Enterprise, Dell, etcetera. But we saw that cloud computing was fundamentally changing data center architectures, and it made sense to separate compute from storage. You add compute when you need compute, you add storage when you need storage. That's exactly what the cloud introduced, but it was only available in the cloud. So the first thing we did was architect Vertica in Eon Mode, which is not a new product. This is really important: it's a deployment option.
In 2018, our customers had the opportunity to deploy their Vertica licenses in Eon Mode on AWS. Then, in September of 2019, we broke an important barrier: we brought cloud architecture down to earth and announced Vertica in Eon Mode with communal, or shared, storage on premises, leveraging Pure Storage FlashBlade. That gave us all the advantages of separating compute from storage, the workload isolation, the scale-up and scale-down, the ability to manage clusters, and we did it with an on-premises data center. And now, with Vertica 10, we are announcing Vertica in Eon Mode on HDFS and Vertica in Eon Mode on Google Cloud. So what we've got here, in summary, is Vertica in Eon Mode across multiple clouds and multiple on-premises storage options, and that gives us the opportunity to help our customers both with the hybrid and multi-cloud strategies they have and with unifying their data silos. But Vertica 10 goes farther.

>> Well, let me stop you there, because I want to mention that we talked to Joe Gonzalez at Mass Mutual, who essentially was brought in, and one of his tasks was to lead the move into Eon Mode. Why? Because they had three separate data silos and they wanted to bring those together. They're investing heavily in technology, Joe is an expert, they really put data at their core, and Eon Mode was a key part of that, because they're using S3. So that was a very important step for those guys. Carry on: what else do we need to know?

>> So one of the reasons, for example, that Mass Mutual is so excited about Eon Mode is the operational advantages. Think about exactly what Joe told you: multiple clusters serving multiple use cases, and maybe multiple divisions. And look, let's be clear. Marketing doesn't always get along with finance, finance doesn't necessarily get along with ops, and IT is often caught in the middle. Vertica in Eon Mode allows workload isolation, meaning allocating the compute resources that different use cases need without allowing them to interfere with other use cases, while still allowing everybody to access the data. So it's a great way to bring the corporate world together but still protect them from each other. And that's one of the things Mass Mutual is going to benefit from, as will so many of our other customers.
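To make the Eon Mode deployment option concrete, here is a rough sketch of standing up an Eon Mode database against S3 with admintools. This is not from the interview: the bucket, paths, node names, shard count, and password are placeholders, and exact flags can vary by Vertica release, so treat it as illustrative only.

    # auth_params.conf is assumed to hold awsauth = <access_key>:<secret_key>
    # and awsregion = <region>; every name below is invented for the example.
    admintools -t create_db \
      -x /home/dbadmin/auth_params.conf \
      --communal-storage-location=s3://example-vertica-bucket/eon_db \
      --depot-path=/vertica/depot \
      --shard-count=6 \
      -s vnode01,vnode02,vnode03 \
      -d eon_db -p 'example_password'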
>> I also want to mention: when I saw you last year at the Pure Storage Accelerate conference, you said, "Today we are the only company that separates compute from storage that runs on-prem and in the cloud." I had to think about it, I researched it, and I still can't find anybody else who does. So I want to mention that you actually beat a number of the cloud players to that capability. Good job, and I think it's a differentiator, assuming you're giving me that cloud experience and the licensing and pricing capability. So I want to talk about that a little bit.

>> Well, you're absolutely right. So let's be clear: there is no question that the public clouds introduced the separation of compute from storage and the advantages that come with it, but they do not have the ability, or the interest, to replicate that on premises. Vertica was born to be software only. We make no money on underlying infrastructure, we don't charge for the hardware underneath as part of a package, so we are totally motivated to be independent of that and to continuously optimize the software to be as efficient as possible. And we do the exact same thing with, to your question, licensing. Cloud providers charge per node instance; that's how they charge for their underlying infrastructure. Well, in some cases, if you have a whole lot of data but not a lot of compute for that workload, it may make sense to pay per node with unlimited data. But what if you have a huge compute need on a relatively small data set? That's not so good. Vertica offers per-node and per-terabyte licenses, depending on the customer's use case. We also offer perpetual licenses for customers who want CapEx, and subscriptions for companies that say, nope, I have to have OpEx. And while this can certainly cause some complexity for our field organization, we know that it's all about choice; everybody in today's world wants it personalized just for me. And that's exactly what we're doing with our pricing and licensing.

>> So just to clarify: I can pay by the drink if I want to, you're not going to force me into a term, or I can choose to have more predictable pricing. Is that correct?

>> Well, it's partially correct. First, Vertica subscription licensing is a fixed amount for the period of the subscription. We do that because so many of our customers, and I'm one of them, by the way, cannot tell finance that the forecast for the quarter has changed after they've already said what it's going to be. So our subscription pricing is a fixed amount for a period of time. However, we do respect the fact that some companies want usage-based pricing. So on AWS you can use Vertica by the hour and pay by the hour, and we are about to launch the very same thing on Google Cloud. So for us it's about what you need, and we make it happen natively, directly with us or through AWS and Google Cloud.

>> So the fixed amount isn't some floor, and then if I want to surge above that, you allow usage pricing if I'm in the cloud. Correct?

>> Well, you can actually license your cluster by the hour on AWS and run your cluster there, or you can buy a license from Vertica for a fixed capacity or a fixed number of nodes and deploy it on the cloud. And then, if you want to add more nodes or more capacity, you can. It's not usage-based for a license that you bring to the cloud, but if you purchase through the cloud provider, it is usage-based.

>> And you guys are in the marketplace, is that right? So again, if I want OpEx, I can choose to do that.

>> That's exactly right: usage through the AWS marketplace, or directly from Vertica.

>> Because every small business that has gone to a salesforce management system knows this. Okay, great, I can pay by the month. Well, yeah, not really: here's our three-year term. And it's very frustrating.

>> Well, and even in the public cloud you can pay by the hour or by the minute or whatever, but it becomes pretty obvious that you're better off if you have reserved instance types or committed amounts. That's why Vertica offers a subscription that says, hey, you want 100 terabytes for the next year? Here's what it will cost. And we do interval billing: you want monthly, quarterly, bi-annual, we'll do that. But we won't charge you for usage that you didn't even know you were using until after you get the bill. And frankly, that's something my finance team does not like.
>> Yeah, I know this is kind of a wonky discussion, but so many people gloss over the licensing and the pricing, and I think my takeaway here is optionality: you price it your way. That's great, thank you for that clarification. Okay, so you've got Google Cloud, and I want to talk about storage optionality. If I sum it up, I've got HDFS, I've got, I'm presuming, Google now, and you've got Pure.

>> Which is an S3-compatible storage, yes.

>> So what's your story there?

>> Google object store, Amazon S3 object store, HDFS, Pure Storage FlashBlade, which is an object store on-prem. And we are continuing down this path because, ultimately, we know that our customers need the option of a next-generation data center architecture, which is this sort of shared, or communal, storage: all the data is in one place, and workloads can be managed independently on that data. That's exactly what we're doing, and we already have two public clouds and two on-premises deployment options today. And as you said, I did challenge you back when we saw each other at the conference: today, Vertica is the only analytic data warehouse platform that offers that option both on premises and in multiple public clouds.
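As a side note, the on-premises options listed here generally work by pointing Vertica's communal storage at an S3-compatible endpoint. The parameter names below are the ones recent Vertica releases document for this, but the host, port, and credentials are invented, so treat this as a sketch rather than a recipe.

    # auth_params.conf for an on-prem S3-compatible object store (values are placeholders):
    awsauth = example_access_key:example_secret_key
    awsendpoint = objectstore.example.local:80
    awsenablehttps = 0

The create_db call shown earlier would then point --communal-storage-location at a bucket on that endpoint, for example s3://onprem-bucket/eon_db.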
>> Okay, let's go back through the innovation cocktail, as I call it: it's the data, applying machine intelligence to that data, and then scaling with the cloud, which we've talked about, along with some of the other advantages. Let's talk about the machine intelligence, the machine learning piece of it. What's your story there? Give us any updates on your embrace of tooling and the like.

>> Well, quite a few years ago we began building native in-database machine learning algorithms into Vertica, and the reason we did that was we knew that the architecture of MPP columnar execution would dramatically improve performance. We also knew that a lot of people speak SQL but, at the time, not so many people spoke R or even Python. So what if we could give access to machine learning in the database via SQL and deliver that kind of performance? That's the journey we started on. Then we realized that machine learning is a lot more, as everybody knows, than just algorithms. So we built in the full end-to-end machine learning functions, from data preparation to model training, model scoring and evaluation, all the way through to deployment, and all of it, again, SQL-accessible: you speak SQL, you speak to the data. The other advantage of this approach was that we realized accuracy is compromised if you downsample. If you move a portion of the data from the database to a specialty machine learning platform, you are challenged on accuracy, and also on what the industry is calling replicability. That means if a model makes a decision, let's say credit scoring, and that decision is in any way challenged, you have to be able to replicate it to prove that you made the decision correctly. There was a bit of a blow-up in the media not too long ago about a credit scoring decision that appeared to be gender-biased, but unfortunately, because the model could not be replicated, there was no way to disprove that, and that was not a good thing. So all of this is built into Vertica, and with Vertica 10 we've taken the next step. Just like with Hadoop, we know that innovation happens within Vertica but also outside of Vertica. We saw that data scientists really love their preferred languages, like Python, and they love their tools and platforms, like TensorFlow. With Vertica 10 we now integrate even more with Python, which we have done for a while, and we also add TensorFlow integration and PMML support. What does that mean? It means that if you build and train a model external to Vertica, using the machine learning platform that you like, you can import that model into Vertica and run it as part of the full end-to-end process, but run it on all the data. No more accuracy challenges, and with MPP columnar execution it's blazing fast. And if somebody wants to know why a model made a decision, you can replicate that model and you can explain why. Those are very powerful. And it's also another cultural unification, Dave: it unifies the business analyst community, who speak SQL, with the data scientist community, who love their tools like TensorFlow and Python.

>> Well, I think, Joy, that's important, because so much of machine intelligence and AI has a black box problem. If you can't replicate the model, then you run into potential gender bias, as in the example you're talking about, and there are many others. Let's say an individual is very wealthy: he goes for a mortgage and gets accepted, and his wife goes for some credit and gets rejected. It's the same household, but the bias in the model may be gender bias, it could be race bias. So being able to replicate that, open it up, and make the machine intelligence transparent is very, very important.

>> It really is. And that replicability, as well as accuracy, is critical, because if you're downsampling and running models on different sets of data, things can get confusing. And yet you don't really have a choice, because if you're talking about petabytes of data, and you need to export that data to a machine learning platform and then try to put it back and get the next model out the next day, you're looking at way too much time. Doing it in the database, or training the model elsewhere and then importing it into the database for production, that's what Vertica allows, and our customers are all over it. They are the ones that have always been, sort of, the trailblazers, and this is the next step in blazing the ML trail.
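To make the "you speak SQL to the models" idea concrete, here is a simplified sketch of that workflow. The table, column, and model names are invented, and function signatures can differ slightly between Vertica versions, so check the documentation for your release.

    -- Train a regression model entirely inside the database (no data export).
    SELECT LINEAR_REG('credit_model', 'applicant_history', 'credit_score',
                      'income, debt_ratio, tenure_months');

    -- Score new rows with the stored model.
    SELECT applicant_id,
           PREDICT_LINEAR_REG(income, debt_ratio, tenure_months
                              USING PARAMETERS model_name = 'credit_model') AS predicted_score
    FROM   new_applications;

    -- Bring in a model trained outside Vertica (for example, exported as PMML)
    -- so it can be scored in place, on all the data.
    SELECT IMPORT_MODELS('/models/external_model.pmml' USING PARAMETERS category = 'PMML');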
>> So, Joy, customers want analytics, full-function analytics. What are they pushing you for now? What are you delivering? What's your thought on that?

>> Well, I would say the number one thing our customers are demanding right now is deployment flexibility. Whatever the CEO or the CFO mandated six months ago, whatever that "thou shalt" was, is different now. And what I tell them is that it is impossible to know what you're going to be commanded to do, or what options you might have, in the future. The key is not having to choose, and they are very, very committed to that. We have a large telco customer whose commitment is multi-cloud. Why multi-cloud? Because they see innovation available in different public clouds and they want to take advantage of all of them. They also, admittedly, see the risk of lock-in; like with any vendor, they don't want that either, so they want multi-cloud. We have other customers who say, we have some workloads that make sense for the cloud and some that we absolutely cannot put in the cloud, but we want a unified analytics strategy. So they are adamant about deployment flexibility; that's what I'd say is first. Second, I would say the interest in operationalizing machine learning, but not necessarily forcing the analytics team to hammer the data science team about which tools are the best tools. That's probably number two. And then I'd say number three: when you look at companies like Uber, or The Trade Desk, or AT&T, or Cerner, it's performance at scale. When they say milliseconds, they think that's slow. When they say petabytes, they're like, yeah, that was yesterday. So, performance at scale: good enough, for Vertica, is never good enough, and it's why we're constantly building out the core, the next-generation execution engine, the database designer, the optimization engine, all that stuff.

>> I want to also ask you: when I first started following Vertica, and theCUBE started covering the BDC, one of the things I noticed in talking to customers and people in the community is that you have a Community Edition, a free edition, and it's not neutered. Have you maintained that ethos through the transitions into Micro Focus? Can you talk about that a little bit?

>> Absolutely. Vertica Community Edition is Vertica. All of the Vertica functionality, geospatial, time series, pattern matching, machine learning, Vertica in Eon Mode, Vertica in Enterprise Mode, all of Vertica is in the Community Edition. The only limitation is one terabyte of data and three nodes, and it's free. Now, if you want commercial support, where you can file a support ticket and things like that, you do have to buy the license. But it's free, and people say, well, free for how long? Our field has asked that, and I say forever, and they say, what do you mean, forever? Because we want people to use Vertica for use cases that are small, where they want to learn, where they want to try, and we see no reason to limit that. What we look for is when they're ready to grow, when they need the next set of data that goes beyond a terabyte, or they need more compute than three nodes; then we're here for them. And it also brings up an important thing that I should remind you of, or tell you about if you haven't heard it, Dave, and that's the Vertica Academy, at academy.vertica.com. What is that? It's self-paced, on-demand training, as well as Vertica Essentials certification. Training and certification means you have seven days with your hands on a Vertica cluster hosted in the cloud to go through all the certification. And guess what: all of that is free. Why would you give it away for free? Because, for us, empowering the market, giving the market the expertise and the learning they need to take advantage of Vertica, is, just like the Community Edition, fundamental to our mission. We see the advantage that Vertica can bring, and we want to make it possible for every company, all around the world, to take advantage of it.

>> I love that ethos of Vertica. Obviously it's a great product, but it's not just the product; it's the business practices, the really progressive pricing, and the embrace of all these trends, not running away from the waves but really leaning in. Joy, thanks so much. Great interview, really appreciate it. And I wish we could have been face to face in Boston, but I think this was the prudent thing to do.

>> I promise you, Dave, we will be, because the Vertica BDC in 2021 is already booked. So I will see you there.

>> Joy King, thanks so much for coming on theCUBE. And thank you for watching.
Remember, theCUBE is running this program in conjunction with the Virtual Vertica BDC. Go to vertica.com/bdc2020 for all the coverage, and keep it right there. This is Dave Vellante with theCUBE. We'll be right back.

Published Date : Mar 31 2020



Keynote Analysis | Virtual Vertica BDC 2020


 

(upbeat music) >> Narrator: It's theCUBE, covering the Virtual Vertica Big Data Conference 2020. Brought to you by Vertica. >> Dave Vellante: Hello everyone, and welcome to theCUBE's exclusive coverage of the Vertica Virtual Big Data Conference. You're watching theCUBE, the leader in digital event tech coverage. And we're broadcasting remotely from our studios in Palo Alto and Boston. And, we're pleased to be covering wall-to-wall this digital event. Now, as you know, originally BDC was scheduled this week at the new Encore Hotel and Casino in Boston. Their theme was "Win big with big data". Oh sorry, "Win big with data". That's right, got it. And, I know the community was really looking forward to that, you know, meet up. But look, we're making the best of it, given these uncertain times. We wish you and your families good health and safety. And this is the way that we're going to broadcast for the next several months. Now, we want to unpack Colin Mahony's keynote, but, before we do that, I want to give a little context on the market. First, theCUBE has covered every BDC since its inception, since the BDC's inception that is. It's a very intimate event, with a heavy emphasis on user content. Now, historically, the data engineers and DBAs in the Vertica community, they comprised the majority of the content at this event. And, that's going to be the same for this virtual, or digital, production. Now, theCUBE is going to be broadcasting for two days. What we're doing, is we're going to be concurrent with the Virtual BDC. We got practitioners that are coming on the show, DBAs, data engineers, database gurus, we got a security experts coming on, and really a great line up. And, of course, we'll also be hearing from Vertica Execs, Colin Mahony himself right of the keynote, folks from product marketing, partners, and a number of experts, including some from Micro Focus, which is the, of course, owner of Vertica. But I want to take a moment to share a little bit about the history of Vertica. The company, as you know, was founded by Michael Stonebraker. And, Verica started, really they started out as a SQL platform for analytics. It was the first, or at least one of the first, to really nail the MPP column store trend. Not only did Vertica have an early mover advantage in MPP, but the efficiency and scale of its software, relative to traditional DBMS, and also other MPP players, is underscored by the fact that Vertica, and the Vertica brand, really thrives to this day. But, I have to tell you, it wasn't without some pain. And, I'll talk a little bit about that, and really talk about how we got here today. So first, you know, you think about traditional transaction databases, like Oracle or IMBDB tour, or even enterprise data warehouse platforms like Teradata. They were simply not purpose-built for big data. Vertica was. Along with a whole bunch of other players, like Netezza, which was bought by IBM, Aster Data, which is now Teradata, Actian, ParAccel, which was the basis for Redshift, Amazon's Redshift, Greenplum was bought, in the early days, by EMC. And, these companies were really designed to run as massively parallel systems that smoked traditional RDBMS and EDW for particular analytic applications. You know, back in the big data days, I often joked that, like an NFL draft, there was run on MPP players, like when you see a run on polling guards. You know, once one goes, they all start to fall. And that's what you saw with the MPP columnar stores, IBM, EMC, and then HP getting into the game. 
So, it was like 2011, and Leo Apotheker, he was the new CEO of HP. Frankly, he has no clue, in my opinion, with what to do with Vertica, and totally missed one the biggest trends of the last decade, the data trend, the big data trend. HP picked up Vertica for a song, it wasn't disclosed, but my guess is that it was around 200 million. So, rather than build a bunch of smart tokens around Vertica, which I always call the diamond in the rough, Apotheker basically permanently altered HP for years. He kind of ruined HP, in my view, with a 12 billion dollar purchase of Autonomy, which turned out to be one of the biggest disasters in recent M&A history. HP was forced to spin merge, and ended up selling most of its software to Microsoft, Micro Focus. (laughs) Luckily, during its time at HP, CEO Meg Whitman, largely was distracted with what to do with the mess that she inherited form Apotheker. So, Vertica was left alone. Now, the upshot is Colin Mahony, who was then the GM of Vertica, and still is. By the way, he's really the CEO, and he just doesn't have the title, I actually think they should give that to him. But anyway, he's been at the helm the whole time. And Colin, as you'll see in our interview, is a rockstar, he's got technical and business jobs, people love him in the community. Vertica's culture is really engineering driven and they're all about data. Despite the fact that Vertica is a 15-year-old company, they've really kept pace, and not been polluted by legacy baggage. Vertica, early on, embraced Hadoop and the whole open-source movement. And that helped give it tailwinds. It leaned heavily into cloud, as we're going to talk about further this week. And they got a good story around machine intelligence and AI. So, whereas many traditional database players are really getting hurt, and some are getting killed, by cloud database providers, Vertica's actually doing a pretty good job of servicing its install base, and is in a reasonable position to compete for new workloads. On its last earnings call, the Micro Focus CFO, Stephen Murdoch, he said they're investing 70 to 80 million dollars in two key growth areas, security and Vertica. Now, Micro Focus is running its Suse play on these two parts of its business. What I mean by that, is they're investing and allowing them to be semi-autonomous, spending on R&D and go to market. And, they have no hardware agenda, unlike when Vertica was part of HP, or HPE, I guess HP, before the spin out. Now, let me come back to the big trend in the market today. And there's something going on around analytic databases in the cloud. You've got companies like Snowflake and AWS with Redshift, as we've reported numerous times, and they're doing quite well, they're gaining share, especially of new workloads that are merging, particularly in the cloud native space. They combine scalable compute, storage, and machine learning, and, importantly, they're allowing customers to scale, compute, and storage independent of each other. Why is that important? Because you don't have to buy storage every time you buy compute, or vice versa, in chunks. So, if you can scale them independently, you've got granularity. Vertica is keeping pace. In talking to customers, Vertica is leaning heavily into the cloud, supporting all the major cloud platforms, as we heard from Colin earlier today, adding Google. 
And, why my research shows that Vertica has some work to do in cloud and cloud native, to simplify the experience, it's more robust in motor stack, which supports many different environments, you know deep SQL, acid properties, and DNA that allows Vertica to compete with these cloud-native database suppliers. Now, Vertica might lose out in some of those native workloads. But, I have to say, my experience in talking with customers, if you're looking for a great MMP column store that scales and runs in the cloud, or on-prem, Vertica is in a very strong position. Vertica claims to be the only MPP columnar store to allow customers to scale, compute, and storage independently, both in the cloud and in hybrid environments on-prem, et cetera, cross clouds, as well. So, while Vertica may be at a disadvantage in a pure cloud native bake-off, it's more robust in motor stack, combined with its multi-cloud strategy, gives Vertica a compelling set of advantages. So, we heard a lot of this from Colin Mahony, who announced Vertica 10.0 in his keynote. He really emphasized Vertica's multi-cloud affinity, it's Eon Mode, which really allows that separation, or scaling of compute, independent of storage, both in the cloud and on-prem. Vertica 10, according to Mahony, is making big bets on in-database machine learning, he talked about that, AI, and along with some advanced regression techniques. He talked about PMML models, Python integration, which was actually something that they talked about doing with Uber and some other customers. Now, Mahony also stressed the trend toward object stores. And, Vertica now supports, let's see S3, with Eon, S3 Eon in Google Cloud, in addition to AWS, and then Pure and HDFS, as well, they all support Eon Mode. Mahony also stressed, as I mentioned earlier, a big commitment to on-prem and the whole cloud optionality thing. So 10.0, according to Colin Mahony, is all about really doubling down on these industry waves. As they say, enabling native PMML models, running them in Vertica, and really doing all the work that's required around ML and AI, they also announced support for TensorFlow. So, object store optionality is important, is what he talked about in Eon Mode, with the news of support for Google Cloud and, as well as HTFS. And finally, a big focus on deployment flexibility. Migration tools, which are a critical focus really on improving ease of use, and you hear this from a lot of customers. So, these are the critical aspects of Vertica 10.0, and an announcement that we're going to be unpacking all week, with some of the experts that I talked about. So, I'm going to close with this. My long-time co-host, John Furrier, and I have talked some time about this new cocktail of innovation. No longer is Moore's law the, really, mainspring of innovation. It's now about taking all these data troves, bringing machine learning and AI into that data to extract insights, and then operationalizing those insights at scale, leveraging cloud. And, one of the things I always look for from cloud is, if you've got a cloud play, you can attract innovation in the form of startups. It's part of the success equation, certainly for AWS, and I think it's one of the challenges for a lot of the legacy on-prem players. Vertica, I think, has done a pretty good job in this regard. And, you know, we're going to look this week for evidence of that innovation. One of the interviews that I'm personally excited about this week, is a new-ish company, I would consider them a startup, called Zebrium. 
What they're doing, is they're applying AI to do autonomous log monitoring for IT ops. And, I'm interviewing Larry Lancaster, who's their CEO, this week, and I'm going to press him on why he chose to run on Vertica and not a cloud database. This guy is a hardcore tech guru and I want to hear his opinion. Okay, so keep it right there, stay with us. We're all over the Vertica Virtual Big Data Conference, covering in-depth interviews and following all the news. So, theCUBE is going to be interviewing these folks, two days, wall-to-wall coverage, so keep it right there. We're going to be right back with our next guest, right after this short break. This is Dave Vellante and you're watching theCUBE. (upbeat music)

Published Date : Mar 31 2020



Dan Woicke, Cerner Corporation | Virtual Vertica BDC 2020


 

(gentle electronic music) >> Hello, everybody, welcome back to the Virtual Vertica Big Data Conference. My name is Dave Vellante and you're watching theCUBE, the leader in digital coverage. This is the Virtual BDC, as I said, theCUBE has covered every Big Data Conference from the inception, and we're pleased to be a part of this, even though it's challenging times. I'm here with Dan Woicke, the senior director of CernerWorks Engineering. Dan, good to see ya, how are things where you are in the middle of the country? >> Good morning, challenging times, as usual. We're trying to adapt to having the kids at home, out of school, trying to figure out how they're supposed to get on their laptop and do virtual learning. We all have to adapt to it and figure out how to get by. >> Well, it sure would've been my pleasure to meet you face to face in Boston at the Encore Casino, hopefully next year we'll be able to make that happen. But let's talk about Cerner and CernerWorks Engineering, what is that all about? >> So, CernerWorks Engineering, we used to be part of what's called IP, or Intellectual Property, which is basically the organization at Cerner that does all of our software development. But what we did was we made a decision about five years ago to organize my team with CernerWorks which is the hosting side of Cerner. So, about 80% of our clients choose to have their domains hosted within one of the two Kansas City data centers. We have one in Lee's Summit, in south Kansas City, and then we have one on our main campus that's a brand new one in downtown, north Kansas City. About 80, so we have about 27,000 environments that we manage in the Kansas City data centers. So, what my team does is we develop software in order to make it easier for us to monitor, manage, and keep those clients healthy within our data centers. >> Got it. I mean, I think of Cerner as a real advanced health tech company. It's the combination of healthcare and technology, the collision of those two. But maybe describe a little bit more about Cerner's business. >> So we have, like I said, 27,000 facilities across the world. Growing each day, thank goodness. And, our goal is to ensure that we reduce errors and we digitize the entire medical records for all of our clients. And we do that by having a consulting practice, we do that by having engineering, and then we do that with my team, which manages those particular clients. And that's how we got introduced to the Vertica side as well, when we introduced them about seven years ago. We were actually able to take a tremendous leap forward in how we manage our clients. And I'd be more than happy to talk deeper about how we do that. >> Yeah, and as we get into it, I want to understand, healthcare is all about outcomes, about patient outcomes and you work back from there. IT, for years, has obviously been a contributor but removed, and somewhat indirect from those outcomes. But, in this day and age, especially in an organization like yours, it really starts with the outcomes. I wonder if you could ratify that and talk about what that means for Cerner. >> Sorry, are you talking about medical outcomes? >> Yeah, outcomes of your business. >> So, there's two different sides to Cerner, right? There's the medical side, the clinical side, which is obviously our main practice, and then there's the side that I manage, which is more of the operational side. Both are very important, but they go hand in hand together. 
On the operational side, the goal is to ensure that our clinicians are on the system, and they don't know they're on the system, right? Things are progressing, doctors don't want to be on the system, trust me. My job is to ensure they're having the most seamless experience possible while they're on the EMR and have it just be one of their side jobs as opposed to taking their attention away from the patients. That make sense? >> Yeah it does, I mean, EMR and meaningful use, around the Affordable Care Act, really dramatically changed the unit. I mean, people had to demonstrate in order to get paid, and so that became sort of an unfunded mandate for folks and you really had to respond to that, didn't you? >> We did, we did that about three to four years ago. And we had to help our clients get through what's called meaningful use, there was different stages of meaningful use. And what we did, is we have the website called the Lights On Network which is free to all of our clients. Once you get onto the website the Lights On Network, you can actually show how you're measured and whether or not you're actually completing the different necessary tasks in order to get those payments for meaningful use. And it also allows you to see what your performance is on your domain, how the clinicians are doing on the system, how many hours they're spending on the system, how many orders they're executing. All of that is completely free and visible to our clients on the Lights On Network. And that's actually backed by some of the Vertica software that we've invested in. >> Yeah, so before we get into that, it sounds like your mission, really, is just great user experiences for the people that are on the network. Full stop. >> We do. So, one of the things that we invented about 10 years ago is called RTMS Timers. They're called Response Time Measurement System. And it started off as a way of us proving that clients are actually using the system, and now it's turned into more of a user outcomes. What we do is we collect 2.5 billion timers per day across all of our clients across the world. And every single one of those records goes to the Vertica platform. And then we've also developed a system on that which allows us in real time to go and see whether or not they're deviating from their normal. So we do baselines every hour of the week and then if they're deviating from those baselines, we can immediately call a service center and have them engage the client before they call in. >> So, Dan, I wonder if you could paint a picture. By the way, that's awesome. I wonder if you could paint a picture of your analytics environment. What does it look like? Maybe give us a sense of the scale. >> Okay. So, I've been describing how we operate, our remote hosted clients in the two Kansas City data centers, but all the software that we write, we also help our client hosted agents as well. Not only do we take care of what's going on at the Kansas City data center, but we do write software to ensure that all of clients are treated the same and we provide the same level of care and performance management across all those clients. So what we do is we have 90,000 agents that we have split across all these clients across the world. And every single hour, we're committing a billion rows to Vertica of operational data. So I talked a little bit about the RTMS timers, but we do things just like everyone else does for CPU, memory, Java Heap Stack. 
We can tell you how many concurrent users are on the system, I can tell you if there's an application that goes down unexpectedly, like a crash, and I can tell you the response time from the network, as most of us use Citrix at Cerner. What we do is measure the amount of time it takes from the client-side PCs sitting in the hospitals, round trip to the Citrix servers sitting in the Kansas City data center; that's called RTT, our round-trip transactions. And what we've done over the last couple of years is switch from just summarizing CPU and memory and all that high-level stuff to going down to a user level. So: what are you doing today, Dr. Smith? How many hours are you using the EMR? Have you experienced any slowness? Have you experienced any hourglass-holding within your application? Have you experienced, unfortunately, maybe a crash? Have you experienced any slowness compared to your normal use? That's the step we've taken over the last few years, going from summarization of high-level CPU and memory over to outcome metrics, which reflect what is really happening for a particular user.

>> So, really granular views of how the system is being used, and deep analytics on that. I wonder, go ahead, please.

>> And we weren't able to do that by summarizing things in traditional databases. You have to actually have the individual rows; you can't summarize the information, you have to have individual metrics that point to exactly what's going on with a particular clinician.

>> So, okay, the MPP architecture, the columnar store, the scalability of Vertica, that's what's key. That was my next question: take us back to the days of traditional RDBMS and then bringing in Vertica. Maybe you could give us a sense of why, and what it did for you, the before and after.

>> Right. So I'd been painting a picture of how, traditionally, eight years ago, all we could do was summarize information. If CPU was going to jump up 8%, I could alarm the data center and say, hey, listen, CPU looks like it's higher, maybe an application is hanging more than it has been in the past, things are a little slower, but I wouldn't be able to tell you who's affected. That's where the whole thing changed when we brought Vertica in six years ago: we're able to take those 90,000 agents, commit a billion rows per hour of operational data, and I can tell you exactly what's going on with each of our clinicians. Because, you know, it's important for an entire domain to be healthy, but what about the 10 doctors that are experiencing frustration right now? If you summarize that information and roll it up, you'll never know what those 10 doctors are experiencing, and then guess what happens? They call the data center and complain, right? The squeaky wheels. We don't want that; we want to be able to show exactly who's experiencing bad performance right now and reach out to them before they call the help desk.
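Cerner has not published its schema, so the table and columns below are hypothetical; this is only a sketch of the kind of hour-of-week baseline comparison being described here, written as SQL a Vertica cluster could run.

    -- Hypothetical table: rtms_timers(client_id, user_id, event_ts, response_ms)
    -- Flag clients whose current-hour response time deviates sharply from their
    -- baseline for this hour of the week.
    WITH baseline AS (
        SELECT client_id,
               DAYOFWEEK(event_ts) AS dow,
               HOUR(event_ts)      AS hr,
               AVG(response_ms)    AS avg_ms,
               STDDEV(response_ms) AS sd_ms
        FROM   rtms_timers
        WHERE  event_ts >= CURRENT_TIMESTAMP - INTERVAL '28 days'
        GROUP  BY 1, 2, 3
    ),
    current_hour AS (
        SELECT client_id, AVG(response_ms) AS avg_ms
        FROM   rtms_timers
        WHERE  event_ts >= DATE_TRUNC('hour', CURRENT_TIMESTAMP)
        GROUP  BY 1
    )
    SELECT c.client_id, c.avg_ms AS current_ms, b.avg_ms AS baseline_ms
    FROM   current_hour c
    JOIN   baseline b
      ON   b.client_id = c.client_id
     AND   b.dow = DAYOFWEEK(CURRENT_TIMESTAMP)
     AND   b.hr  = HOUR(CURRENT_TIMESTAMP)
    WHERE  c.avg_ms > b.avg_ms + 3 * b.sd_ms;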
>> So you're able to be proactive there. You've gone from "Houston, we have a problem, we really can't tell you what it is, go figure it out" to "we see that there's an issue with these docs, or these users, go figure that out," and you can focus narrowly on where the problem is, as opposed to playing whack-a-mole.

>> Exactly. And the other big thing we've been able to do is correlation. We operate two gigantic data centers, and there are things that are shared: switches, network, shared storage. So if there is an issue with one of those pieces of equipment, it could affect multiple clients. Now that we have every row in Vertica, we have a new program in place called performance abnormality flags. What we're able to do is provide a website, in real time, that goes through the entire stack, from Citrix to network to database to back-end tier, all the way to the end-user desktop. So if something is related, because we have a network switch going out in the data center or something is backing up slowly, you can actually see which clients are on that switch. Five years ago, before this, we would deploy five different teams to troubleshoot, right? Because five clients would call in, and they would all have the same problem, so here you are with separate teams trying to investigate why the same problem is happening. Now that we have all of the data within Vertica, we're able to show that in real time, through a very transparent dashboard.

>> And so, operational metrics throughout the stack, right? A game changer.

>> It's the whole stack, right? I just labeled five different things: the stack from your end-user device, all the way through the back end to your database, and all the way back. All of that has to work properly, right? Including the network.

>> How big is this, what are we talking about? However you measure it: terabytes, clusters. What can you share there?

>> Sorry, you mean the amount of data that we process within our data centers?

>> Give us a fun fact.

>> Absolute petabytes, yeah, for sure. In Vertica right now we have two petabytes of data, and I purge it out every year, so it's one year's worth of data, within two different clusters. Across the two data centers I've been describing, we've set Vertica up to be highly redundant: one cluster is configured to do real-time analysis and correlation research, and the other one provides service for what I described earlier as our Lights On Network. It's a very dedicated, hardened cluster in one of our data centers that allows the Lights On Network to provide transparency directly to our clients. We want that one to be pristine and fast, and nobody touches it, as opposed to the other one, where people are doing real-time, ad hoc queries, which sometimes aren't the best thing in the world. No matter what kind of database you have or how fast it is, people do bad things in databases, and we just don't want that to affect what we show our clients in a transparent fashion.

>> Yeah, and for our audience, Vertica has always been aimed at these big, hairy analytic problems. It's not for a tiny little data mart in a department; it's really for the big-scale problems. I wonder if I could ask you: you guys are obviously healthcare, with HIPAA and privacy, so are you doing anything in the cloud, or is it all on-prem today?

>> So, in the operational space that I manage, it's all on-premises, and that is changing. As I was describing earlier, we have an initiative to go to AWS and provide levels of service to countries like Sweden, which does not want any operational data to leave the country's walls, whether it be operational data or PHI. And so we have to be able to adopt Vertica in Eon Mode in order to provide the same services within Sweden.
So obviously Cerner is not going to go and build a data center in every single country that requires this, so we're going to leverage our partnership with AWS to make it happen.

>> Okay, so I was going to ask you: you're not running Eon Mode today, but it's something you're obviously interested in, and AWS will allow you to keep the data locally in that region. In talking to a lot of practitioners, they're intrigued by this notion of being able to scale storage independently of compute. They've said they wish they had it; it's a much more efficient way: I don't have to buy in chunks, and if I'm out of storage, I don't have to buy compute, and vice versa. So maybe you could share what you're thinking. I know it's early days, but what's the logic behind the business case?

>> I think you're 100% correct in your assessment of separating compute from storage. We do exactly what you say: we buy a server, and it has so much compute on it and so much storage, and obviously it's not scaled properly, right? Either storage runs out first or compute runs out first, but you're still paying big bucks for the entire server. So that's exactly why we're doing the POC right now for Eon Mode. And I sit on Vertica's TAB, the advisory board, and they've been doing a really good job of taking our requirements and listening to us about what we need. Separating storage from compute was probably number one or two on everybody's list, and that's exactly what we're trying to do right now.

>> Yeah, it's interesting. I've talked to some other customers that are on the customer advisory board, and Vertica is one of these companies that are pretty transparent about what goes on there. I think for the early adopters of Eon Mode there were some challenges with getting data into the new system, and I know Vertica has been working on that very hard. But you guys push Vertica pretty hard, and from what I can tell, they listen. Your thoughts?

>> They do listen, they do a great job. Even though the Big Data Conference is canceled, they're committed to having us attend the CAB meeting virtually on Monday, so I'm looking forward to that. They do listen to our requirements, and they've been very, very responsive.

>> Nice. So I wonder if you could give us some final thoughts on where you want to take this thing. If you look down the road a year or two, what does success look like, Dan?

>> That's a good question. Success means we're a little bit more nimble in the different regions across the world where we can provide our services. I want to do more correlation. I want to gather more information about what users are actually experiencing. I want our phone to never ring in the data center; I know that's a grand goal. But I want to be able to look forward to measuring the data internally, reaching out to our clients when they have issues, and then doing the proper correlation so that I can understand how things are intertwined when multiple clients are having an issue. That's the goal going forward.

>> Well, in these trying times, during this crisis, it's critical that your operations run smoothly. The last thing organizations need right now, especially in healthcare, is disruption. So thank you for all the hard work that you and your teams are doing. I wish you and your family all the best. Stay safe, stay healthy, and thanks so much for coming on theCUBE.

>> I really appreciate it, thanks for the opportunity.
>> You're very welcome, and thank you, everybody, for watching, keep it right there, we'll be back with our next guest. This is Dave Vellante for theCUBE. Covering Virtual Vertica Big Data Conference. We'll be right back. (upbeat electronic music)

Published Date : Mar 31 2020


SENTIMENT ANALYSIS :

ENTITIES

EntityCategoryConfidence
Dan WoickePERSON

0.99+

Dave VellantePERSON

0.99+

AWSORGANIZATION

0.99+

CernerORGANIZATION

0.99+

Affordable Care ActTITLE

0.99+

BostonLOCATION

0.99+

100%QUANTITY

0.99+

DanPERSON

0.99+

10 doctorsQUANTITY

0.99+

SwedenLOCATION

0.99+

90,000 agentsQUANTITY

0.99+

five clientsQUANTITY

0.99+

CernerWorksORGANIZATION

0.99+

8%QUANTITY

0.99+

twoQUANTITY

0.99+

Kansas CityLOCATION

0.99+

SmithPERSON

0.99+

VerticaORGANIZATION

0.99+

Cerner CorporationORGANIZATION

0.99+

next yearDATE

0.99+

MondayDATE

0.99+

BothQUANTITY

0.99+

todayDATE

0.99+

one yearQUANTITY

0.99+

a yearQUANTITY

0.99+

27,000 facilitiesQUANTITY

0.99+

HoustonLOCATION

0.99+

oneQUANTITY

0.99+

two petabytesQUANTITY

0.99+

five years agoDATE

0.99+

CernerWorks EngineeringORGANIZATION

0.98+

south Kansas CityLOCATION

0.98+

eight years agoDATE

0.98+

about 80%QUANTITY

0.98+

Virtual Vertica Big Data ConferenceEVENT

0.98+

CitrixORGANIZATION

0.98+

two different data centersQUANTITY

0.97+

each dayQUANTITY

0.97+

four years agoDATE

0.97+

two different clustersQUANTITY

0.97+

six years agoDATE

0.97+

eachQUANTITY

0.97+

north Kansas CityLOCATION

0.97+

HIPAATITLE

0.97+

five different teamsQUANTITY

0.97+

firstQUANTITY

0.96+

five different thingsQUANTITY

0.95+

two different sidesQUANTITY

0.95+

about 27,000 environmentsQUANTITY

0.95+

both data centersQUANTITY

0.95+

About 80QUANTITY

0.95+

Response Time Measurement SystemOTHER

0.95+

two gigantic data centersQUANTITY

0.93+

Java HeapTITLE

0.92+

UNLIST TILL 4/2 - Sizing and Configuring Vertica in Eon Mode for Different Use Cases


 

>> Jeff: Hello everybody, and thank you for joining us today, in the virtual Vertica BDC 2020. Today's Breakout session is entitled, "Sizing and Configuring Vertica in Eon Mode for Different Use Cases". I'm Jeff Healey, and I lead Vertica Marketing. I'll be your host for this Breakout session. Joining me are Sumeet Keswani, and Shirang Kamat, Vertica Product Technology Engineers, and key leads on the Vertica customer success needs. But before we begin, I encourage you to submit questions or comments during the virtual session, you don't have to wait, just type your question or comment in the question box below the slides, and click submit. There will be a Q&A session at the end of the presentation, we will answer as many questions as we're able to during that time, any questions we don't address, we'll do our best to answer them off-line. Alternatively, visit Vertica Forums, at forum.vertica.com, post your question there after the session. Our Engineering Team is planning to join the forums to keep the conversation going. Also as reminder, that you can maximize your screen by clicking the double arrow button in the lower-right corner of the slides, and yes, this virtual session is being recorded, and will be available to view on-demand this week. We'll send you a notification as soon as it's ready. Now let's get started! Over to you, Shirang. >> Shirang: Thanks Jeff. So, for today's presentation, we have picked Eon Mode concepts, we are going to go over sizing guidelines for Eon Mode, some of the use cases that you can benefit from using Eon Mode. And at last, we are going to talk about, some tips and tricks that can help you configure and manage your cluster. Okay. So, as you know, Vertica has two modes of operation, Eon Mode and Enterprise Mode. So the question that you may have is, which mode should I implement? So let's look at what's there in the Enterprise Mode. Enterprise Mode, you have a cluster, with general purpose compute nodes, that have locally at their storage. Because of this tight integration of compute and storage, you get fast and reliable performance all the time. Now, amount of data that you can store in Enterprise Mode cluster, depends on the total disk capacity of the cluster. Again, Enterprise Mode is more suitable for on premise and cloud deployments. Now, let's look at Eon Mode. To take advantage of cloud economics, Vertica implemented Eon Mode, which is getting very popular among our customers. In Eon Mode, we have compute and storage, that are separated by introducing S3 Bucket, or, S3 compliant storage. Now because of this separation of compute and storage, you can take advantages like mapping all dynamic scale-out and scale-in. Isolation of your workload, as well as you can load data in your cluster, without having to worry about the total disk capacity of your local nodes. Obviously, you know, it's obvious from what they accept, Eon Mode is suitable for cloud deployment. Some of our customers who take advantage of the features of Eon Mode, are also deploying it on premise, by introducing S3 compliant slash web storage. Okay? So, let's look at some of the terminologies used in Eon Mode. The four things that I want to talk about are, communal storage. It's a shared storage, or S3 compliant shared storage, a bucket that is accessible from all the nodes in your cluster. Shard, is a segment of data, stored on the communal storage. Subscription, is the binding with nodes and shards. And last, depot. 
Depot is a local copy or, a local cache, that can help query in group performance. So, shard is a segment of data stored in communal storage. When you create a Eon Mode cluster, you have to specify the shard count. Shard count decide the maximum number of nodes that will participate in your query. So, Vertica also will introduce a shard, called replica shard, that will hold the data for replicated projections. Subscriptions, as I said before, is a binding between nodes and shards. Each node subscribes to one or more shards, and a shard has at least two nodes that subscribe to it for case 50. Subscribing nodes are responsible for writing and reading from shard data. Also subscriber node holds up-to-date metadata for a catalog of files that are present in the shard. So, when you connect to Vertica node, Vertica will automatically assign you set of nodes and subscriptions that will process your query. There are two important system tables. There are node subscriptions, and session subscriptions, that can help you understand this a little bit more. So let's look at what's on the local disk of your Eon Mode cluster. So, on local disk, you have depot. Depot is a local file system cache, that can hold subset of the data, or copy of the data, in communal storage. Other things that are there, are temp storage, temp storage is used for storing data belonging to temporary tables, and, the data that spills through this, when you are processing queries. And last, is catalog. Catalog is a persistent copy of Vertica, catalog that is written to this. The writes happen at every commit. You only need the persistent copy at node startup. There is also a copy of Vertica catalog, stored in communal storage, called durability. The local copy is synced to the copy in communal storage via service, at the interval of five minutes. So, let's look at depot. Now, as I said before, depot is your file system cache. It's help to reduce network traffic, and slow performance of your queries. So, we make assumption, that when we load data in Vertica, that's the data that you may most frequently query. So, every data that is loaded in Vertica is first entering the depot, and then as a part of same transaction, also synced to communal storage for durability. So, when you query, when you run a query against Vertica, your queries are also going to find the files in the depot first, to be used, and if the files are not found, the queries will access files from communal storage. Now, the behavior of... you know, the new files, should first enter the depot or skip depot can be changed by configuration parameters that can help you skip depot when writing. When the files are not found in depot, we make assumption that you may need those files for future runs of your query. Which means we will fetch them asynchronously into the depot, so that you have those files for future runs. If that's not the behavior that you intend, you can change configuration around return, to tell Vertica to not fetch them when you run your query, and this configuration parameter can be set at database level, session level, query level, and we are also introducing a user level parameter, where you can change this behavior. Because the depot is going to be limited in size, compared to amount of data that you may store in your Eon cluster, at some point in time, your depot will be full, or hit the capacity. To make space for new data that is coming in, Vertica will evict some of the files that are least frequently used. 
Hence, depot is going to be your query performance enhancer. You want to shape the extent of your depot. And, so what you want to do is, to decide what shall be in your depot. Now Vertica provides some of the policies, called pinning policies, that can help you pin of statistics table or addition of a table, into a depot, at subcluster level, or at the database level. And Sumeet will talk about this a bit more in his future slides. Now look at some of the system tables that can help you understand about the size of the depot, what's in your depot, what files were evicted, what files were recently fetched into the depot. One of the important system tables that I have listed here is DC_FILE_READS. DC_FILE_READS can be used to figure out if your transaction or query fetched with data from depot, from communal storage, or component. One of the important features of Eon Mode is a subcluster. Vertica lets you divide your cluster into smaller execution groups. Now, each of the execution groups has a set of nodes together subscribed to all the shards, and can process your query independently. So when you connect one node in the subcluster, that node, along with other nodes in the subcluster, will only process your query. And because of that, we can achieve isolation as well as, you know, fetches, scale-out and scale-in without impacting what's happening on the cluster. The good thing about subclusters, is all the subclusters have access to the communal storage. And because of this, if you load data in one subcluster, it's accessible to the queries that are running in other subclusters. When we introduced subclusters, we knew that our customers would really love these features, and, some of the things that we were considering is, we knew that our customers would dynamically scale out and in, lots of-- they would add and remove lots of subclusters on demand, and we had to provide that ab-- we had to give this feature, or provide ability to add and remove subclusters in a fast and reliable way. We knew that during off-peak hours, our customers would shut down many of their subclusters, that means, more than half of the nodes could be down. And we had to make adjustment to our quorum policy which requires at least half of the nodes to be up for database to stay up. We also were aware that customers would add hundreds of nodes in the cluster, which means we had to make adjustments to the catalog and commit policy. To take care of all these three requirements we introduced two types of subclusters, primary subclusters, and secondary subclusters. Primary subclusters is the one that you get by default when you create your first Eon cluster. The nodes in the primary subclusters are always up, that means they stay up and participate in the quorum. The nodes in the primary subcluster are responsible for processing commits, and also maintain a persistent copy, of catalog on disk. This is a subcluster that you would use to process all your ETL jobs, because the topper more also runs on the node, in the primary subcluster. If you want now at this point, have another subcluster, where you would like to run queries, and also, build this cluster up and down depending on the demand or the, depending on the workload, you would create a new subcluster. And this subcluster will be off-site secondary in nature. Now secondary subclusters have nodes that don't participate in quorums, so if these nodes are down, Vertica has no impact. 
These nodes are also not responsible for processing commit, though they maintain up-to-date copies of the catalog in memory. They don't store catalog on disk. And these are subclusters that you can add and remove very quickly, without impacting what is running on the other subclusters. We have customers running hundreds of nodes, subclusters with hundreds of nodes, and subclusters of size like 64 node, and they can bring this subcluster up and down, or add and remove, within few minutes. So before I go into the sizing of Eon Mode, I just want to say one more thing here. We are working very closely with some of our customers who are running Eon Mode and getting better feedback from that on a regular basis. And based on the feedback, we are making lots of improvements and fixes in every hot-fix that we put out. So if you are running Eon Mode, and want to be part of this group, I suggest that, you keep your cluster current with latest hot-fixes and work with us to give us feedback, and get the improvements that you need to be successful. So let's look at what there-- What we need, to size Eon clusters. Sizing Eon clusters is very different from sizing Enterprise Mode cluster. When you are running Enterprise Mode cluster or when you're sizing Vertica cluster running Enterprise Mode, you need to take into account the amount of data that you want to store, and the configuration of your node. Depending on which you decide, how many nodes you will need, and then start the cluster. In Eon Mode, to size a cluster, you need few things like, what should be your shard count. Now, shard count decides the maximum number of nodes that will participate in your query. And we'll talk about this little bit more in the next slide. You will decide on number of nodes that you will need within a subcluster, the instance type you will pick for running statistic subcluster, and how many subclusters you will need, and how many of them should be running all the time, and how many should be running in a dynamic mode. When it comes to shard count, you have to pick shard count up front, and you can't change it once your database is up and running. So, we... So, you need to pick shard count depending the number of nodes, are the same number of nodes that you will need to process a query. Now one thing that we want to remember here, is this is not amount of data that you have in database, but this is amount of data your queries will process. So, you may have data for six years, but if your queries process last month of data, on most of the occasions, or if your dashboards are processing up to six weeks, or ten minutes, based on whatever your needs are, you will decide or pick the number of shards, shard count and nodes, based on how much data your queries process. Looking at most of our customers, we think that 12 is a good number that should work for most of our customers. And, that means, the maximum number of nodes in a subcluster that will process queries is going to be 12. If you feel that, you need more than 12 nodes to process your query, you can pick other numbers like 24 or 48. If you pick a higher number, like 48, and you go with three nodes in your subcluster, that means node subscribes to 16 primary and 16 secondary shard subscription, which totals to 32 subscriptions per node. That will leave your catalog in a broken state. 
So, pick shard count appropriately, don't pick prime numbers, we suggest 12 should work for most of our customers, if you think you process more than, you know, the regular, the regular number that, or you think that your customers, you think your queries process terabytes of data, then pick a number like 24. Don't pick a prime number. Okay? We are also coming up with features in Vertica like current scaling, that will help you run more-- run queries on more than, more nodes than the number of shards that you pick. And that feature will be coming out soon. So if you have picked a smaller shard count, it's not the end of the story. Now, the next thing is, you need to pick how many nodes you need within your subclusters, to process your query. Ideal number would be node number equal to shard count, or, if you want to pick a number that is less, pick node count which is such that each of the nodes has a balanced distribution of subscriptions. When... So over here, you can have, option where you can have 12 nodes and 12 shards, or you can have two subclusters with 6 nodes and 12 shards. Depending on your workload, you can pick either of the two options. The first option, where you have 12 nodes and 12 shards, is more suitable for, more suitable for batch applications, whereas two subclusters with, with six nodes each, is more suitable for desktop type applications. Picking subclusters is, it depends on your workload, you can add remove nodes relative to isolation, or Elastic Throughput Scaling. Your subclusters can have nodes of different sizes, and you need to make sure that the nodes within the subcluster have to be homogenous. So this is my last slide before I hand over to Sumeet. And this I think is very important slide that I want you to pay attention to. When you pick instance, you are going to pick instance based on workload and query budget. I want to make it clear here that we want you to pay attention to the local disk, because you have depot on your local disk, which is going to be your query performance enhancer for all kinds of deployment, in cloud, as well as on premise. So you'd expect of what you read, or what you heard, depots still play a very important role in every Eon deployment, and they act like performance enhancers. Most of our customers choose Vertica because they love the performance we offer, and we don't want you to compromise on the performance. So pick nodes with some amount of local disk, at least two terabytes is what we suggest. i3 instances in Amazon have, you know, come up with a good local disk that is very helpful, and some of our customers are benefiting from. With that said, I want to pass it over to Sumeet. >> Sumeet: So, hi everyone, my name is Sumeet Keswani, and I'm a Product Technology Engineer at Vertica. I will be discussing the various use cases that customers deploy in Eon Mode. After that, I will go into some technical details of SQL, and then I'll blend that into the best practices, in Eon Mode. And finally, we'll go through some tips and tricks. So let's get started with the use cases. So a very basic use case that users will encounter, when they start Eon Mode the first time, is they will have two subclusters. The first subcluster will be the primary subcluster, used for ETL, like Shirang mentioned. And this subcluster will be mostly on, or always on. And there will be another subcluster used for, purely for queries. And this subcluster is the secondary subcluster and it will be on sometimes. Depending on the use case. 
Maybe from nine to five, or Monday to Friday, depending on what application is running on it, or what users are doing on it. So this is the most basic use case, something users get started with to get their feet wet. Now as the use of the deployment of Eon Mode with subcluster increases, the users will graduate into the second use case. And this is the next level of deployment. In this situation, they still have the primary subcluster which is used for ETL, typically a larger subcluster where there is more heavier ETL running, pretty much non-stop. Then they have the usual query subcluster which will use for queries, but they may add another one, another secondary subcluster for ad-hoc workloads. The motivation for this subcluster is to isolate the unpredictable workload from the predictable workload, so as not to impact certain isolates. So you may have ad-hoc queries, or users that are running larger queries or bad workloads that occur once in a while, from running on a secondary subcluster, on a different secondary subcluster, so as to not impact the more predictable workload running on the first subcluster. Now there is no reason why these two subclusters need to have the same instances, they can have different number of nodes, different instance types, different depot configurations. And everything can be different. Another benefit is, they can be metered differently, they can be costed differently, so that the appropriate user or tenant can be billed the cost of compute. Now as the use increases even further, this is what we see as the final state of a very advanced Eon Mode deployment here. As you see, there is the primary subcluster of course, used for ETL, very heavy ETL, and that's always on. There are numerous secondary subclusters, some for predictable applications that have a very fine-tuned workload that needs a definite performance. There are other subclusters that have different usages, some for ad-hoc queries, others for demanding tenants, there could be still more subclusters for different departments, like Finance, that need it maybe at the end of the quarter. So very, very different applications, and this is the full and final promise of Eon, where there is workload isolation, there is different metering, and each app runs in its own compute space. Okay, so let's talk about a very interesting feature in Eon Mode, which we call Hibernate and Revive. So what is Hibernate? Hibernating a Vertica database is the act of dissociating all the computers on the database, and shutting it down. At this point, you shut down all compute. You still pay for storage, because your data is in the S3 bucket, but all the compute has been shut down, and you do not pay for compute anymore. If you have reserved instances, or any other instances you can use them for different applications, and your Vertica database is shut down. So this is very similar to stop database, in Eon Mode, you're stopping all compute. The benefit of course being that you pay nothing anymore for compute. So what is Revive, then? The Revive is the opposite of Hibernate, where you now associate compute with your S3 bucket or your storage, and start up the database. There is one limitation here that you should be aware of, is that the size of the database that you have during Hibernate, you must revive it the same size. So if you have a 12-node primary subcluster when hibernating, you need to provision 12 nodes in order to revive. 
So one best practice comes down to this, is that you must shrink your database to the smallest size possible before you hibernate, so that you can revive it in the same size, and you don't have to spin up a ton of compute in order to revive. So basically, what this means is, when you have decided to hibernate, we ask you to remove all your secondary subclusters and shrink your primary subcluster down to the bare minimum before you hibernate it. And the benefit being, is when you do revive, you will have, you will be able to do so with the mimimum number of nodes. And of course, before you hibernate, you must cleanly shut down the database, so that all the data can be synced to S3. Finally, let's talk about backups and replication. Backups and replications are still supported in Eon Mode, we sometimes get the question, "We're in S3, and S3 has nine nines of reliability, we need a backup." Yes, we highly recommend backups, you can back-up by using the VBR script, you can back-up your database to another bucket, you can also copy the bucket and revive to a different, revive a different instance of your database. This is very useful because many times people want staging or development databases, and they need some of the data from production, and this is a nice way to get that. And it also makes sure that if you accidentally delete something you will be able to get back your data. Okay, so let's go into best practices now. I will start, let's talk about the depot first, which is the biggest performance enhancer that we see for queries. So, I want to state very clearly that reading from S3, or a remote object store like S3 is very slow, because data has to go over the network, and it's very expensive. You will pay for access cost. This is where S3 is not very cheap, is that every time you access the data, there is an ATI and access cost levied. Now the depot is a performance enhancing feature that will improve the performance of queries by keeping a local cache of the data that is most frequently used. It will also reduce the cost of accessing the data because you no longer have to go to the remote object store to get the data, since it's available on a local and permanent volume. Hence depot shaping is a very important aspect of performance tuning in an Eon database. What we ask you to do is, if you are going to use a specific table or partition frequency, you can choose to pin it, in the depot, so that if your depot is under pressure or is highly utilized, these objects that are most frequently used are kept in the depot. So therefore, depot, depot shaping is the act of setting eviction policies, instead you prevent the eviction of files that you believe you need to keep, so for example, you may keep the most recent year's data or the most recent, recent partition in the depot, and thereby all queries running on those partitions will be faster. At this time, we allow you to pin any table or partition in the depot, but it is not subcluster-based. Future versions of Vertica will allow you fine-tuning the depot based on each subcluster. So, let's now go and understand a little bit of internals of how a SQL query works in Eon Mode. And, once I explain this, we will blend into best practice and it will become much more clearer why we recommend certain things. So, since S3 is our layer of durability, where data is persistent in an Eon database. When you run an insert query, like, insert into table value one, or something similar. Data is synchronously written into S3. 
So, it will control returns back to the client, the copy of the data is first stored in the local depot, and then uploaded to S3. And only then do we hand the control back to the client. This ensures that if something bad were to happen, the data will be persistent. The second, the second types of SQL transactions are what we call DTLs, which are catalog operations. So for example, you create a table, or you added a column. These operations are actually working with metadata. Now, as you may know, S3 does not offer mutable storage, the storage in S3 is immutable. You can never append to a file in S3. And, the way transaction logs work is, they are append operation. So when you modify the metadata, you are actually appending to a transaction log. So this poses an interesting challenge which we resolve by appending to the transaction log locally in the catalog, and then there is a service that syncs the catalog to S3 every five minutes. So this poses an interesting challenge, right. If you were to destroy or delete an instance abruptly, you could lose the commits that happened in the last five minutes. And I'll speak to this more in the subsequent slides. Now, finally let's look at, drops or truncates in Eon. Now a drop or a truncate is really a combination of the first two things that we spoke about, when you drop a table, you are making, a drop operation, you are making a metadata change. You are telling Vertica that this table no longer exists, so we go into the transaction log, and append into the transaction log, that this table has been removed. This log of course, will be synced every five minutes to S3, like we spoke. There is also the secondary operation of deleting all the files that were associated with data in this table. Now these files are on S3. And we can go about deleting them synchronously, but that would take a lot of time. And we do not want to hold up the client for this duration. So at this point, we do not synchronously delete the files, we put the files that need to be removed in a reaper queue. And return the control back to the client. And this has the performance benefit as to the drops appear to occur really fast. This also has a cost benefit, batching deletes, in big batches, is more performant, and less costly. For example, on Amazon, you could delete 1,000 files at a time in a single cost. So if you batched your deletes, you could delete them very quickly. The disadvantage of this is if you were to terminate a Vertica customer abruptly, you could leak files in S3, because the reaper queue would not have had the chance to delete these files. Okay, so let's, let's go into best practices after speaking, after understanding some technical details. So, as I said, reading and writing to S3 is slow and costly. So, the first thing you can do is, avoid as many round trips to S3 as possible. The bigger the batches of data you load, the better. The better performance you get, per commit. The fact thing is, don't read and write from S3 if you can avoid it. A lot of our customers have intermediate data processing which they think temporarily they will transform the data before finally committing it. There is no reason to use regular tables for this kind of intermediate data. We recommend using local temporary tables, and local temporary tables have the benefit of not having to upload data to S3. Finally, there is another optimization you can make. Vertica has the concept of active partitions and inactive partitions. 
Active partitions are the ones where you have recently loaded data, and Vertica is lazy about merging these partitions into a single ROS container. Inactive partitions are historical partitions, like, consider last year's data, or the year before that data. Those partitions are aggressively merging into a single container. And how do we know how many partitions are active and inactive? Well that's based on the configuration parameter. If you load into an inactive partition, Vertica is very aggressive about merging these containers, so we download the entire partition, merge the records that you loaded into it, and upload it back again. This creates a lot of network traffic, and I said, accessing data is, from S3, slow and costly. So we recommend you not load into inactive partitions. You should load into the most recent or active partitions, and if you happen to load into inactive partitions, set your active partition count correctly. Okay, let's talk about the reaper queue. Depending on the velocity of your ETL, you can pile up a lot of files that need to be deleted asynchronously. If you were were to terminate a Vertica customer without allowing enough time for these files to get deleted, you could leak files in S3. Now, of course if you use local temporary tables this problem does not occur because the files were never created in S3, but if you are using regular tables, you must allow Vertica enough time to delete these files, and you can change the interval at which we delete, and how much time we allow to delete and shut down, by exiting some configuration parameters that I have mentioned here. And, yeah. Okay, so let's talk a little bit about a catalog at this point. So, the catalog is synced every five minutes onto S3 for persistence. And, the catalog truncation version is the minimum, minimal viable version of the catalog to which we can revive. So, for instance, if somebody destroyed a Vertica cluster, the entire Vertica cluster, the catalog truncation version is the mimimum viable version that you will be able to revive to. Now, in order to make sure that the catalog truncation version is up to date, you must always shut down your Vertica cluster cleanly. This allows the catalog to be synced to S3. Now here are some SQL commands that you can use to see what the catalog truncation version is on S3. For the most part, you don't have to worry about this if you're shutting down cleanly, so, this is only in cases of disaster or some event where all nodes were terminated, without... without the user's permission. And... And finally let's talk about backups, so one more time, we highly recommend you take backups, you know, S3 is designed for 99.9% availability, so there could be a, maybe an occasional down-time, making sure you have backups will help you if you accidentally drop a table. S3 will not protect you against data that was deleted by accident, so, having a backup helps you there. And why not backup, right, storage is cheap. You can replicate the entire bucket and have that as a backup, or have DR plus, you're running in a different region, which also sources a backup. So, we highly recommend that you make backups. So, so with this I would like to, end my presentation, and we're ready for any questions if you have it. Thank you very much. Thank you very much.

Published Date : Mar 30 2020

SUMMARY :

Also as reminder, that you can maximize your screen and get the improvements that you need to be successful. So, the first thing you can do is,

SENTIMENT ANALYSIS :

ENTITIES

EntityCategoryConfidence
JeffPERSON

0.99+

SumeetPERSON

0.99+

Sumeet KeswaniPERSON

0.99+

Shirang KamatPERSON

0.99+

Jeff HealeyPERSON

0.99+

6 nodesQUANTITY

0.99+

VerticaORGANIZATION

0.99+

five minutesQUANTITY

0.99+

six yearsQUANTITY

0.99+

ten minutesQUANTITY

0.99+

12 nodesQUANTITY

0.99+

ShirangPERSON

0.99+

1,000 filesQUANTITY

0.99+

oneQUANTITY

0.99+

12 shardsQUANTITY

0.99+

forum.vertica.comOTHER

0.99+

99.9%QUANTITY

0.99+

two modesQUANTITY

0.99+

S3TITLE

0.99+

AmazonORGANIZATION

0.99+

first subclusterQUANTITY

0.99+

first timeQUANTITY

0.99+

two optionsQUANTITY

0.99+

firstQUANTITY

0.99+

first optionQUANTITY

0.99+

eachQUANTITY

0.99+

two subclustersQUANTITY

0.99+

Each nodeQUANTITY

0.99+

hundreds of nodesQUANTITY

0.99+

TodayDATE

0.99+

each appQUANTITY

0.99+

todayDATE

0.99+

last yearDATE

0.99+

secondQUANTITY

0.99+

OneQUANTITY

0.98+

three nodesQUANTITY

0.98+

SQLTITLE

0.98+

Eon ModeTITLE

0.98+

single containerQUANTITY

0.97+

this weekDATE

0.97+

16 secondary shard subscriptionQUANTITY

0.97+

two typesQUANTITY

0.97+

Sizing and Configuring Vertica in Eon Mode for Different Use CasesTITLE

0.97+

VerticaTITLE

0.97+

one limitationQUANTITY

0.97+

UNLIST TILL 4/2 - A Deep Dive into the Vertica Management Console Enhancements and Roadmap


 

>> Jeff: Hello, everybody, and thank you for joining us today for the virtual Vertica BDC 2020. Today's breakout session is entitled "A Deep Dive "into the Vertica Mangement Console Enhancements and Roadmap." I'm Jeff Healey of Vertica Marketing. I'll be your host for this breakout session. Joining me are Bhavik Gandhi and Natalia Stavisky from Vertica engineering. But before we begin, I encourage you to submit questions or comments during the virtual session. You don't have to wait, just type your question or comment in the question box below the slides and click submit. There will be a Q and A session at the end of the presentation. We'll answer as many questions as we're able to during that time. Any questions we don't address, we'll do our best to answer them offline. Alternatively visit Vertica Forums at forum.vertica.com. Post your question there after the session. Our engineering team is planning to join the forums to keep the conversation going well after the event. Also, a reminder that you can maximize the screen by clicking the double arrow button in the lower right corner of the slides. And yes, this virtual session is being recorded and will be available to you on demand this week. We'll send you a notification as soon as it's ready. Now let's get started. Over to you, Bhavik. >> Bhavik: All right. So hello, and welcome, everybody doing this presentation of "Deep Dive into the Vertica Management Console Enhancements and Roadmap." Myself, Bhavik, and my team member, Natalia Stavisky, will go over a few useful announcements on Vertica Management Console, discussing a few real scenarios. All right. So today we will go forward with the brief introduction about the Management Console, then we will discuss the benefits of using Management Console by going over a couple of user scenarios for the query taking too long to run and receiving email alerts from Management Console. Then we will go over a few MC features for what we call Eon Mode databases, like provisioning and reviving the Eon Mode databases from MC, managing the subcluster and understanding the Depot. Then we will go over some of the future announcements on MC that we are planning. All right, so let's get started. All right. So, do you want to know about how to provision a new Vertica cluster from MC? How to analyze and understand a database workload by monitoring the queries on the database? How do you balance the resource pools and use alerts and thresholds on MC? So, the Management Console is basically our answer and we'll talk about its capabilities and new announcements in this presentation. So just to give a brief overview of the Management Console, who uses Management Console, it's generally used by IT administrators and DB admins. Management Console can be used to monitor both Eon Mode and Enterprise Mode databases. Why to use Management Console? You can use Management Console for provisioning Vertica databases and cluster. You can manage the already existing Vertica databases and cluster you have, and you can use various tools on Management Console like query execution, Database Designer, Workload Analyzer, and set up alerts and thresholds to get notified by some of your activities on the MC. So let's go over a few benefits of using Management Console. Okay. So using Management Console, you can view and optimize resource pool usage. Management Console helps you to identify some critical conditions on your Vertica cluster. 
Additionally, you can set up various thresholds thresholds in MC and get other data if those thresholds are triggered on the database. So now let's dig into the couple of scenarios. So for the first scenario, we will discuss about queries taking too long and using workload analyzer to possibly help to solve the problem. In the second scenario, we will go over alert email that you received from your Management Console and analyzing the problem and taking required actions to solve the problem. So let's go over the scenario where queries are taking too long to run. So in this example, we have this one query that we are running using the query execution on MC. And for some reason we notice that it's taking about 14.8 seconds seconds to execute this query, which is higher than the expected run time of the query. The query that we are running happens to be the query used by MC during the extended monitoring. Notice that the table name and the schema name which is ds_requests_issued, and, is the schema used for extended monitoring. Now in 10.0 MC we have redesigned the Workload Analyzer and Recommendations feature to show the recommendations and allow you to execute those recommendations. In our example, we have taken the table name and figured the tuning descriptions to see if there are any tuning recommendations related to this table. As we see over here, there are three tuning recommendations available for that table. So now in 10.0 MC, you can select those recommendations and then run them. So let's run the recommendations. All right. So once recommendations are run successfully, you can go and see all the processed recommendations that you have run previously. Over here we see that there are three recommendations that we had selected earlier have successfully processed. Now we take the same query and run it on the query execution on MC and hey, it's running really faster and we see that it takes only 0.3 seconds to run the query and, which is about like 98% decrease in original runtime of the query. So in this example we saw that using a Workload Analyzer tool on MC you can possibly triage and solve issue for your queries which are taking to long to execute. All right. So now let's go over another user scenario where DB admin's received some alert email messages from MC and would like to understand and analyze the problem. So to know more about what's going on on the database and proactively react to the problems, DB admins using the Management Console can create set of thresholds and get alerted about the conditions on the database if the threshold values is reached and then respond to the problem thereafter. Now as a DB admin, I see some email message notifications from MC and upon checking the emails, I see that there are a couple of email alerts received from MC on my email. So one of the messages that I received was for Query Resource Rejections greater than 5, pool, midpool7. And then around the same time, I received another email from the MC for the Failed Queries greater than 5, and in this case I see there are 80 failed queries. So now let's go on the MC and investigate the problem. So before going into the deep investigation about failures, let's review the threshold settings on MC. So as we see, we have set up the thresholds under the database settings page for failed queries in the last 10 minutes greater than 5 and MC should send an email to the individual if the threshold is triggered. 
And also we have a threshold set up for queries and resource rejections in the last five minutes for midpool7 set to greater than 5. There are various other thresholds on this page that you can set if you desire to. Now let's go and triage those email alerts about the failed queries and resource rejections that we had received. To analyze the failed queries, let's take a look at the query statistics page on the database Overview page on MC. Let's take a look at the Resource Pools graph and especially for the failed queries for each resource pools. And over to the right under the failed query section, I see about like, in the last 24 hours, there are about 6,000 failed queries for midpool7. And now I switch to view to see the statistics for each user and on this page I see for User MaryLee on the right hand side there are a high number of failed queries in last 24 hours. And to know more about the failed queries for this user, I can click on the graph for this user and get the reasons behind it. So let's click on the graph and see what's going on. And so clicking on this graph, it takes me to the failed queries view on the Query Monitoring page for database, on Database activities tab. And over here, I see there are a high number of failed queries for this user, MaryLee, with the reasons stated as, exceeding high limit. To drill down more and to know more reasons behind it, I can click on the plus icon on the left hand side for each failed queries to get the failure reason for each node on the database. So let's do that. And clicking the plus icon, I see for the two nodes that are listed, over here it says there are insufficient resources like memory and file handles for midpool7. Now let's go and analyze the midpool7 configurations and activities on it. So to do so, I will go over to the Resource Pool Monitoring view and select midpool7. I see the resource allocations for this resource pool is very low. For example, the max memory is just 1MB and the max concurrency is set to 0. Hmm, that's very odd configuration for this resource pool. Also in the bottom right graph for the resource rejections for midpool7, the graph shows very high values for resource rejection. All right. So since we saw some odd configurations and odd resource allocations for midpool7, I would like to see when this resource, when the settings were changed on the resource pools. So to do this, I can preview the audit logs on, are available on the Management Console. So I can go onto the Vertica Audit Logs and see the logs for the resource pool. So I just (mumbles) for the logs and figuring the logs for midpool7. I see on February 17th, the memory and other attributes for midpool7 were modified. So now let's analyze the resource activity for midpool7 around the time when the configurations were changed. So in our case we are using extended monitoring on MC for this database, so we can go back in time and see the statistics over the larger time range for midpool7. So viewing the activities for midpool7 around February 17th, around the time when these configurations were changed, we see a decrease in resource pool usage. Also, on the bottom right, we see the resource rejections for this midpool7 have an increase, linear increase, after the configurations were changed. I can select a point on the graph to get the more details about the resource rejections. Now to analyze the effects of the modifications on midpool7. Let's go over to the Query Monitoring page. 
All right, I will adjust the time range around the time when the configurations were changed for midpool7 and completed activities queries for user MaryLee. And I see there are no completed queries for this user. Now I'm taking a look at the Failed Queries tab and adjusting the time range around the time when the configurations were changed. I can do so because we are using extended monitoring. So again, adjusting the time, I can see there are high number of failed queries for this user. There about about like 10,000 failed queries for this user after the configurations were changed on this resource pool. So now let's go and modify the settings since we know after the configurations were changed, this user was not able to run the queries. So you can change the resource pool settings of using Management Console's database settings page and under the Resource Pools tab. So selecting the midpool7, I see the same odd configurations for this resource pool that we saw earlier. So now let's go and modify it, the settings. So I will increase the max memory and modify the settings for midpool7 so that it has adequate resources to run the queries for the user. Hit apply on the right hand top to see the settings. Now let's do the validation after we change the resource pool attributes. So let's go over to the same query monitoring page and see if MaryLee user is able to run the queries for midpool7. We see that now, after the configuration, after the change, after we changed the configuration for midpool7, the user can run the queries successfully and the count for Completed Queries has increased after we modified the settings for this midpool7 resource pool. And also viewing the resource pool monitoring page, we can validate that after the new configurations for midpool7 has been applied and also the resource pool usage after the configuration change has increased. And also on the bottom right graph, we can see that the resource rejections for midpool7 has decreased over the time after we modified the settings. And since we are using extended monitoring for this database, I can see that the trend in data for these resource pools, the before and after effects of modifying the settings. So initially when the settings were changed, there were high resource rejections and after we again modified the settings, the resource rejections went down. Right. So now let's go work with the provisioning and reviving the Eon Mode Vertica database cluster using the Management Console on different platform. So Management Console supports provisioning and reviving of Eon Mode databases on various cloud environments like AWS, the Google Cloud Platform, and Pure Storage. So for Google, for provisioning the Vertica Management Console on Google Cloud Platform you can use launch a template. Or on AWS environment you can use the cloud formation templates available for different OS's. Once you have provisioned Vertica Management Console, you can provision the Vertica cluster and databases from MC itself. So you can provision a Vertica cluster, you can select the Create new database button available on the homepage. This will open up the wizard to create a new database and cluster. In this example, we are using we are using the Google Cloud Platform. So the wizard will ask me for varius authentication parameters for the Google Cloud Platform. And if you're on AWS, it'll ask you for the authentication parameters for the AWS environment. And going forward on the Wizard, it'll ask me to select the instance Type. 
I will select for the new Vertica cluster. And also provide the communal location url for my Eon Mode database and all the other preferences related to the new cluster. Once I have selected all the preferences for my new cluster I can preview the settings and I can hit, if I am, I can hit Create if all looks okay. So if I hit Create, this will create a new, MC will create a new GCP instances because we are on the GCP environment in this example. It will create a cluster on this instance, it'll create a Vertica Eon Mode Database on this cluster. And it will, additionally, you can load the test data on it if you like to. Now let's go over and revive the existing Eon Mode database from the communal location. So you can do it the same using the Management Console by selecting the Revive Eon Mode database button on the homepage. This will again open up the wizard for reviving the Eon Mode database. Again, in this example, since we are using GCP Platform, it will ask me for the Google Cloud storage authentication attributes. And for reviving, it will ask me for the communal location so I can enter the Google Storage bucket and my folder and it will discover all the Eon Mode databases located under this folder. And I can select one of the databases that I would like to revive. And it will ask me for other Vertica preferences and for this video, for this database reviving. And once I enter all the preferences and review all the preferences I can hit Revive the database button on the Wizard. So after I hit Revive database it will create the GCP instances. The number of GCP instances that I created would be seen as the number of hosts on the original Vertica cluster. It will install the Vertica cluster on this data, on this instances and it will revive the database and it will start the database. And after starting the database, it will be imported on the MC so you can start monitoring on it. So in this example, we saw you can provision and revive the Vertica database on the GCP Platform. Additionally, you can use AWS environment to provision and revive. So now since we have the Eon Mode database on MC, Natalia will go over some Eon Mode features on MC like managing subcluster and Depot activity monitoring. Over to you, Natalia. >> Natalia: Okay, thank you. Hello, my name is Natalia Stavisky. I am also a member of Vertica Management Console Team. And I will talk today about the work I did to allow users to manage subclusters using the Management Console, and also the work I did to help users understand what's going on in their Depot in the Vertica Eon Mode database. So let's look at the picture of the subclusters. On the Manage page of Vertica Management Console, you can see here is a page that has blue tabs, and the tab that's active is Subclusters. You can see that there are two subclusters are available in this database. And for each of the subclusters, you can see subcluster properties, whether this is the primary subcluster or secondary. In this case, primary is the default subcluster. It's indicated by a star. You can see what nodes belong to each subcluster. You can see the node state and node statistics. You can also easily add a new subcluster. And we're quickly going to do this. So once you click on the button, you'll launch the wizard that'll take you through the steps. You'll enter the name of the subcluster, indicate whether this is secondary or primary subcluster. I should mention that Vertica recommends having only one primary subcluster. But we have both options here available. 
You will enter the number of nodes for your subcluster. And once the subcluster has been created, you can manage the subcluster. What other options for managing subcluster we have here? You can scale up an existing subcluster and that's a similar approach, you launch the wizard and (mumbles) nodes. You want to add to your existing subcluster. You can scale down a subcluster. And MC validates requirements for maintaining minimal number of nodes to prevent database shutdown. So if you can not remove any nodes from a subcluster, this option will not be available. You can stop a subcluster. And depending on whether this is a primary subcluster or secondary subcluster, this option may be available or not available. Like in this picture, we can see that for the default subcluster this option is not available. And this is because shutting down the default subcluster will cause the database to shut down as well. You can terminate a subcluster. And again, the MC warns you not to terminate the primary subcluster and validates requirements for maintaining minimal number of nodes to prevent database shutdown. So now we are going to talk a little more about how the MC helps you to understand what's going on in your Depot. So Depot is one of the core of Eon Mode database. And what are the frequently asked questions about the Depot? Is the Depot size sufficient? Are a subset of users putting a high load on the database? What tables are fetched and evicted repeatedly, we call it "re-fetched," in Depot? So here in the Depot Activity Monitoring page, we now have four tabs that allow you to answer those questions. And we'll go a little more in detail through each of them, but I'll just mention what they are for now. At a Glance shows you basic Depot configuration and also shows you query executing. Depot Efficiency, we'll talk more about that and other tabs. Depot Content, that shows you what tables are currently in your Depot. And Depot Pinning allows you to see what pinning policies have been created and to create new pinning policies. Now let's go through a scenario. Monitoring performance of workloads on one subcluster. As you know, Eon Mode database allows you to have multiple subclusters and we'll explore how this feature is useful and how we can use the Management Console to make decisions regarding whether you would like to have multiple subclusters. So here we have, in my setup, a single subcluster called default_subcluster. It has two users that are running queries that are accessing tables, mostly in schema public. So the query started executing and we can see that after fetching tables from Communal, which is the red line, the rest of the time the queries are executing in Depot. The green line is indicating queries running in Depot. The all nodes Depot is about 88% full, a steady flow, and the depot size seems to be sufficient for query executions from Depot only. That's the good case scenario. Now at around 17 :15, user Sherry got an urgent request to generate a report. And at, she started running her queries. We can see that picture is quite different now. The tables Sherry is querying are in a different schema and are much larger. Now we can see multiple lines in different colors. We can see a bunch of fetches and evictions which are indicated by blue and purple bars, and a lot of queries are now spilling into Communal. This is the red and orange lines. Orange line is an indicator of a query running partially in Depot and partially getting fetched from Communal. 
And the red line is data fetched from Communal storage. Let's click on the, one of the lines. Each data point, each point on the line, it'll take you to the Query Details page where you can see more about what's going on. So this is the page that shows us what queries have been run in this particular time interval which is on top of this page in orange color. So that's about one minute time interval and now we can see user Sherry among the users that are running queries. Sherry's queries involve large tables and are running against a different schema. We can see the clickstream schema in the name of the, in part of the query request. So what is happening, there is not enough Depot space for both the schema that's already in use and the one Sherry needs. As a result, evictions and fetches have started occurring. What other questions we can ask ourself to help us understand what's going on? So how about, what tables are most frequently re-fetched? So for that, we will go to the Depot Efficiency page and look at the middle, the middle chart here. We can see the larger version of this chart if we expand it. So now we have 10 tables listed that are most frequently being re-fetched. We can see that there is a clickstream schema and there are other schemas so all of those tables are being used in the queries, fetched, and then there is not enough space in the Depot, they getting evicted and they get re-fetched again. So what can be done to enable all queries to run in Depot? Option one can be increase the Depot size. So we can do this by running the following queries, which (mumbles) which nodes and storage location and the new Depot size. And I should mention that we can run this query from the Management Console from the query execution page. So this would have helped us to increase the Depot size. What other options do we have, for example, when increasing Depot size is not an option? We can also provision a second subcluster to isolate workloads like Sherry's. So we are going to do this now and we will provision a second subcluster using the Manage page. Here we're creating subcluster for Sherry or for workloads like hers. And we're going to create a (mumbles). So Sherry's subcluster has been created. We can see it here, added to the list of the subclusters. It's a secondary subcluster. Sherry has been instructed to use the new SherrySubcluster for her work. Now let's see what happened. We'll go again at Depot Activity page and we'll look at the At a Glance tab. We can see that around >> 18: 07, Sherry switched to running her queries on SherrySubcluster. On top of this page, you can see subcluster selected. So we currently have two subclusters and I'm looking, what happened to SherrySubcluster once it has been provisioned? So Sherry started using it and the lines after initial fetching from Depot, which was from Communal, which was the red line, after that, all Sherry's queries fit in Depot, which is indicated by green line. Also the Depot is pretty full on those nodes, about 90% full. But the queries are processed efficiently, there is no spilling into Communal. So that's a good case scenario. Let's now go back and take a look at the original subcluster, default subcluster. So on the left portion of the chart we can see multiple lines, that was activity before Sherry switched to her own designated subcluster. At around 18:07, after Sherry switched from the subcluster to using her designated subcluster, there is no, she is no longer using the subcluster, she is not putting a load in it. 
So the lines after that turn green, which means the queries still running in the default subcluster are all running in the Depot. We can also see that the Depot fetches and evictions bars, the purple and blue bars, no longer show significant numbers. We can also check the second chart, which shows Communal Storage Access, and the bars there have dropped as well, so there is no significant access to Communal storage. This problem has been solved: each of the subclusters is serving queries from the Depot, and that's our most efficient scenario. Let's also look at the other tabs we have for Depot monitoring. The Depot Efficiency tab has six charts, and I'll go through each of them quickly. File Reads by Location gives an indication of where the majority of query execution took place, in the Depot or in Communal storage. Top 10 Re-Fetches into Depot, like the chart we saw earlier in our use case, shows the tables that are most frequently fetched, evicted, and then fetched again; these are good candidates for pinning if increasing the Depot size is not an option. Note that both of these charts let you select a time interval using a calendar widget, so you can get information about the activity that happened during that interval. Depot Pinning shows what portion of your Depot is pinned, both by byte count and by table count. The three tables at the bottom show the Depot structure: how long tables stay in the Depot (we would like tables to be fetched into the Depot and stay there for a long time), how often they are accessed (we would like to see the tables in the Depot accessed frequently), and the size range of tables in the Depot. Depot Content: this tab allows us to search for tables that are currently in the Depot and to see stats like table size in the Depot, how often tables are accessed, and when they were last accessed. The same information that's available for tables in the Depot is also available at the projection and partition level for those tables. Depot Pinning: this tab lets users see which policies currently exist; you can do this by clicking the first button and clicking search, which shows all existing policies. The second option allows you to search for a table and create a policy. You can also use the action column to modify or delete existing policies. And the third option provides details about the most frequently re-fetched tables, including fetch count, total access count, and number of re-fetched bytes. All of this information can help you make decisions about pinning specific tables. So that's about it for the Depot. I should mention that the server team also has a very good presentation in this webinar series on Eon Mode database Depot management and subcluster management; I strongly recommend attending it or downloading the slide presentation. Let's talk quickly about the Management Console roadmap, what we are planning to do in the future. We are going to continue focusing on subcluster management; there is still a lot we can do here: promoting and demoting subclusters, load balancing across subclusters, scheduling subcluster actions, and support for large cluster mode. We'll continue working on Workload Analyzer enhancement recommendations, on backup and restore from the MC, building custom thresholds, and Eon on HDFS support. Okay, we are ready now to take any questions you may have. Thank you.
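Both remedies discussed in this session, growing the Depot and pinning hot tables, can also be driven from SQL. What follows is a rough sketch, assuming the ALTER_LOCATION_SIZE, SET_DEPOT_PIN_POLICY_TABLE, and CLEAR_DEPOT_PIN_POLICY_TABLE meta-functions are available in your Vertica version; the node name, size, table name, and system table name are illustrative, so verify them against your documentation.

-- Grow the Depot on one node to 250 GB (node name and size are examples).
SELECT ALTER_LOCATION_SIZE('depot', 'v_verticadb_node0001', '250G');

-- Pin a frequently re-fetched table so it stays resident in the Depot,
-- then list existing policies and remove the policy when no longer needed.
SELECT SET_DEPOT_PIN_POLICY_TABLE('public.clickstream_events');
SELECT * FROM depot_pin_policies;   -- system table name may vary by version
SELECT CLEAR_DEPOT_PIN_POLICY_TABLE('public.clickstream_events');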

Published Date : Mar 30 2020


A Technical Overview of Vertica Architecture


 

>> Paige: Hello, everybody and thank you for joining us today on the Virtual Vertica BDC 2020. Today's breakout session is entitled A Technical Overview of the Vertica Architecture. I'm Paige Roberts, Open Source Relations Manager at Vertica and I'll be your host for this webinar. Now joining me is Ryan Role-kuh? Did I say that right? (laughs) He's a Vertica Senior Software Engineer. >> Ryan: So it's Roelke. (laughs) >> Paige: Roelke, okay, I got it, all right. Ryan Roelke. And before we begin, I want to be sure and encourage you guys to submit your questions or your comments during the virtual session while Ryan is talking as you think of them as you go along. You don't have to wait to the end, just type in your question or your comment in the question box below the slides and click submit. There'll be a Q and A at the end of the presentation and we'll answer as many questions as we're able to during that time. Any questions that we don't address, we'll do our best to get back to you offline. Now, alternatively, you can visit the Vertica forums to post your question there after the session as well. Our engineering team is planning to join the forums to keep the conversation going, so you can have a chat afterwards with the engineer, just like any other conference. Now also, you can maximize your screen by clicking the double arrow button in the lower right corner of the slides and before you ask, yes, this virtual session is being recorded and it will be available to view on demand this week. We'll send you a notification as soon as it's ready. Now, let's get started. Over to you, Ryan. >> Ryan: Thanks, Paige. Good afternoon, everybody. My name is Ryan and I'm a Senior Software Engineer on Vertica's Development Team. I primarily work on improving Vertica's query execution engine, so usually in the space of making things faster. Today, I'm here to talk about something that's more general than that, so we're going to go through a technical overview of the Vertica architecture. So the intent of this talk, essentially, is to just explain some of the basic aspects of how Vertica works and what makes it such a great database software and to explain what makes a query execute so fast in Vertica, we'll provide some background to explain why other databases don't keep up. And we'll use that as a starting point to discuss an academic database that paved the way for Vertica. And then we'll explain how Vertica design builds upon that academic database to be the great software that it is today. I want to start by sharing somebody's approximation of an internet minute at some point in 2019. All of the data on this slide is generated by thousands or even millions of users and that's a huge amount of activity. Most of the applications depicted here are backed by one or more databases. Most of this activity will eventually result in changes to those databases. For the most part, we can categorize the way these databases are used into one of two paradigms. First up, we have online transaction processing or OLTP. OLTP workloads usually operate on single entries in a database, so an update to a retail inventory or a change in a bank account balance are both great examples of OLTP operations. Updates to these data sets must be visible immediately and there could be many transactions occurring concurrently from many different users. OLTP queries are usually key value queries. The key uniquely identifies the single entry in a database for reading or writing. 
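As a hedged illustration of the key-value lookups described above, an OLTP point query against an invented accounts table might look like this; the table and column names are assumptions, not from the slides.

-- OLTP-style point lookup: the key identifies exactly one row.
SELECT balance
FROM accounts
WHERE account_id = 42;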
Early databases and applications were probably designed for OLTP workloads. This example on the slide is typical of an OLTP workload. We have a table, accounts, such as for a bank, which tracks information for each of the bank's clients. An update query, like the one depicted here, might be run whenever a user deposits $10 into their bank account. Our second category is online analytical processing or OLAP which is more about using your data for decision making. If you have a hardware device which periodically records how it's doing, you could analyze trends of all your devices over time to observe what data patterns are likely to lead to failure or if you're Google, you might log user search activity to identify which links helped your users find the answer. Analytical processing has always been around but with the advent of the internet, it happened at scales that were unimaginable, even just 20 years ago. This SQL example is something you might see in an OLAP workload. We have a table, searches, logging user activity. We will eventually see one row in this table for each query submitted by users. If we want to find out what time of day our users are most active, then we could write a query like this one on the slide which counts the number of unique users running searches for each hour of the day. So now let's rewind to 2005. We don't have a picture of an internet minute in 2005, we don't have the data for that. We also don't have the data for a lot of other things. The term Big Data is not quite yet on anyone's radar and The Cloud is also not quite there or it's just starting to be. So if you have a database serving your application, it's probably optimized for OLTP workloads. OLAP workloads just aren't mainstream yet and database engineers probably don't have them in mind. So let's innovate. It's still 2005 and we want to try something new with our database. Let's take a look at what happens when we do run an analytic workload in 2005. Let's use as a motivating example a table of stock prices over time. In our table, the symbol column identifies the stock that was traded, the price column identifies the new price and the timestamp column indicates when the price changed. We have several other columns which, we should know that they're there, but we're not going to use them in any example queries. This table is designed for analytic queries. We're probably not going to make any updates or look at individual rows since we're logging historical data and want to analyze changes in stock price over time. Our database system is built to serve OLTP use cases, so it's probably going to store the table on disk in a single file like this one. Notice that each row contains all of the columns of our data in row major order. There's probably an index somewhere in the memory of the system which will help us to point lookups. Maybe our system expects that we will use the stock symbol and the trade time as lookup keys. So an index will provide quick lookups for those columns to the position of the whole row in the file. If we did have an update to a single row, then this representation would work great. We would seek to the row that we're interested in, finding it would probably be very fast using the in-memory index. And then we would update the file in place with our new value. On the other hand, if we ran an analytic query like we want to, the data access pattern is very different. The index is not helpful because we're looking up a whole range of rows, not just a single row. 
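Neither slide query is reproduced in the transcript; plausible reconstructions of the hour-of-day search count and of the running analytic stocks query follow, with illustrative column names.

-- OLAP: unique users running searches for each hour of the day.
SELECT EXTRACT(HOUR FROM search_time) AS hour_of_day,
       COUNT(DISTINCT user_id)        AS unique_users
FROM searches
GROUP BY EXTRACT(HOUR FROM search_time)
ORDER BY hour_of_day;

-- The running analytic example: average AAPL price over one day.
SELECT AVG(price)
FROM stocks
WHERE symbol = 'AAPL'
  AND trade_ts >= '2005-06-01'
  AND trade_ts <  '2005-06-02';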
As a result, the only way to find the rows that we actually need for this query is to scan the entire file. We're going to end up scanning a lot of data that we don't need and that won't just be the rows that we don't need, there's many other columns in this table. Many information about who made the transaction, and we'll also be scanning through those columns for every single row in this table. That could be a very serious problem once we consider the scale of this file. Stocks change a lot, we probably have thousands or millions or maybe even billions of rows that are going to be stored in this file and we're going to scan all of these extra columns for every single row. If we tried out our stocks use case behind the desk for the Fortune 500 company, then we're probably going to be pretty disappointed. Our queries will eventually finish, but it might take so long that we don't even care about the answer anymore by the time that they do. Our database is not built for the task we want to use it for. Around the same time, a team of researchers in the North East have become aware of this problem and they decided to dedicate their time and research to it. These researchers weren't just anybody. The fruits of their labor, which we now like to call the C-Store Paper, was published by eventual Turing Award winner, Mike Stonebraker, along with several other researchers from elite universities. This paper presents the design of a read-optimized relational DBMS that contrasts sharply with most current systems, which are write-optimized. That sounds exactly like what we want for our stocks use case. Reasoning about what makes our queries executions so slow brought our researchers to the Memory Hierarchy, which essentially is a visualization of the relative speeds of different parts of a computer. At the top of the hierarchy, we have the fastest data units, which are, of course, also the most expensive to produce. As we move down the hierarchy, components get slower but also much cheaper and thus you can have more of them. Our OLTP databases data is stored in a file on the hard disk. We scanned the entirety of this file, even though we didn't need most of the data and now it turns out, that is just about the slowest thing that our query could possibly be doing by over two orders of magnitude. It should be clear, based on that, that the best thing we can do to optimize our query's execution is to avoid reading unnecessary data from the disk and that's what the C-Store researchers decided to look at. The key innovation of the C-Store paper does exactly that. Instead of storing data in a row major order, in a large file on disk, they transposed the data and stored each column in its own file. Now, if we run the same select query, we read only the relevant columns. The unnamed columns don't factor into the table scan at all since we don't even open the files. Zooming out to an internet scale sized data set, we can appreciate the savings here a lot more. But we still have to read a lot of data that we don't need to answer this particular query. Remember, we had two predicates, one on the symbol column and one on the timestamp column. Our query is only interested in AAPL stock, but we're still reading rows for all of the other stocks. So what can we do to optimize our disk read even more? Let's first partition our data set into different files based on the timestamp date. This means that we will keep separate files for each date. 
When we query the stocks table, the database knows all of the files we have to open. If we have a simple predicate on the timestamp column, as our sample query does, then the database can use it to figure out which files it doesn't have to look at at all. So now all of the disk reads we do to answer our query will produce rows that pass the timestamp predicate. This eliminates a lot of wasteful disk reads, but not all of them. We have another predicate on the symbol column, where symbol equals AAPL, and we'd like to avoid disk reads of rows that don't satisfy that predicate either. We can avoid those disk reads by clustering all the rows that match the symbol predicate together. If all of the AAPL rows are adjacent, then as soon as we see something different, we can stop reading the file; we won't see any more rows that can pass the predicate. Then we can use the positions of the rows we did find to identify which pieces of the other columns we need to read. One technique we can use to cluster the rows is sorting, so we'll use the symbol column as a sort key for all of the columns. That way we can reconstruct a whole row by seeking to the same row position in each file. It turns out that, having sorted all of the rows, we can do a bit more. We don't have any more wasted disk reads, but we can still be more efficient with how we're using the disk. We've clustered all of the rows with the same symbol together, so we don't really need to bother repeating the symbol so many times in the same file. Let's just write the value once and say how many rows we have. This run-length encoding technique can compress large numbers of rows into a small amount of space. In this example we de-duplicate just a few rows, but you can imagine de-duplicating many thousands of rows instead. This encoding is great for reducing the amount of disk we need to read at query time, but it also has the additional benefit of reducing the total size of our stored data. Now our query requires substantially fewer disk reads than it did when we started. Let's recap what the C-Store paper did to achieve that. First, we transposed our data to store each column in its own file; now queries only have to read the columns used in the query. Second, we partitioned the data into multiple file sets so that all rows in a file have the same value for the partition column; now a predicate on the partition column can skip non-matching file sets entirely. Third, we selected a column of our data to use as a sort key; now rows with the same value for that column are clustered together, which allows our query to stop reading data once it finds non-matching rows. Finally, sorting the data this way enables high compression ratios using run-length encoding, which minimizes the size of the data stored on disk. The C-Store system combined each of these innovative ideas to produce an academically significant result, and if you had used it behind the desk of a Fortune 500 company in 2005, you probably would have been pretty pleased. But it's not 2005 anymore, and the requirements of a modern database system are much stricter. So let's take a look at how C-Store fares in 2020. First of all, we have designed the storage layer of our database to optimize a single query in a single application. Our design optimizes the heck out of that query and probably some similar ones, but if we want to do anything else with our data, we might be in a bit of trouble. What if we just decide we want to ask a different question?
For example, in our stock example, what if we want to plot all the trade made by a single user over a large window of time? How do our optimizations for the previous query measure up here? Well, our data's partitioned on the trade date, that could still be useful, depending on our new query. If we want to look at a trader's activity over a long period of time, we would have to open a lot of files. But if we're still interested in just a day's worth of data, then this optimization is still an optimization. Within each file, our data is ordered on the stock symbol. That's probably not too useful anymore, the rows for a single trader aren't going to be clustered together so we will have to scan all of the rows in order to figure out which ones match. You could imagine a worse design but as it becomes crucial to optimize this new type of query, then we might have to go as far as reconfiguring the whole database. The next problem of one of scale. One server is probably not good enough to serve a database in 2020. C-Store, as described, runs on a single server and stores lots of files. What if the data overwhelms this small system? We could imagine exhausting the file system's inodes limit with lots of small files due to our partitioning scheme. Or we could imagine something simpler, just filling up the disk with huge volumes of data. But there's an even simpler problem than that. What if something goes wrong and C-Store crashes? Then our data is no longer available to us until the single server is brought back up. A third concern, another one of scalability, is that one deployment does not really suit all possible things and use cases we could imagine. We haven't really said anything about being flexible. A contemporary database system has to integrate with many other applications, which might themselves have pretty restricted deployment options. Or the demands imposed by our workloads have changed and the setup you had before doesn't suit what you need now. C-Store doesn't do anything to address these concerns. What the C-Store paper did do was lead very quickly to the founding of Vertica. Vertica's architecture and design are essentially all about bringing the C-Store designs into an enterprise software system. The C-Store paper was just an academic exercise so it didn't really need to address any of the hard problems that we just talked about. But Vertica, the first commercial database built upon the ideas of the C-Store paper would definitely have to. This brings us back to the present to look at how an analytic query runs in 2020 on the Vertica Analytic Database. Vertica takes the key idea from the paper, can we significantly improve query performance by changing the way our data is stored and give its users the tools to customize their storage layer in order to heavily optimize really important or commonly wrong queries. On top of that, Vertica is a distributed system which allows it to scale up to internet-sized data sets, as well as have better reliability and uptime. We'll now take a brief look at what Vertica does to address the three inadequacies of the C-Store system that we mentioned. To avoid locking into a single database design, Vertica provides tools for the database user to customize the way their data is stored. To address the shortcomings of a single node system, Vertica coordinates processing among multiple nodes. 
To acknowledge the large variety of desirable deployments, Vertica does not require any specialized hardware and has many features which smoothly integrate it with a Cloud computing environment. First, we'll look at the database design problem. We're a SQL database, so our users are writing SQL and describing their data in SQL way, the Create Table statement. Create Table is a logical description of what your data looks like but it doesn't specify the way that it has to be stored, For a single Create Table, we could imagine a lot of different storage layouts. Vertica adds some extensions to SQL so that users can go even further than Create Table and describe the way that they want the data to be stored. Using terminology from the C-Store paper, we provide the Create Projection statement. Create Projection specifies how table data should be laid out, including column encoding and sort order. A table can have multiple projections, each of which could be ordered on different columns. When you query a table, Vertica will answer the query using the projection which it determines to be the best match. Referring back to our stock example, here's a sample Create Table and Create Projection statement. Let's focus on our heavily optimized example query, which had predicates on the stock symbol and date. We specify that the table data is to be partitioned by date. The Create Projection Statement here is excellent for this query. We specify using the order by clause that the data should be ordered according to our predicates. We'll use the timestamp as a secondary sort key. Each projection stores a copy of the table data. If you don't expect to need a particular column in a projection, then you can leave it out. Our average price query didn't care about who did the trading, so maybe our projection design for this query can leave the trader column out entirely. If the question we want to ask ever does change, maybe we already have a suitable projection, but if we don't, then we can create another one. This example shows another projection which would be much better at identifying trends of traders, rather than identifying trends for a particular stock. Next, let's take a look at our second problem, that one, or excuse me, so how should you decide what design is best for your queries? Well, you could spend a lot of time figuring it out on your own, or you could use Vertica's Database Designer tool which will help you by automatically analyzing your queries and spitting out a design which it thinks is going to work really well. If you want to learn more about the Database Designer Tool, then you should attend the session Vertica Database Designer- Today and Tomorrow which will tell you a lot about what the Database Designer does and some recent improvements that we have made. Okay, now we'll move to our next problem. (laughs) The challenge that one server does not fit all. In 2020, we have several orders of magnitude more data than we had in 2005. And you need a lot more hardware to crunch it. It's not tractable to keep multiple petabytes of data in a system with a single server. So Vertica doesn't try. Vertica is a distributed system so will deploy multiple severs which work together to maintain such a high data volume. In a traditional Vertica deployment, each node keeps some of the data in its own locally-attached storage. Data is replicated so that there is a redundant copy somewhere else in the system. If any one node goes down, then the data that it served is still available on a different node. 
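The Create Table and Create Projection statements walked through above are not shown in the transcript; a sketch of what they plausibly look like follows, with illustrative column names and types.

-- Logical table definition, partitioned by trade date.
CREATE TABLE stocks (
    symbol   VARCHAR(10),
    price    NUMERIC(18,4),
    trade_ts TIMESTAMP,
    trader   VARCHAR(50)
)
PARTITION BY trade_ts::DATE;

-- Projection optimized for the symbol/date query; the trader column
-- is left out, as discussed above.
CREATE PROJECTION stocks_by_symbol (symbol, price, trade_ts) AS
SELECT symbol, price, trade_ts
FROM stocks
ORDER BY symbol, trade_ts;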
We'll also have it so that in the system, there's no special node with extra duties. All nodes are created equal. This ensures that there is no single point of failure. Rather than replicate all of your data, Vertica divvies it up amongst all of the nodes in your system. We call this segmentation. The way data is segmented is another parameter of storage customization and it can definitely have an impact upon query performance. A common way to segment data is by using a hash expression, which essentially randomizes the node that a row of data belongs to. But with a guarantee that the same data will always end up in the same place. Describing the way data is segmented is another part of the Create Projection Statement, as seen in this example. Here we segment on the hash of the symbol column so all rows with the same symbol will end up on the same node. For each row that we load into the system, we'll apply our segmentation expression. The result determines which segment the row belongs to and then we'll send the row to each node which holds the copy of that segment. In this example, our projection is marked KSAFE 1, so we will keep one redundant copy of each segment. When we load a row, we might find that its segment had copied on Node One and Node Three, so we'll send a copy of the row to each of those nodes. If Node One is temporarily disconnected from the network, then Node Three can serve the other copy of the segment so that the whole system remains available. The last challenge we brought up from the C-Store design was that one deployment does not fit all. Vertica's cluster design neatly addressed many of our concerns here. Our use of segmentation to distribute data means that a Vertica system can scale to any size of deployment. And since we lack any special hardware or nodes with special purposes, Vertica servers can run anywhere, on premise or in the Cloud. But let's suppose you need to scale out your cluster to rise to the demands of a higher workload. Suppose you want to add another node. This changes the division of the segmentation space. We'll have to re-segment every row in the database to find its new home and then we'll have to move around any data that belongs to a different segment. This is a very expensive operation, not something you want to be doing all that often. Traditional Vertica doesn't solve that problem especially well, but Vertica Eon Mode definitely does. Vertica's Eon Mode is a large set of features which are designed with a Cloud computing environment in mind. One feature of this design is elastic throughput scaling, which is the idea that you can smoothly change your cluster size without having to pay the expenses of shuffling your entire database. Vertica Eon Mode had an entire session dedicated to it this morning. I won't say any more about it here, but maybe you already attended that session or if you haven't, then I definitely encourage you to listen to the recording. If you'd like to learn more about the Vertica architecture, then you'll find on this slide links to several of the academic conference publications. These four papers here, as well as Vertica Seven Years Later paper which describes some of the Vertica designs seven years after the founding and also a paper about the innovations of Eon Mode and of course, the Vertica documentation is an excellent resource for learning more about what's going on in a Vertica system. I hope you enjoyed learning about the Vertica architecture. I would be very happy to take all of your questions now. 
Thank you for attending this session.
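For reference, the segmented, K-safe projection described near the end of this session would look roughly like the following; the column list and segmentation key are illustrative.

-- Distribute rows across nodes by hashing the symbol column and keep
-- one redundant copy of each segment.
CREATE PROJECTION stocks_seg (symbol, price, trade_ts) AS
SELECT symbol, price, trade_ts
FROM stocks
ORDER BY symbol, trade_ts
SEGMENTED BY HASH(symbol) ALL NODES
KSAFE 1;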

Published Date : Mar 30 2020


Model Management and Data Preparation


 

>> Sue: Hello, everybody, and thank you for joining us today for the virtual Vertica BDC 2020. Today's breakout session is entitled Machine Learning with Vertica, Data Preparation and Model Management. My name is Sue LeClaire, Director of Managing at Vertica and I'll be your host for this webinar. Joining me is Waqas Dhillon. He's part of the Vertica Product Management Team at Vertica. Before we begin, I want to encourage you to submit questions or comments during the virtual session. You don't have to wait. Just type your question or comment in the question box below the slides and click submit. There will be a Q and A session at the end of the presentation. We'll answer as many questions as we're able to during that time. Any questions that we don't address, we'll do our best to answer offline. Alternately, you can visit Vertica Forums to post your questions there after the session. Our engineering team is planning to join the forums to keep the conversation going. Also, a reminder that you can maximize your screen by clicking the double arrow button in the lower right corner of the slides, and yes, this virtual session is being recorded and will be available to view on demand later this week. We'll send you a notification as soon as it's ready. So, let's get started. Waqas, over to you. >> Waqas: Thank you, Sue. Hi, everyone. My name is Waqas Dhillon and I'm a Product Manager here at Vertica. So today, we're going to go through data preparation and model management in Vertica, and the session would essentially be starting with some introduction and going through some of the machine learning configurations and you're doing machine learning at scale. After that, we have two media sections here. The first one is on data preparation, and so we'd go through data preparation is, what are the Vertica functions for data exploration and data preparation, and then share an example with you. Similarly, in the second part of this talk we'll go through different export models using PMML and how that works with Vertica, and we'll share examples from that, as well. So yeah, let's dive right in. So, Vertica essentially is an open architecture with a rich ecosystem. So, you have a lot of options for data transformation and ingesting data from different tools, and then you also have options for connecting through ODBC, JDBC, and some other connectors to BI and visualization tools. There's a lot of them that Vertica connects to, and in the middle sits Vertica, which you can have on external tables or you can have in place analytics on R, on cloud, or on prem, so that choice is yours, but essentially what it does is it offers you a lot of options for performing your data and analytics on scale, and within that, data analytics machine learning is also a core component, and then you have a lot of options and functions for that. Now, machine learning in Vertica is actually built on top of the architecture that distributed data analytics offers, so it offers a lot of those capabilities and builds on top of them, so you eliminate the overhead data transfer when you're working with Vertica machine learning, you keep your data secure, storing and managing the models really easy and much more efficient. 
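Because models are stored as first-class database objects, the catalog can be queried like any other relation; a small sketch, assuming the MODELS system table and column names described later in this session.

-- List in-database models with their category and algorithm type.
SELECT model_name, category, model_type
FROM models;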
You can serve a lot of concurrent users all at the same time, and then it's really scalable and avoids maintenance cost of a separate system, so essentially a lot of benefits here, but one important thing to mention here is that all the algorithms that you see, whether they're analytics functions, advanced analytics functions, or machine learning functions, they are distributed not just across the cluster on different nodes. So, each node gets a distributed work load. On each node, too, there might be multiple tracks and multiple processors that are running with each of these functions. So, highly distributed solution and one of its kind in this space. So, when we talk about Vertica machine learning, it essentially covers all machine learning process and we see it as something starting with data ingestion and doing data analysis and understanding, going through the steps of data preparation, modeling, evaluation, and finally deployment, as well. So, when you're using with Vertica, you're using Vertica for machine learning, it takes care of all these steps and you can do all of that inside of the Vertica database, but when we look at the three main pillars that Vertica machine learning aims to build on, the first one is to have Vertica as a platform for high performance machine learning. We have a lot of functions for data exploration and preparation and we'll go through some of them here. We have distributed in-database algorithms for model training and prediction, we have scalable functions for model evaluation, and finally we have distributed scoring functions, as well. Doing all of the stuff in the database, that's a really good thing, but we don't want it isolated in this space. We understand that a lot of our customers, our users, they like to work with other tools and work with Vertica, as well. So, they might use Vertica for data prep, another two for model training, or use Vertica for model training and take those nodes out to other tools and do prediction there. So, integration is really important part of our overall offering. So, it's a pretty flexible system. We have been offering UdX in four languages, a lot of people find there over the past few years, but the new capability of importing PMML models for in-database scoring and exporting Vertica native-models, for external scoring it's something that we have recently added, and another talk would actually go through the TensorFlow integrations, a really exciting and important milestone that we have where you can bring TensorFlow models into Vertica for in-database scoring. For this talk, we'll focus on data exploration and preparation, importing PMML, and exporting PMML models, and finally, since Vertica is not just a cue engine, but also a data store, we have a lot of really good capability for model storage and management, as well. So, yeah. Let's dive into the first part on machine learning at scale. So, when we say machine learning at scale we're actually having a few really important considerations and they have their own implications. The first one is that we want to have speed, but also want it to come at a reasonable cost. So, it's really important for us to pick the right scaling architecture. Secondly, it's not easy to move big data around. 
It might be easy to do that on a smaller data set, on an Excel sheet, or something of the like, but once you're talking about big data and data analytics at really big scale, it's really not easy to move that data around from one tool to another, so what you'd want to do is bring models to the data instead of having to move this data to the tools, and the third thing here is that some sub-sampling it can actually compromise your accuracy, and a lot of tools that are out there they still force you to take smaller samples of your data because they can only handle so much data, but that can impact your accuracy and the need here is that you should be able to work with all of your data. We'll just go through each of these really quickly. So, the first factor here is scalability. Now, if you want to scale your architecture, you have two main options. The first is vertical scaling. Let's say you have a machine, a server, essentially, and you can keep on adding resources, like RAM and CPU and keep increasing the performance as well as the capacity of that system, but there's a limit to what you can do here, and the limit, you can hit that in terms of cost, as well as in terms of technology. Beyond a certain point, you will not be able to scale more. So, the right solution to follow here is actually horizontal scaling in which you can keep on adding more instances to have more computing power and more capacity. So, essentially what you get with this architecture is a super computer, which stitches together several nodes and the workload is distributed on each of those nodes for massive develop processing and really fast speeds, as well. The second aspect of having big data and the difficulty around moving it around is actually can be clarified with this example. So, what usually happens is, and this is a simplified version, you have a lot of applications and tools for which you might be collecting the data, and this data then goes into an analytics database. That database then in turn might be connected to some VI tools, dashboard and applications, and some ad-hoc queries being done on the database. Then, you want to do machine learning in this architecture. What usually happens is that you have your machine learning tools and the data that is coming in to the analytics database is actually being exported out of the machine learning tools. You're training your models there, and afterwards, when you have new incoming data, that data again goes out to the machine learning tools for prediction. With those results that you get from those tools usually ended up back in the distributed database because you want to put it on dashboard or you want to power up some applications with that. So, there's essentially a lot of data overhead that's involved here. There are cons with that, including data governance, data movement, and other complications that you need to resolve here. One of the possible solutions to overcome that difficulty is that you have machine learning as part of the distributed analytical database, as well, so you get the benefits of having it applied on all of the data that's inside of the database and not having to care about all of the data movement there, but if there are some use cases where it still makes sense to at least train the models outside, that's where you can do your data preparation outside of the database, and then take the data out, the prepared data, build your model, and then bring the model back to the analytics database. In this case, we'll talk about Vertica. 
So, the model would be archived, hosted by Vertica, and then you can keep on applying predictions on the new data that's incoming into the database. So, the third consideration here for machine learning on scale is sampling versus full data set. As I mentioned, a lot of tools they cannot handle big data and you are forced to sub-sample, but what happens here, as you can see in the figure on the left most, figure A, is that if you have a single data point, essentially any model can explain that, but if you have more data points, as in figure B, there would be a smaller number of models that could be able to explain that, and in figure C, even more data points, lesser number of models explained, but lesser also means here that these models would probably be more accurate, and the objective for building machine learning models is mostly to have prediction capability and generalization capability, essentially, on unseen data, so if you build a model that's accurate on one data point, it could not have very good generalization capabilities. The conventional wisdom with machine learning is that the more data points that you have for learning the better and more accurate models that you'll get out of your machine learning models. So, you need to pick a tool which can handle all of your data and does not force you to sub-sample that, and doing that, even a simpler model might be much better than a more complex model here. So, yeah. Let's go to data exploration and data preparation part. Vertica's a really powerful tool and it offers a lot of scalability in this space, and as I mentioned, will support the whole process. You can define the problem and you can gather your data and construct your data set inside Vertica, and then consider it a prepared training modeling deployment and managing the model, but this is a really critical step in the overall machine learning process. Some estimate it takes between 60 to 80% of the overall effort of a machine learning process. So, a lot of functions here. You can use part of Vertica, do data exploration, de-duplication, outlier detection, balancing, normalization, and potentially a lot more. You can actually go to our Vertica documentation and find them there. Within Vertica we divide them into two parts. Within data prep, one is exploration functions, the second is transformation functions. Within exploration, you have a rich set functions that you can use in DB, and then if you want to build your own you can use the UDX to do that. Similarly, for transformation there's a lot of functions around time series, pattern matching, outlier detection that you can use to transform that data, and it's just a snapshot of some of those functions that are available in Vertica right now. And again, the good thing about these functions is not just their presence in the database. The good thing is actually their ability to scale on really, really large data set and be able to compute those results for you on that data set in an acceptable amount of time, which makes your machine learning processes really critical. So, let's go to an example and see how we can use some of these functions. As I mentioned, there's a whole lot of them and we'll not be able to go through all of them, but just for our understanding we can go through some of them and see how they work. So, we have here a sample data set of network flows. It's a similar attack from some source nodes, and then there are some victim nodes on which these attacks are happening. 
So yeah, let's just look at the data here real quick. We'll load the data, we'll browse the data, compute some statistics around it, ask some questions, make plots, and then clean the data. The objective here is not to make a prediction, per se, which is what we mostly do in machine learning algorithms, but to just go through the data prep process and see how easy it is to do that with Vertica and what kind of options might be there to help you through that process. So, the first step is loading the data. Since in this case we know the structure of the data, so we create a table and create different column names and data types, but let's say you have a data set for which you do not already know the structure, there's a really cool feature in Vertica called flex tables and you can use that to initially import the data into the database and then go through all of the variables and then assign them variable types. You can also use that if your data is dynamic and it's changing, to board the data first and then create these definitions. So once we've done that, we load the data into the database. It's for one week of data out of the whole data set right now, but once you've done that we'd like to look at the flows just to look at the data, you know how it looks, and once we do select star from flows and just have a limit here, we see that there's already some data duplication, and by duplication I mean rows which have the exact same data for each of the columns. So, as part of the cleaning process, the first thing we'd want to do is probably to remove that duplication. So, we create a table with distinct flows and you can see here we have about a million flows here which are unique. So, moving on. The next step we want to do here, this is essentially time state data and these times are in days of the week, so we want to look at the trends of this data. So, the network traffic that's there, you can call it flows. So, based on hours of the day how does the traffic move and how does it differ from one day to another? So, it's part of an exploration process. There might be a lot of further exploration that you want to do, but we can start with this one and see how it goes, and you can see in the graph here that we have seven days of data, and the weekend traffic, which is in pink and purple here seems a little different from the rest of the days. Pretty close to each other, but yeah, definitely something we can look into and see if there's some real difference and if there's something we want to explore further here, but the thing is that this is just data for one week, as I mentioned. What if we load data for 70 days? You'd have a longer graph probably, but a lot of lines and would not really be able to make sense out of that data. It would be a really crowded plot for that, so we have to come up with a better way to be able to explore that and we'll come back to that in a little bit. So, what are some other things that we can do? We can get some statistics, we can take one sample flow and look at some of the values here. We see that the forward column here and ToS column here, they have zero values, and when we explore further we see that there's a lot of values here or records here for which these columns are essentially zero, so probably not really helpful for our use case. Then, we can look at the flow end. 
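The load-and-deduplicate steps just described map to SQL roughly as follows; the file path, parser choice, and table names are illustrative.

-- Land data of unknown structure in a flex table first.
CREATE FLEX TABLE raw_flows();
COPY raw_flows FROM '/data/week1_flows.csv' PARSER fcsvparser();

-- Once the columns are known, keep only distinct rows.
CREATE TABLE uniq_flows AS
SELECT DISTINCT * FROM flows;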
So, flow end is the end time when the last packet in a flow was sent and you can do a select min flow and max flow to see the data when it started and when it ended, and you can see it's about one week's of data for the first til eighth. Now, we also want to look at the data whether it's balanced or not because balanced data is really important for a lot of classification use cases that we want to try with this and you can see that source address, destination address, source port, and destination port, and you see it's highly in balanced data and so is versus destination address space, so probably something that we need to do, really powerful Vertica balancing functions that you can use within, and just sampling, over-sampling, or hybrid sampling here and that can be really useful here. Another thing we can look at is there's so many statistics of these columns, so off the unique flows table that we created we just use the summarize num call function in Vertica and it gives us a lot of really cool (mumbling) and percentile information on that. Now, if we look at the duration, which is the last record here, we can see that the mean is about 4.6 seconds, but when we look at the percentile information, we see that the median is about 0.27. So, there's a lot of short flows that have duration less than 0.27 seconds. Yes, there would be more and they'd probably bring the mean to the 4.6 value, but then the number of short flows is probably pretty high. We can ask some other questions from the data about the features. We can look at the protocols here and look at the count. So, we see that most of the traffic that we have is for TCP and UDP, which is sort of expected for a data set like this, and then we want to look at what are the most popular network services here? So again, simply queue here, select destination port count, add in the information here. We get the destination port and count for each. So, we can see that most of the traffic here is web traffic, HTTP and HTTPS, followed by domain name resolution. So, let's explore some more. We can look at the label distributions. We see that the labels that are given with that because this is essentially data for which we already know whether something was an anomaly or not, record was anomaly or not, and creating our algorithm based on it. So, we see that there's this background label, a lot of records there, and then anomaly spam seems to be really high. There are anomaly UDB scans and SSS scams, as well. So, another question we can ask is among the SMTP flows, how labels are distributed, and we can say that anomaly spam is highest, and then comes the background spam. So, can we say out of this that SMTP flows, they are spams, and maybe we can build a model that actually answers that question for us? That can be one machine learning model that you can build out of this data set. Again, we can also verify the destination port of flows that were labeled as spam. So, you can expect port 25 for SMTP service here, and we can see that SMTP with destination port 25, you have a lot of counts here, but there are some other destination ports for which the count is really low, and essentially, when we're doing and analysis at this scale, these data points might not really be needed. So, as part of the data prep slash data cleaning we might want to get rid of these records here. So now, what we can do is going back to the graph that I showed earlier, we can try and plot the daily trends by aggregating them. 
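A sketch of the two functions referenced in this passage, SUMMARIZE_NUMCOL (the "summarize num call" mentioned above) and BALANCE; the column and relation names are illustrative, and the exact sampling-method token and argument order should be checked against the documentation for your version.

-- Summary statistics (mean, min, max, percentiles) for numeric columns.
SELECT SUMMARIZE_NUMCOL(duration, packets, bytes) OVER ()
FROM uniq_flows;

-- Produce a view with a rebalanced label distribution.
SELECT BALANCE('balanced_flows', 'uniq_flows', 'label', 'hybrid_sampling');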
Again, we take the unique flows, convert them into flow counts, and reduce them to a manageable number that we can then feed into one of the algorithms. Now, PCA, principal component analysis, is a really powerful algorithm in Vertica. What it essentially does is this: a lot of times when you have a large number of columns which may be highly correlated with each other, you can feed them into the PCA algorithm and it will give you a list of principal components which are linearly independent from each other. Each of these components explains a certain portion of the variance of the overall data set. You can see here that component one explains about 73.9% of the variance and component two explains about 16%, so those two components combined account for around 90% of the variance. You can use PCA for a lot of different purposes, but in this specific example we want to see what sort of information we get if we combine all the data points we have, aggregated by day of the week. Is there any insight this provides? Because once you have two data points per day, it's really easy to plot them. So we apply PCA: we first fit the model and then apply it to our data set, and this is the graph we get as a result. Component one is on the x axis, component two is on the y axis, and each of these points represents a day of the week. With just two components it's easy to plot, and compared to the graph we saw earlier, which had a lot of lines and would only have gained more lines as we added more days or weeks, in this graph you can clearly tell that the five weekdays, Monday through Friday, are closely clustered together, so they are probably pretty similar to each other, while Saturday traffic sits apart from all of those days and is also further away from Sunday. So those two days of traffic are different from the other days, and we can always dive deeper, look at exactly what's happening there, and see how that traffic actually differs, but with just a few functions and some pretty simple SQL queries, we were already able to get a pretty good insight from the data set we had. Now, let's move on to the next part of this talk, on importing and exporting PMML models to and from Vertica. The current common practice when putting machine learning models into production is that you have a dev or test environment in which you might be using a lot of different tools, scikit-learn, Spark, R, and once you want to deploy those models into production, you put them into containers. There is a pool of containers in the production environment talking to your database, which could be your analytical database, and all of the new incoming data lands in the database itself. So, as I mentioned in one of the earlier slides, there is a lot of data transfer happening between that pool of containers hosting your trained machine learning models and the database, from which you get the data for scoring and to which you then send the scores back. So, why would you really need to transfer your models? The thing is that no machine learning platform provides everything.
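The fit-then-apply PCA workflow just described corresponds roughly to the SQL below; the model, relation, and column names are illustrative, and the exact parameter names should be verified against the documentation for your version.

-- Fit a PCA model on the per-day aggregated flow counts.
SELECT PCA('flows_pca', 'daily_flow_counts', '*'
           USING PARAMETERS exclude_columns='day_of_week');

-- Project each day onto the first two principal components.
SELECT APPLY_PCA(* USING PARAMETERS model_name='flows_pca',
                                    exclude_columns='day_of_week',
                                    num_components=2) OVER ()
FROM daily_flow_counts;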
There might be some really cool algorithms that might compromise, but then Spark might have its own benefits in terms of some additional algorithms or some other stuff that you're looking at and that's the reason why a lot of these tools might be used in the same company at the same time, and then there might be some functional considerations, as well. You might want to isolate your data between data science team and your production environment, and you might want to score your pre-trained models on some S nodes here. You cannot host probably a big solution, so there is a whole lot of use cases where model movement or model transfer from one tool to another makes sense. Now, one of the common methods for transferring models from one tool to another is the PMML standard. It's an XML-based model exchange format, sort of a standard way to define statistical and data mining models, and helps you share models between the different applications that are PMML compliant. Really popular tool, and that's the tool of choice that we have for moving models to and from Vertica. Now, with this model management, this model movement capability, there's a lot of model management capabilities that Vertica offers. So, models are essentially first class citizens of Vertica. What that means is that each model is associated with a DB schema, so the user that initially creates a model, that's the owner of it, but he can transfer the ownership to other users, he can work with the ownership rights in any way that you would work with any other relation in a database would be. So, the same commands that you use for granting access to a model, changing its owner, changing its name, or dropping it, you can use similar commands for more of this one. There are a lot of functions for exploring the contents of models and that really helps in putting these models into production. The metadata of these models is also available for model management and governance, and finally, the import/export part enables you to apply all of these operations to the model that you have imported or you might want to export while they're in the database, and I think it would be nice to actually go through and example to showcase some of these capabilities in our model management, including the PMML model import and export. So, the workflow for export would be that we trained some data, we'll train a logistic regression model, and we'll save it as an in-DB Vertica model. Then, we'll explore the summary and attributes of the model, look at what's inside the model, what the training parameters are, concoctions and stuff, and then we can export the model as PMML and an external tool can import that model from PMML. And similarly, we'll go through and example for export. We'll have an external PMML model trained outside of Vertica, we'll import that PMML model and from there on, essentially, we'll treat it as an in-DB PMML model. We'll explore the summary and attribute of the model in much the same way as in in-DB model. We'll apply the model for in-DB scoring and get the prediction results, and finally, we'll bring some test data. We'll use that on test data for which the scoring needs to be done. So first, we want to create a connection with the database. In this case, we are using a Python Jupyter Notebook. 
We have the Vertica Python connector here that you can use, a really powerful connector that allows you to do a lot of cool stuff against the database using the Jupyter front end, but essentially, you can use any other SQL front end tool or, for that matter, any other Python IDE which lets you connect to the database. So, exporting a model. First, we'll create a logistic regression model here: select logistic regression, we give it a model name, the input relation, which might be a table, temp table, or view, the response column, and the predictor columns. So, we get a logistic regression model that we built. Now, we look at the models table and see that the model has been created. This is a table in Vertica that contains a list of all the models that are there in the database. So, we can see here that myModel, which we just created, has Vertica models as its category, the model type is logistic regression, and we have some other metadata around this model as well. So now, we can look at some of the summary statistics of the model. We can look at the details, which give us the predictors, coefficients, standard error, Z value, and P value. We can look at the regularization parameters. We didn't use any, so that shows a value of one, but if you had used regularization, it would show up here, along with the call string and also additional information regarding iteration count, rejected row count, and accepted row count. Now, we can also look at the list of attributes of the model. So, select get model attribute using parameter model name myModel. For this particular model that we just created, it gives us the names of all the attributes that are there. Similarly, you can look at the coefficients of the model in a column format. So, using parameter model name myModel, and in this case we add attribute name equals details because we want all the details for that particular model, and we get the predictor name, coefficient, standard error, Z value, and P value here. So now, what we can do is export this model. We use select export models and we give it a path to where we want the model to be exported, we give it the name of the model that needs to be exported, because you might have a lot of models that you have created, and we give it the category, which in our example is PMML, and you get a status message here that the export has been successful. So now, let's move on to the importing models example. In much the same way that we created a model in Vertica and exported it out, you might want to create a model outside of Vertica in another tool and then bring it to Vertica for scoring, because Vertica contains all of the data and it might make sense to host that model in Vertica, since scoring happens a lot more quickly than model training. So, in this particular case we do a select import models and we are importing a logistic regression model that was created in Spark. The category here again is PMML. So, we get the status message that the import was successful. Now, let's look at the models table and see that the model is really present there. Previously when we ran this query we had only myModel there, so that was the only entry you saw, but now once this model is imported you can see it as line item number two here, Spark logistic regression, in the public schema.
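Here is a hedged SQL sketch of the export workflow just walked through. The training table and its columns (patients, second_attack, treatment, trait_anxiety) and the export path are illustrative assumptions; the function calls follow the in-database ML steps described above.

-- 1. Train an in-DB logistic regression model.
SELECT LOGISTIC_REG('myModel', 'patients', 'second_attack',
                    'treatment, trait_anxiety');

-- 2. Confirm it appears in the models table and inspect it.
SELECT model_name, category, model_type
FROM v_catalog.models
WHERE model_name = 'myModel';

SELECT GET_MODEL_SUMMARY(USING PARAMETERS model_name='myModel');
SELECT GET_MODEL_ATTRIBUTE(USING PARAMETERS model_name='myModel');      -- list the attributes
SELECT GET_MODEL_ATTRIBUTE(USING PARAMETERS model_name='myModel',
                           attr_name='details');                        -- coefficients, std error, z, p

-- 3. Export the model as PMML to a directory on the initiator node.
SELECT EXPORT_MODELS('/home/dbadmin/exported_models', 'myModel'
                     USING PARAMETERS category='PMML');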
The category here, however, is different, because it's not a natively created Vertica model but rather an imported model, so you get PMML here, and then other metadata regarding the model as well. Now, let's do some of the same operations that we did with the in-DB model, so we can look at the summary of the imported PMML model. You can see the function name, data fields, predictors, and some additional information here. Moving on, let's look at the attributes of the PMML model with select get model attribute, essentially the same query that we applied earlier; the only difference here is the model name. So, you get the attribute names, attribute fields, and number of rows. We can also look at the coefficients of the PMML model: name, exponent, and coefficient here. So yeah, pretty much similar to what you can do with an in-DB model. You can also perform all these operations on an imported model, and one additional thing we'd want to do here is to use this imported model for prediction. So in this case, we do a select predict PMML and give it some values, using parameters model name, the Spark logistic regression model, and match by position, which is a really cool feature and is set to true in this case. So, if you have a model being imported from another platform in which, let's say, you have 50 columns, the names of the columns in the environment in which you trained the model might be slightly different from the names of the columns that you have set up in Vertica, but as long as the order is the same, Vertica can actually match those columns by position and you don't need to have the exact same names for those columns. So in this case, we have set that to true and we see that predict PMML gives us a result of one. Now, using the imported model, in this case we had a certain value that we gave it, but you can also use it on a table. In that case, you also get the predictions here and you can look at the (mumbling) metrics to see how well you did. Now, just sort of wrapping this up, it's really important to know the distinction between using your models in any single node tool that you might already be using, like Python or R, versus Vertica. Let's say you build a model in Python. It might be a single node solution. Now, after building that model, let's say you want to do prediction on really large amounts of data and you don't want to go through the overhead of having to move that data out of the database every time you want to do prediction. So, what you can do is import that model into Vertica, and what Vertica does differently than Python is that the PMML model would actually be distributed across each node in the cluster, so it would be applied on the data segments in each of those nodes and there might be different threads running for that prediction. So, the speed that you get here for prediction would be much, much faster. Similarly, once you build a model for machine learning in Vertica, the objective mostly is that you want to use all of your data and build a model that's accurate, not just using a sample of the data but using all the data that's available to it, essentially. So, you can build that model, and the model building process would again go through the same technique: it would actually be distributed across all nodes in the cluster, and it would be using all the threads and processes available to it within those nodes.
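And a hedged sketch of the import-and-score side; the file path, model name (spark_logistic_reg), and the scoring table and columns are assumptions made up for illustration.

-- 1. Import a PMML model that was trained outside Vertica (for example, in Spark).
SELECT IMPORT_MODELS('/home/dbadmin/pmml/spark_logistic_reg'
                     USING PARAMETERS category='PMML');

-- 2. It now shows up in the models table alongside the in-DB models,
--    with PMML as its category.
SELECT model_name, schema_name, category FROM v_catalog.models;

-- 3. Score new data in-database; match_by_pos matches columns by order
--    rather than by name, as described above.
SELECT id,
       PREDICT_PMML(col1, col2, col3
                    USING PARAMETERS model_name='spark_logistic_reg',
                                     match_by_pos=true) AS prediction
FROM new_data;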
So, really fast model training, but let's say you wanted to deploy it on an edge node and maybe do prediction closer to where the data is being generated; you can export that model in PMML format and deploy it on the edge node. So, it's really helpful for a lot of use cases. And just some closing takeaways from our discussion today. Vertica is a really powerful tool for machine learning, for data preparation, model training, prediction, and deployment. You might want to use Vertica for all of these steps or only some of them; either way, Vertica supports both approaches. In the upcoming releases, we are planning to have more import and export capability through PMML models. Initially, we're supporting k-means, linear, and logistic regression, but we'll keep on adding more algorithms, and the plan is to eventually move to supporting custom models. If you want to do that with the upcoming release, our TensorFlow integration is there which you can use, but with PMML, this is the starting point for us and we'll keep on improving it. Vertica models can be exported in PMML format for scoring on other platforms, and similarly, models that get built in other tools can be imported for in-DB machine learning and in-DB scoring within Vertica. There are a lot of critical model management tools that are provided in Vertica and there are a lot more on the roadmap as well, which we'll keep on developing. Many ML functions and algorithms are already part of the in-DB library and we keep on adding to that, as well. So, thank you so much for joining the discussion today and if you have any questions we'd love to take them now. Back to you, Sue.

Published Date : Mar 30 2020


Vertica in Eon Mode: Past, Present, and Future


 

>> Paige: Hello everybody and thank you for joining us today for the virtual Vertica BDC 2020. Today's breakout session is entitled Vertica in Eon Mode past, present and future. I'm Paige Roberts, open source relations manager at Vertica and I'll be your host for this session. Joining me is Vertica engineer, Yuanzhe Bei and Vertica Product Manager, David Sprogis. Before we begin, I encourage you to submit questions or comments during the virtual session. You don't have to wait till the end. Just type your question or comment as you think of it in the question box, below the slides and click Submit. Q&A session at the end of the presentation. We'll answer as many of your questions as we're able to during that time, and any questions that we don't address, we'll do our best to answer offline. If you wish after the presentation, you can visit the Vertica forums to post your questions there and our engineering team is planning to join the forums to keep the conversation going, just like a Dev Lounge at a normal in person, BDC. So, as a reminder, you can maximize your screen by clicking the double arrow button in the lower right corner of the slides, if you want to see them bigger. And yes, before you ask, this virtual session is being recorded and will be available to view on demand this week. We are supposed to send you a notification as soon as it's ready. All right, let's get started. Over to you, Dave. >> David: Thanks, Paige. Hey, everybody. Let's start with a timeline of the life of Eon Mode. About two years ago, a little bit less than two years ago, we introduced Eon Mode on AWS. Pretty specifically for the purpose of rapid scaling to meet the cloud economics promise. It wasn't long after that we realized that workload isolation, a byproduct of the architecture was very important to our users and going to the third tick, you can see that the importance of that workload isolation was manifest in Eon Mode being made available on-premise using Pure Storage FlashBlade. Moving to the fourth tick mark, we took steps to improve workload isolation, with a new type of subcluster which Yuanzhe will go through and to the fifth tick mark, the introduction of secondary subclusters for faster scaling and other improvements which we will cover in the slides to come. Getting started with, why we created Eon Mode in the first place. Let's imagine that your database is this pie, the pecan pie and we're loading pecan data in through the ETL cutting board in the upper left hand corner. We have a couple of free floating pecans, which we might imagine to be data supporting external tables. As you know, the Vertica has a query engine capability as well which we call external tables. And so if we imagine this pie, we want to serve it with a number of servers. Well, let's say we wanted to serve it with three servers, three nodes, we would need to slice that pie into three segments and we would serve each one of those segments from one of our nodes. Now because the data is important to us and we don't want to lose it, we're going to be saving that data on some kind of raid storage or redundant storage. In case one of the drives goes bad, the data remains available because of the durability of raid. Imagine also, that we care about the availability of the overall database. Imagine that a node goes down, perhaps the second node goes down, we still want to be able to query our data and through nodes one and three, we still have all three shards covered and we can do this because of buddy projections. 
Each neighbor, each nodes neighbor contains a copy of the data from the node next to it. And so in this case, node one is sharing its segment with node two. So node two can cover node one, node three can cover node two and node one back to node three. Adding a little bit more complexity, we might store the data in different copies, each copy sorted for a different kind of query. We call this projections in Vertica and for each projection, we have another copy of the data sorted differently. Now it gets complex. What happens when we want to add a node? Well, if we wanted to add a fourth node here, what we would have to do, is figure out how to re-slice all of the data in all of the copies that we have. In effect, what we want to do is take our three slices and slice it into four, which means taking a portion of each of our existing thirds and re-segmenting into quarters. Now that looks simple in the graphic here, but when it comes to moving data around, it becomes quite complex because for each copy of each segment we need to replace it and move that data on to the new node. What's more, the fourth node can't have a copy of itself that would be problematic in case it went down. Instead, what we need is we need that buddy to be sitting on another node, a neighboring node. So we need to re-orient the buddies as well. All of this takes a lot of time, it can take 12, 24 or even 36 hours in a period when you do not want your database under high demand. In fact, you may want to stop loading data altogether in order to speed it up. This is a planned event and your applications should probably be down during this period, which makes it difficult. With the advent of cloud computing, we saw that services were coming up and down faster and we determined to re-architect Vertica in a way to accommodate that rapid scaling. Let's see how we did it. So let's start with four nodes now and we've got our four nodes database. Let's add communal storage and move each of the segments of data into communal storage. Now that's the separation that we're talking about. What happens if we run queries against it? Well, it turns out that the communal storage is not necessarily performing and so the IO would be slow, which would make the overall queries slow. In order to compensate for the low performance of communal storage, we need to add back local storage, now it doesn't have to be raid because this is just an ephemeral copy but with the data files, local to the node, the queries will run much faster. In AWS, communal storage really does mean an S3 bucket and here's a simplified version of the diagram. Now, do we need to store all of the data from the segment in the depot? The answer is no and the graphics inside the bucket has changed to reflect that. It looks more like a bullseye, showing just a segment of the data being copied to the cache or to the depot, as we call it on each one of the nodes. How much data do you store on the node? Well, it would be the active data set, the last 30 days, the last 30 minutes or the last. Whatever period of time you're working with. The active working set is the hot data and that's how large you want to size your depot. By architecting this way, when you scale up, you're not re-segmenting the database. What you're doing, is you're adding more compute and more subscriptions to the existing shards of the existing database. So in this case, we've added a complete set of four nodes. 
So we've doubled our capacity and we've doubled our subscriptions, which means that now, the two nodes can serve the yellow shard, two nodes can serve the red shard and so on. In this way, we're able to run twice as many queries in the same amount of time. So you're doubling the concurrency. How high can you scale? Well, can you scale to 3X, 5X? We tested this in the graphics on the right, which shows concurrent users in the X axis by the number of queries executed in a minute along the Y axis. We've grouped execution in runs of 10 users, 30 users, 50, 70 up to 150 users. Now focusing on any one of these groups, particularly up around 150. You can see through the three bars, starting with the bright purple bar, three nodes and three segments. That as you add nodes to the middle purple bar, six nodes and three segments, you've almost doubled your throughput up to the dark purple bar which is nine nodes and three segments and our tests show that you can go to 5X with pretty linear performance increase. Beyond that, you do continue to get an increase in performance but your incremental performance begins to fall off. Eon architecture does something else for us and that is it provides high availability because each of the nodes can be thought of as ephemeral and in fact, each node has a buddy subscription in a way similar to the prior architecture. So if we lose node four, we're losing the node responsible for the red shard and now node one has to pick up responsibility for the red shard while that node is down. When a query comes in, and let's say it comes into one and one is the initiator then one will look for participants, it'll find a blue shard and a green shard but when it's looking for the red, it finds itself and so the node number one will be doing double duty. This means that your performance will be cut in half approximately, for the query. This is acceptable until you are able to restore the node. Once you restore it and once the depot becomes rehydrated, then your performance goes back to normal. So this is a much simpler way to recover nodes in the event of node failure. By comparison, Enterprise Mode the older architecture. When we lose the fourth node, node one takes over responsibility for the first shard and the yellow shard and the red shard. But it also is responsible for rehydrating the entire data segment of the red shard to node four, this can be very time consuming and imposes even more stress on the first node. So performance will go down even further. Eon Mode has another feature and that is you can scale down completely to zero. We call this hibernation, you shut down your database and your database will maintain full consistency in a rest state in your S3 bucket and then when you need access to your database again, you simply recreate your cluster and revive your database and you can access your database once again. That concludes the rapid scaling portion of, why we created Eon Mode. To take us through workload isolation is Yuanzhe Bei, Yuanzhe. >> Yuanzhe: Thanks Dave, for presenting how Eon works in general. In the next section, I will show you another important capability of Vertica Eon Mode, the workload isolation. Dave used a pecan pie as an example of database. Now let's say it's time for the main course. Does anyone still have a problem with food touching on their plates. Parents know that it's a common problem for kids. Well, we have a similar problem in database as well. 
So there could be multiple different workloads accessing your database at the same time. Say you have ETL jobs running regularly, while at the same time there are dashboards running short queries against your data. You may also have the end-of-month report running, and there can be ad hoc data scientists connecting to the database and doing whatever data analysis they want to do, and so on. How to make these mixed workload requests not interfere with each other is a real challenge for many DBAs. Vertica Eon Mode provides you the solution. I'm very excited here to introduce you to an important concept in Eon Mode called subclusters. In Eon Mode, nodes belong to predefined subclusters rather than the whole cluster. DBAs can define different subclusters for different kinds of workloads and redirect those workloads to the specific subclusters. For example, you can have an ETL subcluster, a dashboard subcluster, a report subcluster and an analytics machine learning subcluster. Vertica Eon subclusters are designed to achieve three main goals. First of all, strong workload isolation. That means any operation in one subcluster should not affect or be affected by other subclusters. For example, say the subcluster running the report is quite overloaded, and on top of that the data scientists are running crazy analytic jobs and machine learning jobs on the analytics subcluster, making it very slow, even stuck or crashed. In such a scenario, your ETL and dashboard subclusters should not be impacted, or at most be minimally impacted, by this crisis, which means your ETL jobs should not lag behind and dashboards should respond in a timely manner. We have done a lot of improvements as of the 10.0 release and will continue to deliver improvements in this category. Secondly, fully customized subcluster settings. That means any subcluster can be set up and tuned for very different workloads without affecting other subclusters. Users should be able to tune certain parameters up or down based on the actual needs of the individual subcluster's workload requirements. As of today, Vertica already supports a few settings that can be done at the subcluster level, for example the depot pinning policy, and we will continue extending more, like resource pools (mumbles), in the near future. Lastly, Vertica subclusters should be easy to operate and cost efficient. What that means is that subclusters should be able to be turned on, turned off, added or removed, and be available for use according to rapidly changing workloads. Let's say in this case you want to spin up more dashboard subclusters because you need more dashboard capacity; we can do that. You might need to run several report subclusters because you might want to run multiple reports at the same time, while on the other hand, you can shut down your analytics machine learning subcluster because no data scientists need to use it at this moment. So we have made a lot of improvements in this category, which I'll explain in detail later, and one of the ultimate goals is to support auto scaling. To sum up, what we really want to deliver for subclusters is very simple. You just need to remember that accessing subclusters should be just like accessing individual clusters. Well, these subclusters do share the same catalog, so you don't have to worry about stale data and don't need to worry about data synchronization.
That'd be a nice goal, and Vertica's upcoming 10.0 release is certainly a milestone towards that goal, which will deliver a large part of the capability in this direction, and then we will continue to improve it after the 10.0 release. In the next couple of slides, I will highlight some issues about workload isolation in the initial Eon release and show you how we resolved these issues. First issue: when we initially released our first, so-called subcluster mode, it was implemented using fault groups. Well, fault groups and subclusters have something in common: yes, they are both defined as a set of nodes. However, they are very different in all the other ways. So, that was very confusing in the first place, when we implemented this. As of the 9.3.0 version, we decided to detach the subcluster definition from fault groups, which enabled us to further extend the capability of subclusters. Fault groups in the pre-9.3.0 versions will be converted into subclusters during the upgrade, and this was a very important step that enabled us to provide all the following amazing improvements on subclusters. The second issue in the past was that it was hard to control the execution groups for different types of workloads. There are two types of problems here and I will use some examples to explain. The first problem is about controlling group size. Say you allocate six nodes for your dashboard subcluster, and what you really want is, on the left, the three pairs of nodes acting as three execution groups, where each pair of nodes subscribes to all four shards. However, that's not really what you get. What you really get is shown on the right side: the first four nodes subscribe to one shard each and the remaining two nodes subscribe to two dangling shards. So you won't really get three execution groups but instead only get one, and the two extra nodes have no value at all. The solution is to use subclusters. So instead of having a subcluster with six nodes, you can split it up into three smaller ones. Each subcluster is guaranteed to subscribe to all the shards, and you can further handle these three subclusters using a load balancer across them. In this way you achieve the three real execution groups. The second problem is that session participation is non-deterministic. Any session will just pick four random nodes from the subcluster as long as they cover one shard each. In other words, you don't really know which set of nodes will make up your execution group. What's the problem? So in this case, the fourth node will be double booked by two concurrent sessions, and you can imagine that the resource usage will be imbalanced and both queries' performance will suffer. What is even worse is when the queries of the two concurrent sessions target different tables. They will cause the issue that depot efficiency is reduced, because both sessions will try to fetch the files of the two tables into the same depot, and if your depot is not large enough, they will evict each other, which will be very bad. You can solve this the same way, by declaring subclusters, in this case two subclusters and a load balancer group across them. The reason this solves the problem is that session participation does not go across the subcluster boundary.
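To illustrate the load balancer group idea above, here is a hedged SQL sketch; the subcluster names and group name are assumptions, and the exact clause syntax may differ by Vertica version.

-- Spread client sessions across three small dashboard subclusters.
CREATE LOAD BALANCE GROUP dashboard_lbg
    WITH SUBCLUSTER dashboard_sc_1, dashboard_sc_2, dashboard_sc_3
    FILTER '0.0.0.0/0'
    POLICY 'ROUNDROBIN';

-- Route incoming client connections to that group.
CREATE ROUTING RULE dashboard_clients
    ROUTE '0.0.0.0/0' TO dashboard_lbg;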
So there won't be a case where any node is double booked. And in terms of the depot, if you use subclusters and avoid using a load balancer group, and carefully send the first workload to the first subcluster and the second to the second subcluster, then the result is that depot isolation is achieved. The first subcluster will maintain the data files for the first query and you don't need to worry about the files being evicted by the second kind of session. Here comes the next issue: scaling down. In the old way of defining subclusters, you may have several execution groups in the subcluster and you want to shut down one or two execution groups to save cost. Well, here comes the pain, because you don't know which nodes may be used by which session at any point, so it is hard to find the right timing to hit the shutdown button on any of the instances. And if you do and get unlucky, say in this case you pull the first four nodes, one of the sessions will fail because it's participating on node two and node four at that point. The user of that session will notice because their query fails, and we know that for many businesses this is a critical problem and not acceptable. Again, with subclusters this problem is resolved. Same reason: sessions cannot go across the subcluster boundary. So all you need to do is first prevent queries from being sent to the first subcluster, and then you can shut down the instances in that subcluster. You are guaranteed to not break any running sessions. Now, you're happy and you want to shut down more subclusters, and then you hit issue four: the whole cluster will go down. Why? Because the cluster loses quorum. As a distributed system, you need to have more than half of the nodes up in order to commit and keep the cluster up. This is to prevent catalog divergence from happening, which is important. But you still want to shut down those nodes, because what's the point of keeping those nodes up if you are not using them, and letting them cost you money, right? So Vertica has a solution: you can define a subcluster as secondary to allow it to shut down without worrying about quorum. In this case, you can define the first three subclusters as secondary and the fourth one as primary. By doing so, the secondary subclusters will not be counted towards the quorum, because we changed the rule. Now instead of requiring more than half of the nodes to be up, it only requires more than half of the primary nodes to be up. Now you can shut down your second subcluster and even shut down your third subcluster as well and keep the remaining primary subcluster still running healthily. There are actually more benefits to defining secondary subclusters beyond the quorum concern. Because the secondary subclusters no longer have the voting power, they don't need to persist the catalog anymore. This means those nodes are faster to deploy, and can be dropped and re-added without worrying about catalog persistency. For subclusters that only need to run read-only queries, it's best practice to define them as secondary. The commit will be faster on these secondary subclusters as well, so queries running on a secondary subcluster will have fewer spikes. The primary subcluster, as usual, handles everything and is responsible for consistency, and the background tasks will be running there. So DBAs should make sure that the primary subcluster is stable and running all the time. Of course, you need at least one primary subcluster in your database.
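A hedged sketch of how the primary/secondary split above can be managed in SQL; the subcluster names are assumptions, and these functions assume an Eon Mode database on a recent enough version.

-- See which subclusters exist and which ones count towards quorum.
SELECT subcluster_name, node_name, is_primary
FROM v_catalog.subclusters;

-- Demote read-only subclusters to secondary so they can be stopped
-- without affecting quorum.
SELECT DEMOTE_SUBCLUSTER_TO_SECONDARY('dashboard_sc_1');
SELECT DEMOTE_SUBCLUSTER_TO_SECONDARY('analytics_sc');

-- Promote one back to primary if its role changes.
SELECT PROMOTE_SUBCLUSTER_TO_PRIMARY('analytics_sc');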
Now with secondary subclusters, users can start and stop them as they need, which is very convenient, and this further brings up another issue: what if there's an ETL transaction running and in the middle of it a subcluster starts and comes up? In older versions, there was no catalog resync mechanism to keep the new subcluster up to date, so Vertica rolled back the ETL session to keep the data consistent. This is actually quite disruptive, because real world ETL workloads can sometimes take hours, and rolling back at the end means a large waste of resources. We resolved this issue in the 9.3.1 version by introducing a catalog resync mechanism for when such a situation happens. ETL transactions will not roll back anymore; instead they will take some time to resync the catalog and commit, and the problem is resolved. And the last issue I would like to talk about is the subscription. Especially for a large subcluster, when you start it the startup time is quite long, because the subscription commit used to be serialized. In one of our internal tests with large catalogs, committing a subscription can take, you can imagine, five minutes. A secondary subcluster is better, because it doesn't need to persist the catalog during the commit, but it still takes about two seconds to commit. So what's the problem here? Let's do the math and look at this chart. The X axis is the time in minutes and the Y axis is the number of nodes to be subscribed. The dark blue represents your primary subcluster and the light blue represents the secondary subcluster. Let's say the subcluster has 16 nodes in total; if you start a secondary subcluster, it will spend about 30 seconds in total, because 2 seconds times 16 is 32. That's not actually that long a time, but if you imagine that when starting a secondary subcluster you expect it to be super fast, to react to the fast changing workload, then 30 seconds is no longer trivial anymore, and what is even worse is on the primary subcluster side. Because the commit is much longer, let's assume five minutes, then by the time you are committing the sixth node's subscription, all the other nodes have already waited for 30 minutes for the GCLX, or as we know it, the global catalog lock, and Vertica will crash a node if it cannot get the GCLX for 30 minutes. So the end result is that your whole database crashes. That's a serious problem and we know that, and that's why we have already planned the fix for 10.0, so that all the subscriptions will be batched up and all the nodes will commit at the same time, concurrently. And by doing that, you can imagine the primary subcluster can finish committing in five minutes instead of crashing, and the secondary subcluster can finish even in seconds. That summarizes the highlights of the improvements we have made as of 10.0, and I hope you are already getting excited about the emerging Eon deployment pattern that's shown here. A primary subcluster that handles data loading, ETL jobs and tuple mover jobs is the backbone of the database, and you keep it running all the time. At the same time, you define different secondary subclusters for different workloads, provision them when the workload requirement arrives and then de-provision them when the workload is done, to save on operational cost. So if you can't wait to play with subclusters, here are some Admin Tools commands you can start using, and check out our Eon subcluster documentation for more details. Thanks everyone for listening, and I'll hand it back to Dave to talk about Eon on-prem.
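For reference, here is a hedged sketch of the kind of Admin Tools commands referred to above. The tool names and especially the flags are assumptions that should be checked against your version's admintools help, and the database, host, and subcluster names are made up.

# Add a secondary subcluster on three new hosts for dashboard queries.
admintools -t db_add_subcluster -d mydb -c dashboard_sc_1 \
           -s 10.0.0.7,10.0.0.8,10.0.0.9 --is-secondary

# Stop it when the workload goes away, restart it when needed again.
admintools -t stop_subcluster -d mydb -c dashboard_sc_1
admintools -t restart_subcluster -d mydb -c dashboard_sc_1

# Remove it entirely without touching the rest of the cluster.
admintools -t db_remove_subcluster -d mydb -c dashboard_sc_1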
>> David: Thanks Yuanzhe. At the same time that Yuanzhe and the rest of the dev team were working on the improvements that Yuanzhe described in and other improvements. This guy, John Yovanovich, stood on stage and told us about his deployment at at&t where he was running Eon Mode on-prem. Now this was only six months after we had launched Eon Mode on AWS. So when he told us that he was putting it into production on-prem, we nearly fell out of our chairs. How is this possible? We took a look back at Eon and determined that the workload isolation and the improvement to the operations for restoring nodes and other things had sufficient value that John wanted to run it on-prem. And he was running it on the Pure Storage FlashBlade. Taking a second look at the FlashBlade we thought alright well, does it have the performance? Yes, it does. The FlashBlade is a collection of individual blades, each one of them with NVMe storage on it, which is not only performance but it's scalable and so, we then asked is it durable? The answer is yes. The data safety is implemented with the N+2 redundancy which means that up to two blades can fail and the data remains available. And so with this we realized DBAs can sleep well at night, knowing that their data is safe, after all Eon Mode outsources the durability to the communal storage data store. Does FlashBlade have the capacity for growth? Well, yes it does. You can start as low as 120 terabytes and grow as high as about eight petabytes. So it certainly covers the range for most enterprise usages. And operationally, it couldn't be easier to use. When you want to grow your database. You can simply pop new blades into the FlashBlade unit, and you can do that hot. If one goes bad, you can pull it out and replace it hot. So you don't have to take your data store down and therefore you don't have to take Vertica down. Knowing all of these things we got behind Pure Storage and partnered with them to implement the first version of Eon on-premise. That changed our roadmap a little bit. We were imagining it would start with Amazon and then go to Google and then to Azure and at some point to Alibaba cloud, but as you can see from the left column, we started with Amazon and went to Pure Storage. And then from Pure Storage, we went to Minio and we launched Eon Mode on Minio at the end of last year. Minio is a little bit different than Pure Storage. It's software only, so you can run it on pretty much any x86 servers and you can cluster them with storage to serve up an S3 bucket. It's a great solution for up to about 120 terabytes Beyond that, we're not sure about performance implications cause we haven't tested it but for your dev environments or small production environments, we think it's great. With Vertica 10, we're introducing Eon Mode on Google Cloud. This means not only running Eon Mode in the cloud, but also being able to launch it from the marketplace. We're also offering Eon Mode on HDFS with version 10. If you have a Hadoop environment, and you want to breathe new fresh life into it with the high performance of Vertica, you can do that starting with version 10. Looking forward we'll be moving Eon mode to Microsoft Azure. We expect to have something breathing in the fall and offering it to select customers for beta testing and then we expect to release it sometime in 2021 Following that, further on horizon is Alibaba cloud. 
Now, to be clear we will be putting, Vertica in Enterprise Mode on Alibaba cloud in 2020 but Eon Mode is going to trail behind whether it lands in 2021 or not, we're not quite sure at this point. Our goal is to deliver Eon Mode anywhere you want to run it, on-prem or in the cloud, or both because that is one of the great value propositions of Vertica is the hybrid capability, the ability to run in both your on prem environment and in the cloud. What's next, I've got three priority and roadmap slides. This is the first of the three. We're going to start with improvements to the core of Vertica. Starting with query crunching, which allows you to run long running queries faster by getting nodes to collaborate, you'll see that coming very soon. We'll be making improvements to large clusters and specifically large cluster mode. The management of large clusters over 60 nodes can be tedious. We intend to improve that. In part, by creating a third network channel to offload some of the communication that we're now loading onto our spread or agreement protocol. We'll be improving depot efficiency. We'll be pushing down more controls to the subcluster level, allowing you to control your resource pools at the subcluster level and we'll be pairing tuple moving with data loading. From an operational flexibility perspective, we want to make it very easy to shut down and revive primaries and secondaries on-prem and in the cloud. Right now, it's a little bit tedious, very doable. We want to make it as easy as a walk in the park. We also want to allow you to be able to revive into a different size subcluster and last but not least, in fact, probably the most important, the ability to change shard count. This has been a sticking point for a lot of people and it puts a lot of pressure on the early decision of how many shards should my database be? Whether it's in 2020 or 2021. We know it's important to you so it's important to us. Ease of use is also important to us and we're making big investments in the management console, to improve managing subclusters, as well as to help you manage your load balancer groups. We also intend to grow and extend Eon Mode to new environments. Now we'll take questions and answers

Published Date : Mar 30 2020


Vertica Big Data Conference Keynote


 

>> Joy: Welcome to the Virtual Big Data Conference. Vertica is so excited to host this event. I'm Joy King, and I'll be your host for today's Big Data Conference Keynote Session. It's my honor and my genuine pleasure to lead Vertica's product and go-to-market strategy. And I'm so lucky to have a passionate and committed team who turned our Vertica BDC event, into a virtual event in a very short amount of time. I want to thank the thousands of people, and yes, that's our true number who have registered to attend this virtual event. We were determined to balance your health, safety and your peace of mind with the excitement of the Vertica BDC. This is a very unique event. Because as I hope you all know, we focus on engineering and architecture, best practice sharing and customer stories that will educate and inspire everyone. I also want to thank our top sponsors for the virtual BDC, Arrow, and Pure Storage. Our partnerships are so important to us and to everyone in the audience. Because together, we get things done faster and better. Now for today's keynote, you'll hear from three very important and energizing speakers. First, Colin Mahony, our SVP and General Manager for Vertica, will talk about the market trends that Vertica is betting on to win for our customers. And he'll share the exciting news about our Vertica 10 announcement and how this will benefit our customers. Then you'll hear from Amy Fowler, VP of strategy and solutions for FlashBlade at Pure Storage. Our partnership with Pure Storage is truly unique in the industry, because together modern infrastructure from Pure powers modern analytics from Vertica. And then you'll hear from John Yovanovich, Director of IT at AT&T, who will tell you about the Pure Vertica Symphony that plays live every day at AT&T. Here we go, Colin, over to you. >> Colin: Well, thanks a lot joy. And, I want to echo Joy's thanks to our sponsors, and so many of you who have helped make this happen. This is not an easy time for anyone. We were certainly looking forward to getting together in person in Boston during the Vertica Big Data Conference and Winning with Data. But I think all of you and our team have done a great job, scrambling and putting together a terrific virtual event. So really appreciate your time. I also want to remind people that we will make both the slides and the full recording available after this. So for any of those who weren't able to join live, that is still going to be available. Well, things have been pretty exciting here. And in the analytic space in general, certainly for Vertica, there's a lot happening. There are a lot of problems to solve, a lot of opportunities to make things better, and a lot of data that can really make every business stronger, more efficient, and frankly, more differentiated. For Vertica, though, we know that focusing on the challenges that we can directly address with our platform, and our people, and where we can actually make the biggest difference is where we ought to be putting our energy and our resources. I think one of the things that has made Vertica so strong over the years is our ability to focus on those areas where we can make a great difference. So for us as we look at the market, and we look at where we play, there are really three recent and some not so recent, but certainly picking up a lot of the market trends that have become critical for every industry that wants to Win Big With Data. We've heard this loud and clear from our customers and from the analysts that cover the market. 
If I were to summarize these three areas, this really is the core focus for us right now. We know that there's massive data growth. And if we can unify the data silos so that people can really take advantage of that data, we can make a huge difference. We know that public clouds offer tremendous advantages, but we also know that balance and flexibility is critical. And we all need the benefit that machine learning for all the types up to the end data science. We all need the benefits that they can bring to every single use case, but only if it can really be operationalized at scale, accurate and in real time. And the power of Vertica is, of course, how we're able to bring so many of these things together. Let me talk a little bit more about some of these trends. So one of the first industry trends that we've all been following probably now for over the last decade, is Hadoop and specifically HDFS. So many companies have invested, time, money, more importantly, people in leveraging the opportunity that HDFS brought to the market. HDFS is really part of a much broader storage disruption that we'll talk a little bit more about, more broadly than HDFS. But HDFS itself was really designed for petabytes of data, leveraging low cost commodity hardware and the ability to capture a wide variety of data formats, from a wide variety of data sources and applications. And I think what people really wanted, was to store that data before having to define exactly what structures they should go into. So over the last decade or so, the focus for most organizations is figuring out how to capture, store and frankly manage that data. And as a platform to do that, I think, Hadoop was pretty good. It certainly changed the way that a lot of enterprises think about their data and where it's locked up. In parallel with Hadoop, particularly over the last five years, Cloud Object Storage has also given every organization another option for collecting, storing and managing even more data. That has led to a huge growth in data storage, obviously, up on public clouds like Amazon and their S3, Google Cloud Storage and Azure Blob Storage just to name a few. And then when you consider regional and local object storage offered by cloud vendors all over the world, the explosion of that data, in leveraging this type of object storage is very real. And I think, as I mentioned, it's just part of this broader storage disruption that's been going on. But with all this growth in the data, in all these new places to put this data, every organization we talk to is facing even more challenges now around the data silo. Sure the data silos certainly getting bigger. And hopefully they're getting cheaper per bit. But as I said, the focus has really been on collecting, storing and managing the data. But between the new data lakes and many different cloud object storage combined with all sorts of data types from the complexity of managing all this, getting that business value has been very limited. This actually takes me to big bet number one for Team Vertica, which is to unify the data. Our goal, and some of the announcements we have made today plus roadmap announcements I'll share with you throughout this presentation. Our goal is to ensure that all the time, money and effort that has gone into storing that data, all the data turns into business value. So how are we going to do that? 
With a unified analytics platform that analyzes the data wherever it is HDFS, Cloud Object Storage, External tables in an any format ORC, Parquet, JSON, and of course, our own Native Roth Vertica format. Analyze the data in the right place in the right format, using a single unified tool. This is something that Vertica has always been committed to, and you'll see in some of our announcements today, we're just doubling down on that commitment. Let's talk a little bit more about the public cloud. This is certainly the second trend. It's the second wave maybe of data disruption with object storage. And there's a lot of advantages when it comes to public cloud. There's no question that the public clouds give rapid access to compute storage with the added benefit of eliminating data center maintenance that so many companies, want to get out of themselves. But maybe the biggest advantage that I see is the architectural innovation. The public clouds have introduced so many methodologies around how to provision quickly, separating compute and storage and really dialing-in the exact needs on demand, as you change workloads. When public clouds began, it made a lot of sense for the cloud providers and their customers to charge and pay for compute and storage in the ratio that each use case demanded. And I think you're seeing that trend, proliferate all over the place, not just up in public cloud. That architecture itself is really becoming the next generation architecture for on-premise data centers, as well. But there are a lot of concerns. I think we're all aware of them. They're out there many times for different workloads, there are higher costs. Especially if some of the workloads that are being run through analytics, which tend to run all the time. Just like some of the silo challenges that companies are facing with HDFS, data lakes and cloud storage, the public clouds have similar types of siloed challenges as well. Initially, there was a belief that they were cheaper than data centers, and when you added in all the costs, it looked that way. And again, for certain elastic workloads, that is the case. I don't think that's true across the board overall. Even to the point where a lot of the cloud vendors aren't just charging lower costs anymore. We hear from a lot of customers that they don't really want to tether themselves to any one cloud because of some of those uncertainties. Of course, security and privacy are a concern. We hear a lot of concerns with regards to cloud and even some SaaS vendors around shared data catalogs, across all the customers and not enough separation. But security concerns are out there, you can read about them. I'm not going to jump into that bandwagon. But we hear about them. And then, of course, I think one of the things we hear the most from our customers, is that each cloud stack is starting to feel even a lot more locked in than the traditional data warehouse appliance. And as everybody knows, the industry has been running away from appliances as fast as it can. And so they're not eager to get locked into another, quote, unquote, virtual appliance, if you will, up in the cloud. They really want to make sure they have flexibility in which clouds, they're going to today, tomorrow and in the future. And frankly, we hear from a lot of our customers that they're very interested in eventually mixing and matching, compute from one cloud with, say storage from another cloud, which I think is something that we'll hear a lot more about. 
And so for us, that's why we've got our big bet number two. we love the cloud. We love the public cloud. We love the private clouds on-premise, and other hosting providers. But our passion and commitment is for Vertica to be able to run in any of the clouds that our customers choose, and make it portable across those clouds. We have supported on-premises and all public clouds for years. And today, we have announced even more support for Vertica in Eon Mode, the deployment option that leverages the separation of compute from storage, with even more deployment choices, which I'm going to also touch more on as we go. So super excited about our big bet number two. And finally as I mentioned, for all the hype that there is around machine learning, I actually think that most importantly, this third trend that team Vertica is determined to address is the need to bring business critical, analytics, machine learning, data science projects into production. For so many years, there just wasn't enough data available to justify the investment in machine learning. Also, processing power was expensive, and storage was prohibitively expensive. But to train and score and evaluate all the different models to unlock the full power of predictive analytics was tough. Today you have those massive data volumes. You have the relatively cheap processing power and storage to make that dream a reality. And if you think about this, I mean with all the data that's available to every company, the real need is to operationalize the speed and the scale of machine learning so that these organizations can actually take advantage of it where they need to. I mean, we've seen this for years with Vertica, going back to some of the most advanced gaming companies in the early days, they were incorporating this with live data directly into their gaming experiences. Well, every organization wants to do that now. And the accuracy for clickability and real time actions are all key to separating the leaders from the rest of the pack in every industry when it comes to machine learning. But if you look at a lot of these projects, the reality is that there's a ton of buzz, there's a ton of hype spanning every acronym that you can imagine. But most companies are struggling, do the separate teams, different tools, silos and the limitation that many platforms are facing, driving, down sampling to get a small subset of the data, to try to create a model that then doesn't apply, or compromising accuracy and making it virtually impossible to replicate models, and understand decisions. And if there's one thing that we've learned when it comes to data, prescriptive data at the atomic level, being able to show end of one as we refer to it, meaning individually tailored data. No matter what it is healthcare, entertainment experiences, like gaming or other, being able to get at the granular data and make these decisions, make that scoring applies to machine learning just as much as it applies to giving somebody a next-best-offer. But the opportunity has never been greater. The need to integrate this end-to-end workflow and support the right tools without compromising on that accuracy. Think about it as no downsampling, using all the data, it really is key to machine learning success. Which should be no surprise then why the third big bet from Vertica is one that we've actually been working on for years. And we're so proud to be where we are today, helping the data disruptors across the world operationalize machine learning. 
This big bet has the potential to truly unlock, really the potential of machine learning. And today, we're announcing some very important new capabilities specifically focused on unifying the work being done by the data science community, with their preferred tools and platforms, and the volume of data and performance at scale, available in Vertica. Our strategy has been very consistent over the last several years. As I said in the beginning, we haven't deviated from our strategy. Of course, there's always things that we add. Most of the time, it's customer driven, it's based on what our customers are asking us to do. But I think we've also done a great job, not trying to be all things to all people. Especially as these hype cycles flare up around us, we absolutely love participating in these different areas without getting completely distracted. I mean, there's a variety of query tools and data warehouses and analytics platforms in the market. We all know that. There are tools and platforms that are offered by the public cloud vendors, by other vendors that support one or two specific clouds. There are appliance vendors, who I was referring to earlier who can deliver package data warehouse offerings for private data centers. And there's a ton of popular machine learning tools, languages and other kits. But Vertica is the only advanced analytic platform that can do all this, that can bring it together. We can analyze the data wherever it is, in HDFS, S3 Object Storage, or Vertica itself. Natively we support multiple clouds on-premise deployments, And maybe most importantly, we offer that choice of deployment modes to allow our customers to choose the architecture that works for them right now. It still also gives them the option to change move, evolve over time. And Vertica is the only analytics database with end-to-end machine learning that can truly operationalize ML at scale. And I know it's a mouthful. But it is not easy to do all these things. It is one of the things that highly differentiates Vertica from the rest of the pack. It is also why our customers, all of you continue to bet on us and see the value that we are delivering and we will continue to deliver. Here's a couple of examples of some of our customers who are powered by Vertica. It's the scale of data. It's the millisecond response times. Performance and scale have always been a huge part of what we have been about, not the only thing. I think the functionality all the capabilities that we add to the platform, the ease of use, the flexibility, obviously with the deployment. But if you look at some of the numbers they are under these customers on this slide. And I've shared a lot of different stories about these customers. Which, by the way, it still amaze me every time I talk to one and I get the updates, you can see the power and the difference that Vertica is making. Equally important, if you look at a lot of these customers, they are the epitome of being able to deploy Vertica in a lot of different environments. Many of the customers on this slide are not using Vertica just on-premise or just in the cloud. They're using it in a hybrid way. They're using it in multiple different clouds. And again, we've been with them on that journey throughout, which is what has made this product and frankly, our roadmap and our vision exactly what it is. It's been quite a journey. And that journey continues now with the Vertica 10 release. The Vertica 10 release is obviously a massive release for us. 
But if you look back, you can see that building on that native columnar architecture that started a long time ago, obviously, with the C-Store paper. We built it to leverage that commodity hardware, because it was an architecture that was never tightly integrated with any specific underlying infrastructure. I still remember hearing the initial pitch from Mike Stonebreaker, about the vision of Vertica as a software only solution and the importance of separating the company from hardware innovation. And at the time, Mike basically said to me, "there's so much R&D in innovation that's going to happen in hardware, we shouldn't bake hardware into our solution. We should do it in software, and we'll be able to take advantage of that hardware." And that is exactly what has happened. But one of the most recent innovations that we embraced with hardware is certainly that separation of compute and storage. As I said previously, the public cloud providers offered this next generation architecture, really to ensure that they can provide the customers exactly what they needed, more compute or more storage and charge for each, respectively. The separation of compute and storage, compute from storage is a major milestone in data center architectures. If you think about it, it's really not only a public cloud innovation, though. It fundamentally redefines the next generation data architecture for on-premise and for pretty much every way people are thinking about computing today. And that goes for software too. Object storage is an example of the cost effective means for storing data. And even more importantly, separating compute from storage for analytic workloads has a lot of advantages. Including the opportunity to manage much more dynamic, flexible workloads. And more importantly, truly isolate those workloads from others. And by the way, once you start having something that can truly isolate workloads, then you can have the conversations around autonomic computing, around setting up some nodes, some compute resources on the data that won't affect any of the other data to do some things on their own, maybe some self analytics, by the system, etc. A lot of things that many of you know we've already been exploring in terms of our own system data in the product. But it was May 2018, believe it or not, it seems like a long time ago where we first announced Eon Mode and I want to make something very clear, actually about Eon mode. It's a mode, it's a deployment option for Vertica customers. And I think this is another huge benefit that we don't talk about enough. But unlike a lot of vendors in the market who will dig you and charge you for every single add-on like hit-buy, you name it. You get this with the Vertica product. If you continue to pay support and maintenance, this comes with the upgrade. This comes as part of the new release. So any customer who owns or buys Vertica has the ability to set up either an Enterprise Mode or Eon Mode, which is a question I know that comes up sometimes. Our first announcement of Eon was obviously AWS customers, including the trade desk, AT&T. Most of whom will be speaking here later at the Virtual Big Data Conference. They saw a huge opportunity. Eon Mode, not only allowed Vertica to scale elastically with that specific compute and storage that was needed, but it really dramatically simplified database operations including things like workload balancing, node recovery, compute provisioning, etc. 
So one of the most popular functions is that ability to isolate the workloads and really allocate those resources without negatively affecting others. And even though traditional data warehouses, including Vertica Enterprise Mode, have been able to do lots of different workload isolation, it's never been as strong as Eon Mode. Well, it certainly didn't take long for our customers to see that value across the board with Eon Mode, not just up in the cloud. In partnership with one of our most valued partners and a platinum sponsor here, who Joy mentioned at the beginning, we announced Vertica in Eon Mode for Pure Storage FlashBlade in September 2019. And again, just to be clear, this is not a new product, it's one Vertica with yet more deployment options. With Pure Storage, Vertica in Eon Mode is not limited in any way by variable cloud network latency. The performance is actually amazing when you take the benefits of separating compute from storage and you run it with a Pure environment on-premise. Vertica in Eon Mode has a super smart cache layer that we call the depot. It's a big part of our secret sauce around Eon Mode. And combined with the power and performance of Pure's FlashBlade, Vertica became the industry's first advanced analytics platform that actually separates compute and storage for on-premises data centers. Something that a lot of our customers are already benefiting from, and we're super excited about it. But as I said, this is a journey. We don't stop, we're not going to stop. Our customers need the flexibility of multiple public clouds. So today with Vertica 10, we're super proud and excited to announce support for Vertica in Eon Mode on Google Cloud. This gives our customers the ability to use their Vertica licenses on Amazon AWS, on-premise with Pure Storage, and on Google Cloud. Now, we were talking about HDFS, and a lot of our customers who have invested quite a bit in HDFS as a place to store data have been pushing us to support Eon Mode with HDFS. So as part of Vertica 10, we are also announcing support for Vertica in Eon Mode using HDFS as the communal storage. Vertica's own ROS-format data can be stored in HDFS, and the full functionality of Vertica, its complete analytics, geospatial, pattern matching, time series, machine learning, everything that we have in there, can be applied to this data. And on the same HDFS nodes, Vertica can also analyze data in ORC or Parquet format using external tables. We can also execute joins between the ROS data and the data the external tables hold, which powers a much more comprehensive view. So again, it's that flexibility to be able to support our customers wherever they need us to support them, on whatever platform they have. Vertica 10 gives us a lot more ways that we can deploy Eon Mode in various environments for our customers. It allows them to take advantage of Vertica in Eon Mode and the power that it brings with that separation, with that workload isolation, on whichever platform they are most comfortable with. Now, there's a lot that has come in Vertica 10, and I'm definitely not going to be able to cover everything. But we also introduced complex types, as an example. And complex data types fit very well into Eon and this separation. They significantly reduce the data pipeline and the cost of moving data between systems, they bring much better support for unstructured data, which a lot of our customers have mixed with structured data, of course, and they leverage a lot of the columnar execution that Vertica provides. 
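To make the Eon Mode on HDFS and external-table support described above concrete, here is a minimal sketch of querying both native ROS data and Parquet data from Python using the vertica_python client. The connection details, table names, columns, and HDFS path are hypothetical, and the external-table statement follows the syntax as commonly documented for Vertica's ORC/Parquet readers; it assumes HDFS connectivity has already been configured for the cluster.

```python
import vertica_python

# Hypothetical connection details for an Eon Mode cluster.
conn_info = {
    "host": "vertica.example.com",
    "port": 5433,
    "user": "dbadmin",
    "password": "secret",
    "database": "analytics",
}

# External table over Parquet files that already live in HDFS.
CREATE_EXTERNAL = """
CREATE EXTERNAL TABLE IF NOT EXISTS web_clicks_ext (
    user_id    INT,
    clicked_at TIMESTAMP,
    url        VARCHAR(2048)
)
AS COPY FROM 'hdfs:///data/clicks/*.parquet' PARQUET;
"""

# Join the external Parquet data with a table stored in Vertica's
# native (ROS) format in communal storage.
TOP_PLANS = """
SELECT a.plan, COUNT(*) AS clicks
FROM web_clicks_ext c
JOIN accounts a ON a.user_id = c.user_id
GROUP BY a.plan
ORDER BY clicks DESC;
"""

with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    cur.execute(CREATE_EXTERNAL)
    cur.execute(TOP_PLANS)
    for plan, clicks in cur.fetchall():
        print(plan, clicks)
```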
So you get complex data types in Vertica now, a lot more data, stronger performance. It goes great with the announcement that we made with the broader Eon Mode. Let's talk a little bit more about machine learning. We've actually been doing work in and around machine learning, with various regressions and a whole bunch of other algorithms, for several years. We saw the huge advantage that MPP offered, not just as a SQL engine or a database, but for ML as well. It didn't take us long to realize that there's a lot more to operationalizing machine learning than just those algorithms. It's data preparation, it's the model training. It's the scoring, the shaping, the evaluation. That is so much of what machine learning and, frankly, data science is about. You know, everybody always wants to jump to the sexy algorithms, but we handle those tasks very, very well, and that makes Vertica a terrific platform to do this. A lot of work in data science and machine learning is done in other tools. I had mentioned that there are just so many tools out there. We want people to be able to take advantage of all that. We never believed we were going to be the best algorithm company or come up with the best models for people to use. So with Vertica 10, we support PMML. We can now import and export PMML models. It's a huge step for us around operationalizing machine learning projects for our customers. It allows models to get built outside of Vertica, yet be imported in and then applied to that full scale of data with all the performance that you would expect from Vertica. We are also more tightly integrating with Python. As many of you know, we've been doing a lot of open source projects with the community, driven by many of our customers, like Uber. And now with Python we've integrated with TensorFlow, allowing data scientists to build models in their preferred language, to take advantage of TensorFlow, but again, to store and deploy those models at scale with Vertica. I think both these announcements are proof of our big bet number three, and really our commitment to supporting innovation throughout the community by operationalizing ML with that accuracy, performance and scale of Vertica for our customers. Again, there are a lot of steps when it comes to the workflow of machine learning. These are some of them that you can see on the slide, and it's definitely not linear either. We see this as a circle. And companies that do it well just continue to learn, they continue to rescore, they continue to redeploy, and they want to operationalize all that within a single platform that can take advantage of all those capabilities. And that is the platform, with a very robust ecosystem, that Vertica has always been committed to as an organization and will continue to be. This graphic, many of you have seen it evolve over the years. Frankly, if we put everything and everyone on here, it wouldn't fit on a slide. But it will absolutely continue to evolve and grow as we support our customers where they need the support most. So, again, being able to deploy everywhere, being able to take advantage of Vertica, not just as a business analyst or a business user, but as a data scientist or as an operational or BI person. We want Vertica to be leveraged and used by the broader organization. So I think it's fair to say, and I encourage everybody to learn more about Vertica 10, because I'm just highlighting some of the bigger aspects of it. But we talked about those three market trends. 
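For the PMML import and in-database scoring workflow just described, a rough sketch might look like the following. The function names IMPORT_MODELS and PREDICT_PMML reflect Vertica 10's machine learning documentation as I understand it, but the exact signatures, the model path, the model name, and the customers table with its columns are assumptions to verify against your own environment.

```python
import vertica_python

conn_info = {
    "host": "vertica.example.com",  # hypothetical cluster
    "port": 5433,
    "user": "dbadmin",
    "password": "secret",
    "database": "analytics",
}

with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()

    # Import a model trained outside Vertica (for example in scikit-learn
    # or Spark) and exported to PMML. The imported model is assumed to take
    # the name of the file, here 'churn_model'.
    cur.execute("""
        SELECT IMPORT_MODELS('/models/churn_model.pmml'
                             USING PARAMETERS category = 'PMML');
    """)

    # Score the full customer table in-database: no downsampling,
    # no moving data out to a separate scoring service.
    cur.execute("""
        SELECT customer_id,
               PREDICT_PMML(tenure, monthly_spend, support_tickets
                            USING PARAMETERS model_name = 'churn_model') AS churn_score
        FROM customers
        ORDER BY churn_score DESC
        LIMIT 20;
    """)
    for customer_id, score in cur.fetchall():
        print(customer_id, score)
```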
The need to unify the silos, the need for hybrid multiple cloud deployment options, the need to operationalize business critical machine learning projects. Vertica 10 has absolutely delivered on those. But again, we are not going to stop. It is our job not to, and this is how Team Vertica thrives. I always joke that the next release is the best release. And, of course, even after Vertica 10, that is also true, although Vertica 10 is pretty awesome. But, you know, from the first line of code, we've always been focused on performance and scale, right. And like any really strong data platform, the execution engine, the optimizer and the execution engine are the two core pieces of that. Beyond Vertica 10, some of the big things that we're already working on, next generation execution engine. We're already actually seeing incredible early performance from this. And this is just one example, of how important it is for an organization like Vertica to constantly go back and re-innovate. Every single release, we do the sit ups and crunches, our performance and scale. How do we improve? And there's so many parts of the core server, there's so many parts of our broader ecosystem. We are constantly looking at coverages of how we can go back to all the code lines that we have, and make them better in the current environment. And it's not an easy thing to do when you're doing that, and you're also expanding in the environment that we are expanding into to take advantage of the different deployments, which is a great segue to this slide. Because if you think about today, we're obviously already available with Eon Mode and Amazon, AWS and Pure and actually MinIO as well. As I talked about in Vertica 10 we're adding Google and HDFS. And coming next, obviously, Microsoft Azure, Alibaba cloud. So being able to expand into more of these environments is really important for the Vertica team and how we go forward. And it's not just running in these clouds, for us, we want it to be a SaaS like experience in all these clouds. We want you to be able to deploy Vertica in 15 minutes or less on these clouds. You can also consume Vertica, in a lot of different ways, on these clouds. As an example, in Amazon Vertica by the Hour. So for us, it's not just about running, it's about taking advantage of the ecosystems that all these cloud providers offer, and really optimizing the Vertica experience as part of them. Optimization, around automation, around self service capabilities, extending our management console, we now have products that like the Vertica Advisor Tool that our Customer Success Team has created to actually use our own smarts in Vertica. To take data from customers that give it to us and help them tune automatically their environment. You can imagine that we're taking that to the next level, in a lot of different endeavors that we're doing around how Vertica as a product can actually be smarter because we all know that simplicity is key. There just aren't enough people in the world who are good at managing data and taking it to the next level. And of course, other things that we all hear about, whether it's Kubernetes and containerization. You can imagine that that probably works very well with the Eon Mode and separating compute and storage. But innovation happens everywhere. We innovate around our community documentation. Many of you have taken advantage of the Vertica Academy. The numbers there are through the roof in terms of the number of people coming in and certifying on it. 
So there's a lot of things that are within the core products. There's a lot of activity and action beyond the core products that we're taking advantage of. And let's not forget why we're here, right? It's easy to talk about a platform, a data platform, it's easy to jump into all the functionality, the analytics, the flexibility, how we can offer it. But at the end of the day, somebody, a person, she's got to take advantage of this data, she's got to be able to take this data and use this information to make a critical business decision. And that doesn't happen unless we explore lots of different and frankly, new ways to get that predictive analytics UI and interface beyond just the standard BI tools in front of her at the right time. And so there's a lot of activity, I'll tease you with that going on in this organization right now about how we can do that and deliver that for our customers. We're in a great position to be able to see exactly how this data is consumed and used and start with this core platform that we have to go out. Look, I know, the plan wasn't to do this as a virtual BDC. But I really appreciate you tuning in. Really appreciate your support. I think if there's any silver lining to us, maybe not being able to do this in person, it's the fact that the reach has actually gone significantly higher than what we would have been able to do in person in Boston. We're certainly looking forward to doing a Big Data Conference in the future. But if I could leave you with anything, know this, since that first release for Vertica, and our very first customers, we have been very consistent. We respect all the innovation around us, whether it's open source or not. We understand the market trends. We embrace those new ideas and technologies and for us true north, and the most important thing is what does our customer need to do? What problem are they trying to solve? And how do we use the advantages that we have without disrupting our customers? But knowing that you depend on us to deliver that unified analytics strategy, it will deliver that performance of scale, not only today, but tomorrow and for years to come. We've added a lot of great features to Vertica. I think we've said no to a lot of things, frankly, that we just knew we wouldn't be the best company to deliver. When we say we're going to do things we do them. Vertica 10 is a perfect example of so many of those things that we from you, our customers have heard loud and clear, and we have delivered. I am incredibly proud of this team across the board. I think the culture of Vertica, a customer first culture, jumping in to help our customers win no matter what is also something that sets us massively apart. I hear horror stories about support experiences with other organizations. And people always seem to be amazed at Team Vertica's willingness to jump in or their aptitude for certain technical capabilities or understanding the business. And I think sometimes we take that for granted. But that is the team that we have as Team Vertica. We are incredibly excited about Vertica 10. I think you're going to love the Virtual Big Data Conference this year. I encourage you to tune in. Maybe one other benefit is I know some people were worried about not being able to see different sessions because they were going to overlap with each other well now, even if you can't do it live, you'll be able to do those sessions on demand. Please enjoy the Vertica Big Data Conference here in 2020. 
Please, you and your families and your co-workers, be safe during these times. I know we will get through it. And analytics is probably going to help with a lot of that, and we already know it is helping in many different ways. So believe in the data, believe in data's ability to change the world for the better. And thank you for your time. And with that, I am delighted to now introduce Micro Focus CEO Stephen Murdoch to the Vertica Big Data Virtual Conference. Thank you, Stephen. >> Stephen: Hi, everyone, my name is Stephen Murdoch. I have the pleasure and privilege of being the Chief Executive Officer here at Micro Focus. Please let me add my welcome to the Big Data Conference, and also my thanks for your support, as we've had to pivot to this being virtual rather than a physical conference. It's amazing how quickly we all reset to a new normal. I certainly didn't expect to be addressing you from my study. Vertica is an incredibly important part of the Micro Focus family. It is key to our goal of trying to enable and help customers become much more data driven across all of their IT operations. Vertica 10 is a huge step forward, we believe. It allows for multi-cloud innovation and genuinely hybrid deployments, lets the enterprise begin to leverage machine learning properly, and also allows the opportunity to unify currently siloed lakes of information. We operate in a very noisy, very competitive market, and there are people in that market who can do some of those things. The reason we are so excited about Vertica is that we genuinely believe we are the best at doing all of those things. And that's why we've announced publicly, and are executing internally, incremental investment into Vertica. That investment is targeted at accelerating the roadmaps that already exist, and getting that innovation into your hands faster. This idea of speed is key. It's not a question of if companies have to become data driven organizations, it's a question of when. So that speed now is really important. And that's why we believe that the Big Data Conference gives a great opportunity for you to accelerate your own plans. You will have the opportunity to talk to some of our best architects, some of the best development brains that we have. But more importantly, you'll also get to hear from some of our phenomenal Vertica customers. You'll hear from Uber, from the Trade Desk, from Philips, and from AT&T, as well as many, many others. And just hearing how those customers are using the power of Vertica to accelerate their own plans, I think, is the highlight. And I encourage you to use this opportunity to the fullest. Let me close by again saying thank you. We genuinely hope that you get as much from this virtual conference as you could have from a physical conference. And we look forward to your engagement, and we look forward to hearing your feedback. With that, thank you very much. >> Joy: Thank you so much, Stephen, for joining us for the Vertica Big Data Conference. Your support and enthusiasm for Vertica is so clear, and it makes a big difference. Now, I'm delighted to introduce Amy Fowler, the VP of Strategy and Solutions for FlashBlade at Pure Storage, who is one of our BDC Platinum Sponsors and one of our most valued partners. It was a proud moment for me when we announced Vertica in Eon Mode for Pure Storage FlashBlade and we became the first analytics data warehouse that separates compute from storage for on-premise data centers. Thank you so much, Amy, for joining us. Let's get started. 
>> Amy: Well, thank you, Joy, so much for having us. And thank you all for joining us today, virtually, as we may all be. So, as we just heard from Colin Mahony, there are some really interesting trends happening right now in the big data analytics market: the end of the Hadoop hype cycle, the new cloud reality, and even the opportunity to help the many data science and machine learning projects move from labs to production. So let's talk about these trends in the context of infrastructure, and in particular, look at why a modern storage platform is relevant as organizations take on the challenges and opportunities associated with these trends. The answer is that the Hadoop hype cycle left a lot of data in HDFS data lakes, or reservoirs, or swamps, depending upon the level of data hygiene, but without the ability to get the value that was promised from Hadoop as a platform rather than a distributed file store. And when we combine that data with the massive volume of data in cloud object storage, we find ourselves with a lot of data and a lot of silos, but without a way to unify that data and find value in it. Now, when you look at the infrastructure data lakes are traditionally built on, it is often direct attached storage, or DAS. The approach that Hadoop took when it entered the market was primarily bound by the limits of networking and storage technologies: one gig Ethernet and slower spinning disk. But today, those barriers do not exist. All-flash storage has fundamentally transformed how data is accessed, managed and leveraged. The need for local data storage for significant volumes of data has been largely mitigated by the performance increases afforded by all-flash. At the same time, organizations can achieve superior economies of scale with that segregation of compute and storage, because compute and storage don't always scale in lockstep. Would you want to add an engine to the train every time you add another boxcar? Probably not. But from a Pure Storage perspective, FlashBlade is uniquely architected to allow customers to achieve better resource utilization for compute and storage, while at the same time reducing the complexity that has arisen from the siloed nature of the original big data solutions. The second and equally important recent trend we see is something I'll call cloud reality. The public clouds made a lot of promises, and some of those promises were delivered. But cloud economics, especially usage-based and elastic scaling without the control that many companies need to manage the financial impact, is causing a lot of issues. In addition, the risk of vendor lock-in, from data egress charges to integrated software stacks that can't be moved or deployed on-premise, is causing a lot of organizations to back off an all-in cloud strategy and move toward hybrid deployments. Which is kind of funny in a way, because it wasn't that long ago that there was a lot of talk about no more data centers. For example, one large retailer, I won't name them, but I'll admit they are one of my favorites, told us several years ago that they were completely done with on-prem storage infrastructure, because they were going 100% to the cloud. But they just deployed FlashBlade for their data pipelines, because they need predictable performance at scale, and the all-cloud TCO just didn't add up. Now, that being said, while there are certainly challenges with the public cloud, it has also brought some things to the table that we see most organizations wanting. 
First of all, in a lot of cases applications have been built to leverage object storage platforms like S3. So they need that object protocol, but they may also need it to be fast. And fast object may have been an oxymoron only a few years ago, and this is an area of the market where Pure and FlashBlade have really taken a leadership position. Second, regardless of where the data is physically stored, organizations want the best elements of a cloud experience. And for us, that means two main things. Number one is simplicity and ease of use. If you need a bunch of storage experts to run the system, that should be considered a bug. The other big one is the consumption model: the ability to pay for what you need when you need it, and seamlessly grow your environment over time, totally nondestructively. This is actually pretty huge, and something that a lot of vendors try to solve for with finance programs. But no finance program can address the pain of a forklift upgrade when you need to move to next gen hardware. To scale nondestructively over long periods of time, five to 10 years plus, crucial architectural decisions need to be made at the outset. Plus, you need the ability to pay as you use it. And we offer something for FlashBlade called Pure as a Service, which delivers exactly that. The third cloud characteristic that many organizations want is the option for hybrid, even if that is just a DR site in the cloud. In our case, that means supporting replication to S3 at AWS. And the final trend, which to me represents the biggest opportunity for all of us, is the need to help the many data science and machine learning projects move from labs to production. This means bringing all the machine learning functions and model training to the data, rather than moving samples or segments of data to separate platforms. As we all know, machine learning needs a ton of data for accuracy, and there is just too much data to retrieve from the cloud for every training job. At the same time, predictive analytics without accuracy is not going to deliver the business advantage that everyone is seeking. You can kind of visualize data analytics, as it is traditionally deployed, as being on a continuum, with the thing we've been doing the longest, data warehousing, on one end, and AI on the other end. But the way this manifests in most environments is a series of silos that get built up, so data is duplicated across all kinds of bespoke analytics and AI environments and infrastructure. This creates an expensive and complex environment. Historically, there was no other way to do it, because some level of performance is always table stakes, and each of these parts of the data pipeline has a different workload profile. A single platform that could deliver the multidimensional performance this diverse set of applications requires simply didn't exist three years ago. And that's why the application vendors pointed you towards bespoke things like the DAS environments that we talked about earlier. The fact that better options exist today is why we're seeing them move towards supporting this disaggregation of compute and storage. And when it comes to a platform that is a better option, one with a modern architecture that can address the diverse performance requirements of this continuum and allow organizations to bring the model to the data instead of creating separate silos, that's exactly what FlashBlade is built for: small files, large files, high throughput, low latency, and scale to petabytes in a single namespace. 
And this, importantly in a single namespace, is what we're focused on delivering for our customers. At Pure, we talk about it in the context of the modern data experience, because at the end of the day, that's what it's really all about: the experience for your teams in your organization. And together, Pure Storage and Vertica have delivered that experience to a wide range of customers. From a SaaS analytics company, which uses Vertica on FlashBlade to authenticate the quality of digital media in real time, to a multinational car company, which uses Vertica on FlashBlade to make thousands of decisions per second for autonomous cars, to a healthcare organization, which uses Vertica on FlashBlade to enable healthcare providers to make real time decisions that impact lives. And I'm sure you're all looking forward to hearing from John Yovanovich from AT&T, to hear how he's been doing this with Vertica and FlashBlade as well. He's coming up soon. We have been really excited to build this partnership with Vertica. And we're proud to provide the only on-premise storage platform validated with Vertica Eon Mode, and to deliver this modern data experience to our customers together. Thank you all so much for joining us today. >> Joy: Amy, thank you so much for your time and your insights. Modern infrastructure is key to modern analytics, especially as organizations leverage next generation data center architectures and object storage for their on-premise data centers. Now, I'm delighted to introduce our last speaker in our Vertica Big Data Conference keynote, John Yovanovich, Director of IT for AT&T. Vertica is so proud to serve AT&T, and especially proud of the harmonious impact we are having in partnership with Pure Storage. John, welcome to the Virtual Vertica BDC. >> John: Thank you, Joy. It's a pleasure to be here, and I'm excited to go through this presentation today, and in a unique fashion, 'cause as I was thinking through how I wanted to present the partnership that we have formed together between Pure Storage, Vertica and AT&T, I wanted to emphasize how well we all work together and how these three components have really driven home my desire for a harmonious, to use your word, relationship. So, I'm going to move forward here. The theme of today's presentation is the Pure Vertica Symphony, live at AT&T. And if anybody is a Westworld fan, you can appreciate the sheet music on the right hand side. What I'm going to highlight here, in a musical fashion, is how we at AT&T leverage these technologies to save money, to deliver a more efficient platform, and to actually just make our customers happier overall. So as we look back, as early as just maybe a few years ago here at AT&T, I realized that we had many musicians helping the company. Or maybe you might want to call them data scientists, or data analysts. For the theme, we'll stay with musicians. None of them were singing or playing from the same hymn book or sheet music. And so what we had was many organizations chasing a similar dream, but not exactly the same dream. And the best way to describe that, and I think this might resonate with a lot of people in your organizations, is this: how many organizations are chasing a customer 360 view in your company? Well, I can tell you that I have at least four in my company, and I'm sure there are many that I don't know of. That is our problem, because what we see is a repetitive sourcing of data. We see a repetitive copying of data. 
And there's just so much money to be spent. This is where I asked Pure Storage and Vertica to help me solve that problem with their technologies. What I also noticed was that there was no coordination between these departments. In fact, if you look here, nobody really wants to play with finance. Sales, marketing and care, sure, they all copied each other's data. But they actually didn't communicate with each other as they were copying the data, so the data became replicated and out of sync. This is a challenge throughout, not just my company, but all companies across the world. And that is, the more we replicate the data, the more problems we have in chasing, or conquering, the goal of a single version of truth. In fact, I kid that at AT&T we have actually adopted the multiple-versions-of-truth theory, which is not where we want to be, but it is where we are. But we are conquering that with the synergies between Pure Storage and Vertica. This is what it leaves us with, and this is where we are challenged: each one of our siloed business units had their own storage, their own dedicated storage, and some of them had more money than others, so they bought more storage. Some of them anticipated storing more data than they really did. Others are running out of space but can't add any more, because their budgets haven't been replenished. So if you look at it from this side view here, we have a limited amount of compute, or fixed compute, dedicated to each one of these silos. And that's because of the desire to own your own. And the other part is that you are limited, or wasting space, depending on where you are in the organization. So the synergies aren't just about the data, but actually the compute and the storage. And I wanted to tackle that challenge as well. So I was tackling the data, I was tackling the storage, and I was tackling the compute, all at the same time. So my ask across the company was: can we all just please play together? And to do that, I knew that I wasn't going to tackle this by getting everybody in the same room and getting them to agree that we needed one account table, because they will argue about whose account table is the best account table. But I knew that if I brought the account tables together, they would soon see that they had so much redundancy that I could now start retiring data sources. I also knew that if I brought all the compute together, they would all be happy, but I didn't want them to tackle each other. And in fact, that was one of the things that all business units really enjoy: they enjoy the silo of having their own compute, and more or less being able to control their own destiny. Well, Vertica's subclustering allows just that. And this is exactly what I was hoping for, and I'm glad they came through. And finally, how did I solve the problem of the single account table? Well, you don't need dedicated storage when you can separate compute and storage, as Vertica in Eon Mode does, and we store the data on FlashBlades, which you see on the left and right hand side of our container, which I can describe in a moment. Okay, so what we have here is a container full of compute, with all the Vertica nodes sitting in the middle, and two loader subclusters, as we'll call them, sitting on the sides, which are dedicated to just putting data onto the FlashBlades, which sit on both ends of the container. 
Now today, I have two dedicated, or communal, dedicated might not be the right word, but two storage racks, one on the left and one on the right. And I treat them as separate storage racks. They could be one, but I created them separately for disaster recovery purposes, in case one rack were to go down. But that being said, there's no reason why I can't add a couple more of them here in the future, so I can just have, say, a five to 10 petabyte storage setup, and I'll have my DR in another container, 'cause the DR shouldn't be in the same container. Okay, but I'll DR outside of this container. So I got them all together, I leveraged subclustering, I leveraged the separation of compute and storage. I was able to convince many of my clients that they didn't need their own account table, that they were better off having one. I reduced latency, and I reduced our data quality issues, AKA ticketing. I was able to expand as the work required, and I was able to leverage elasticity within this cluster. As you can see, there are racks and racks of compute. We set up what we'll call the fixed capacity that each of the business units needed, and then I'm able to ramp up and release the compute that's necessary for each one of my clients based on their workloads throughout the day. And so while some of the compute, like the instruments you see to the right, has more or less dedicated itself, all the rest is free for anybody to use. So in essence, what I have is a concert hall with a lot of seats available. If I want to run a 10-chair symphony or an 80-chair symphony, I'm able to do that. And all the while, I can also do the same with my loader nodes. I can expand my loader nodes to actually have their own symphony, all to themselves, and not compete with any of the workloads of the other subclusters. What does that change for our organization? Well, it really changes the way our database administrators actually do their jobs. This has been a big transformation for them. They have actually become data conductors. Maybe you might even call them composers, which is interesting, because what I've asked them to do is morph into doing less technology and more workload analysis. And in doing so, we're able to write auto-detect scripts that watch the queues and watch the workloads, so that we can help ramp up and trim down the cluster and subclusters as necessary. It has been an exciting transformation for our DBAs, who I now need to classify as something maybe like DCAs. I don't know, I have to work with HR on that. But I think it's an exciting future for their careers. And if we bring it all together, our clusters start looking like this, where everything is moving harmoniously, we have lots of seats open for extra musicians, and we are able to emulate a cloud experience on-prem. And so, I want you to sit back and enjoy the Pure Vertica Symphony live at AT&T. (soft music) >> Joy: Thank you so much, John, for an informative and very creative look at the benefits that AT&T is getting from its Pure Vertica symphony. I do really like the idea of engaging HR to change the title to Data Conductor. That's fantastic. I've always believed that music brings people together, and now it's clear that analytics at AT&T is part of that musical advantage. So, now it's time for a short break, and we'll be back for our breakout sessions, beginning at 12 pm Eastern Daylight Time. 
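John's auto-detect scripts that watch the queues and resize subclusters are internal to AT&T, so here is only a rough, hypothetical sketch of the monitoring half of that idea in Python, using the vertica_python client. The v_monitor.resource_queues table, the pool and subcluster names, and the thresholds are assumptions, and the actual scaling action is left as a placeholder, since adding or removing subcluster nodes is typically done through Management Console, admintools, or cloud automation rather than a single SQL call.

```python
import time
import vertica_python

CONN_INFO = {
    "host": "vertica.example.com",  # hypothetical cluster
    "port": 5433,
    "user": "dbadmin",
    "password": "secret",
    "database": "att_analytics",
}

QUEUE_THRESHOLD = 5        # scale out once this many requests are queued
CHECK_INTERVAL_SECS = 60   # how often to poll

def queued_requests(cursor, pool_name):
    # Count requests waiting on a resource pool. Table and column names are
    # as recalled from the Vertica system catalog; verify before use.
    cursor.execute(
        "SELECT COUNT(*) FROM v_monitor.resource_queues "
        f"WHERE pool_name = '{pool_name}'"
    )
    return cursor.fetchone()[0]

def scale_subcluster(name, direction):
    # Placeholder: hook this up to your provisioning tooling
    # (Management Console API, admintools, Terraform, etc.).
    print(f"would scale subcluster {name} {direction}")

def main():
    with vertica_python.connect(**CONN_INFO) as conn:
        cur = conn.cursor()
        while True:
            waiting = queued_requests(cur, "marketing_pool")
            if waiting > QUEUE_THRESHOLD:
                scale_subcluster("marketing_subcluster", "out")
            elif waiting == 0:
                scale_subcluster("marketing_subcluster", "in")
            time.sleep(CHECK_INTERVAL_SECS)

if __name__ == "__main__":
    main()
```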
We have some really exciting sessions planned later today. And then again, as you can see on Wednesday. Now because all of you are already logged in and listening to this keynote, you already know the steps to continue to participate in the sessions that are listed here and on the previous slide. In addition, everyone received an email yesterday, today, and you'll get another one tomorrow, outlining the simple steps to register, login and choose your session. If you have any questions, check out the emails or go to www.vertica.com/bdc2020 for the logistics information. There are a lot of choices and that's always a good thing. Don't worry if you want to attend one or more or can't listen to these live sessions due to your timezone. All the sessions, including the Q&A sections will be available on demand and everyone will have access to the recordings as well as even more pre-recorded sessions that we'll post to the BDC website. Now I do want to leave you with two other important sites. First, our Vertica Academy. Vertica Academy is available to everyone. And there's a variety of very technical, self-paced, on-demand training, virtual instructor-led workshops, and Vertica Essentials Certification. And it's all free. Because we believe that Vertica expertise, helps everyone accelerate their Vertica projects and the advantage that those projects deliver. Now, if you have questions or want to engage with our Vertica engineering team now, we're waiting for you on the Vertica forum. We'll answer any questions or discuss any ideas that you might have. Thank you again for joining the Vertica Big Data Conference Keynote Session. Enjoy the rest of the BDC because there's a lot more to come

Published Date : Mar 30 2020


Colin Mahony, Vertica | MIT CDOIQ 2019


 

>> From Cambridge, Massachusetts, it's theCUBE, covering MIT Chief Data Officer and Information Quality Symposium 2019, brought to you by SiliconANGLE Media. >> Welcome back to Cambridge, Massachusetts everybody, you're watching The Cube, the leader in tech coverage. My name is Dave Vellante here with my cohost Paul Gillin. This is day one of our two day coverage of the MIT CDOIQ conferences. CDO, Chief Data Officer, IQ, information quality. Colin Mahoney is here, he's a good friend and long time CUBE alum. I haven't seen you in awhile, >> I know >> But thank you so much for taking some time, you're like a special guest here >> Thank you, yeah it's great to be here, thank you. >> Yeah, so, this is not, you know, something that you would normally attend. I caught up with you, invited you in. This conference has started as, like back office governance, information quality, kind of wonky stuff, hidden. And then when the big data meme took off, kind of around the time we met. The Chief Data Officer role emerged, the whole Hadoop thing exploded, and then this conference kind of got bigger and bigger and bigger. Still intimate, but very high level, very senior. It's kind of come full circle as we've been saying, you know, information quality still matters. You have been in this data business forever, so I wanted to invite you in just to get your perspectives, we'll talk about what's new with what's going on in your company, but let's go back a little bit. When we first met and even before, you saw it coming, you kind of invested your whole career into data. So, take us back 10 years, I mean it was so different, remember it was Batch, it was Hadoop, but it was cool. There was a lot of cool >> It's still cool. (laughs) projects going on, and it's still cool. But, take a look back. >> Yeah, so it's changed a lot, look, I got into it a while ago, I've always loved data, I had no idea, the explosion and the three V's of data that we've seen over the last decade. But, data's really important, and it's just going to get more and more important. But as I look back I think what's really changed, and even if you just go back a decade I mean, there's an insatiable appetite for data. And that is not slowing down, it hasn't slowed down at all, and I think everybody wants that perfect solution that they can ask any question and get an immediate answers to. We went through the Hadoop boom, I'd argue that we're going through the Hadoop bust, but what people actually want is still the same. You know, they want real answers, accurate answers, they want them quickly, and they want it against all their information and all their data. And I think that Hadoop evolved a lot as well, you know, it started as one thing 10 years ago, with MapReduce and I think in the end what it's really been about is disrupting the storage market. But if you really look at what's disrupting storage right now, public clouds, S3, right? That's the new data league. So there's always a lot of hype cycles, everybody talks about you know, now it's Cloud, everything, for maybe the last 10 years it was a lot of Hadoop, but at the end of the day I think what people want to do with data is still very much the same. And a lot of companies are still struggling with it, hence the role for Chief Data Officers to really figure out how do I monetize data on the one hand and how to I protect that asset on the other hand. >> Well so, and the cool this is, so this conference is not a tech conference, really. 
And we love tech, we love talking about this, this is why I love having you on. We kind of have a little Vertica thread that I've created here, so Colin essentially, is the current CEO of Vertica, I know that's not your title, you're GM and Senior Vice President, but you're running Vertica. So, Michael Stonebreaker's coming on tomorrow, >> Yeah, excellent. >> Chris Lynch is coming on tomorrow, >> Oh, great, yeah. >> we've got Andy Palmer >> Awesome, yeah. >> coming up as well. >> Pretty cool. (laughs) >> So we have this connection, why is that important? It's because, you know, Vertica is a very cool company and is all about data, and it was all about disrupting, sort of the traditional relational database. It's kind of doing more with data, and if you go back to the roots of Vertica, it was like how do you do things faster? How do you really take advantage of data to really drive new business? And that's kind of what it's all about. And the tech behind it is really cool, we did your conference for many, many years. >> It's coming back by the way. >> Is it? >> Yeah, this March, so March 30th. >> Oh, wow, mark that down. >> At Boston, at the new Encore Hotel. >> Well we better have theCUBE there, bro. (laughs) >> Yeah, that's great. And yeah, you've done that conference >> Yep. >> haven't you before? So very cool customers, kind of leading edge, so I want to get to some of that, but let's talk the disruption for a minute. So you guys started with the whole architecture, MPP and so forth. And you talked about Cloud, Cloud really disrupted Hadoop. What are some of the other technology disruptions that you're seeing in the market space? >> I think, I mean, you know, it's hard not to talk about AI machine learning, and what one means versus the other, who knows right? But I think one thing that is definitely happening is people are leveraging the volumes of data and they're trying to use all the processing power and storage power that we have to do things that humans either are too expensive to do or simply can't do at the same speed and scale. And so, I think we're going through a renaissance where a lot more is being automated, certainly on the Vertica roadmap, and our path has always been initially to get the data in and then we want the platform to do a lot more for our customers, lots more analytics, lots more machine-learning in the platform. So that's definitely been a lot of the buzz around, but what's really funny is when you talk to a lot of customers they're still struggling with just some basic stuff. Forget about the predictive thing, first you've got to get to what happened in the past. Let's give accurate reporting on what's actually happening. The other big thing I think as a disruption is, I think IOT, for all the hype that it's getting it's very real. And every device is kicking off lots of information, the feedback loop of AB testing or quality testing for predictive maintenance, it's happening almost instantly. And so you're getting massive amounts of new data coming in, it's all this machine sensor type data, you got to figure out what it means really quick, and then you actually have to do something and act on it within seconds. And that's a whole new area for so many people. 
It's not their traditional enterprise data network warehouse and you know, back to you comment on Stonebreaker, he got a lot of this right from the beginning, you know, and I think he looked at the architectures, he took a lot of the best in class designs, we didn't necessarily invent everything, but we put a lot of that together. And then I think the other you've got to do is constantly re-invent your platform. We came out with our Eon Mode to run cloud native, we just got rated the best cloud data warehouse from a net promoter score rating perspective, so, but we got to keep going you know, we got to keep re-inventing ourselves, but leverage everything that we've done in the past as well. >> So one of the things that you said, which is kind of relevant for here, Paul, is you're still seeing a real data quality issue that customers are wrestling with, and that's a big theme here, isn't it? >> Absolutely, and the, what goes around comes around, as Dave said earlier, we're still talking about information quality 13 years after this conference began. Have the tools to improve quality improved all that much? >> I think the tools have improved, I think that's another area where machine learning, if you look at Tamr, and I know you're going to have Andy here tomorrow, they're leveraging a lot of the augmented things you can do with the processing to make it better. But I think one thing that makes the problem worse now, is it's gotten really easy to pour data in. It's gotten really easy to store data without having to have the right structure, the right quality, you know, 10 years ago, 20 years ago, everything was perfect before it got into the platform. Right, everything was, there was quality, everything was there. What's been happening over the last decade is you're pumping data into these systems, nobody knows if it's redundant data, nobody knows if the quality's any good, and the amount of data is massive. >> And it's cheap to store >> Very cheap to store. >> So people keep pumping it in. >> But I think that creates a lot of issues when it comes to data quality. So, I do think the technology's gotten better, I think there's a lot of companies that are doing a great job with it, but I think the challenge has definitely upped. >> So, go ahead. >> I'm sorry. You mentioned earlier that we're seeing the death of Hadoop, but I'd like you to elaborate on that becuase (Dave laughs) Hadoop actually came up this morning in the keynote, it's part of what GlaxoSmithKline did. Came up in a conversation I had with the CEO of Experian last week, I mean, it's still out there, why do you think it's in decline? >> I think, I mean first of all if you look at the Hadoop vendors that are out there, they've all been struggling. I mean some of them are shutting down, two of them have merged and they've got killed lately. I think there are some very successful implementations of Hadoop. I think Hadoop as a storage environment is wonderful, I think you can process a lot of data on Hadoop, but the problem with Hadoop is it became the panacea that was going to solve all things data. It was going to be the database, it was going to be the data warehouse, it was going to do everything. >> That's usually the kiss of death, isn't it? >> It's the kiss of death. And it, you know, the killer app on Hadoop, ironically, became SQL. I mean, SQL's the killer app on Hadoop. If you want to SQL engine, you don't need Hadoop. 
But what we did was, in the beginning Mike sort of made fun of it, Stonebreaker, and joked a lot about he's heard of MapReduce, it's called Group By, (Dave laughs) and that created a lot of tension between the early Vertica and Hadoop. I think, in the end, we embraced it. We sit next to Hadoop, we sit on top of Hadoop, we sit behind it, we sit in front of it, it's there. But I think what the reality check of the industry has been, certainly by the business folks in these companies is it has not fulfilled all the promises, it has not fulfilled a fraction on the promises that they bet on, and so they need to figure those things out. So I don't think it's going to go away completely, but I think its best success has been disrupting the storage market, and I think there's some much larger disruptions of technologies that frankly are better than HTFS to do that. >> And the Cloud was a gamechanger >> And a lot of them are in the cloud. >> Which is ironic, 'cause you know, cloud era, (Colin laughs) they didn't really have a cloud strategy, neither did Hortonworks, neither did MapR and, it just so happened Amazon had one, Google had one, and Microsoft has one, so, it's just convenient to-- >> Well, how is that affecting your business? We've seen this massive migration to the cloud (mumbles) >> It's actually been great for us, so one of the things about Vertica is we run everywhere, and we made a decision a while ago, we had our own data warehouse as a service offering. It might have been ahead of its time, never really took off, what we did instead is we pivoted and we say "you know what? "We're going to invest in that experience "so it's a SaaS-like experience, "but we're going to let our customers "have full control over the cloud. "And if they want to go to Amazon they can, "if they want to go to Google they can, "if they want to go to Azure they can." And we really invested in that and that experience. We're up on the Amazon marketplace, we have lots of customers running up on Amazon Cloud as well as Google and Azure now, and then about two years ago we went down and did this endeavor to completely re-architect our product so that we could separate compute and storage so that our customers could actually take advantage of the cloud economics as well. That's been huge for us, >> So you scale independent-- >> Scale independently, cloud native, add compute, take away compute, and for our existing customers, they're loving the hybrid aspect, they love that they can still run on Premise, they love that they can run up on a public cloud, they love that they can run in both places. So we will continue to invest a lot in that. And it is really, really important, and frankly, I think cloud has helped Vertica a lot, because being able to provision hardware quickly, being able to tie in to these public clouds, into our customers' accounts, give them control, has been great and we're going to continue on that path. >> Because Vertica's an ISV, I mean you're a software company. >> We're a software company. >> I know you were a part of HP for a while, and HP wanted to mash that in and run it on it's hardware, but software runs great in the cloud. And then to you it's another hardware platform. >> It's another hardware platform, exactly. >> So give us the update on Micro Focus, Micro Focus acquired Vertica as part of the HPE software business, how many years ago now? Two years ago? >> Less than two years ago. >> Okay, so how's that going, >> It's going great. >> Give us the update there. 
>> Yeah, so first of all it is great, HPE and HP were wonderful to Vertica, but it's great being part of a software company. Micro Focus is a software company. And more than just a software company it's a company that has a lot of experience bridging the old and the new. Leveraging all of the investments that you've made but also thinking about cloud and all these other things that are coming down the pike. I think for Vertica it's been really great because, as you've seen Vertica has gotten its identity back again. And that's something that Micro Focus is very good at. You can look at what Micro Focus did with SUSE, the Linux company, which actually you know, now just recently spun out of Micro Focus but, letting organizations like Vertica that have this culture, have this product, have this passion, really focus on our market and our customers and doing the right thing by them has been just really great for us and operating as a software company. The other nice thing is that we do integrate with a lot of other products, some of which came from the HPE side, some of which came from Micro Focus, security products is an example. The other really nice thing is we've been doing this insource thing at Micro Focus where we open up our source code to some of the other teams in Micro Focus and they've been contributing now in amazing ways to the product. In ways that we would just never be able to scale, but with 4,000 engineers strong in Micro Focus, we've got a much larger development organization that can actually contribute to the things that Vertica needs to do. And as we go into the cloud and as we do a lot more operational aspects, the experience that these teams have has been incredible, and security's another great example there. So overall it's been great, we've had four different owners of Vertica, our job is to continue what we do on the innovation side in the culture, but so far Micro Focus has been terrific. >> Well, I'd like to say, you're kind of getting that mojo back, because you guys as an independent company were doing your own thing, and then you did for a while inside of HP, >> We did. >> And that obviously changed, 'cause they wanted more integration, but, and Micro Focus, they know what they're doing, they know how to do acquisitions, they've been very successful. >> It's a very well run company, operationally. >> The SUSE piece was really interesting, spinning that out, because now RHEL is part of IBM, so now you've got SUSE as the lone independent. >> Yeah. >> Yeah. >> But I want to ask you, go back to a technology question, is NoSQL the next Hadoop? Are these databases, it seems to be that the hot fad now is NoSQL, it can do anything. Is the promise overblown? >> I think, I mean NoSQL has been out almost as long as Hadoop, and I, we always say not only SQL, right? Mike's said this from day one, best tool for the job. Nothing is going to do every job well, so I think that there are, whether it's key value stores or other types of NoSQL engines, document DB's, now you have some of these DB's that are running on different chips, >> Graph, yeah. >> there's always, yeah, graph DBs, there's always going to be specialty things. I think one of the things about our analytic platform is we can do, time series is a great example. Vertica's a great time series database. We can compete with specialized time series databases. But we also offer a lot of, the other things that you can do with Vertica that you wouldn't be able to do on a database like that. 
So, I always think there's going to be specialty products, I also think some of these can do a lot more workloads than you might think, but I don't see as much around the NoSQL movement as say I did a few years ago. >> But so, and you mentioned the cloud before as kind of, your position on it I think is a tailwind, not to put words in your mouth, >> Yeah, yeah, it's a great tailwind. >> You're in the Amazon marketplace, I mean they have products that are competitive, right? >> They do, they do. >> But, so how are you differentiating there? >> I think the way we differentiate, whether it's Redshift from Amazon, or BigQuery from Google, or even what Azure DB does is, first of all, Vertica, I think from, feature functionality and performance standpoint is ahead. Number one, I think the second thing, and we hear this from a lot of customers, especially at the C-level is they don't want to be locked into these full stacks of the clouds. Having the ability to take a product and run it across multiple clouds is a big thing, because the stack lock-in now, the full stack lock-in of these clouds is scary. It's really easy to develop in their ecosystems but you get very locked into them, and I think a lot of people are concerned about that. So that works really well for Vertica, but I think at the end of the day it's just, it's the robustness of the product, we continue to innovate, when you look at separating compute and storage, believe it or not, a lot of these cloud-native databases don't do that. And so we can actually leverage a lot of the cloud hardware better than the native cloud databases do themselves. So, like I said, we have to keep going, those guys aren't going to stop, and we actually have great relationships with those companies, we work really well with the clouds, they seem to care just as much about their cloud ecosystem as their own database products, and so I think that's going to continue as well. >> Well, Colin, congratulations on all the success >> Yeah, thank you, yeah. >> It's awesome to see you again and really appreciate you coming to >> Oh thank you, it's great, I appreciate the invite, >> MIT. >> it's great to be here. >> All right, keep it right there everybody, Paul and I will be back with our next guest from MIT, you're watching theCUBE. (electronic jingle)
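Mahony's point about separating compute and storage can be made concrete with a toy sketch: if query workers are stateless and all durable data lives in shared object storage, compute can be added or removed without moving any data. This is an illustrative sketch only, not Vertica's architecture; the in-memory "object store" and worker pool below are stand-ins for a real object store and cluster.

```python
# Illustrative sketch of compute/storage separation: stateless workers scan a
# shared (simulated) object store, so compute can scale without data movement.
from concurrent.futures import ThreadPoolExecutor

# Stand-in for communal object storage: immutable "files" of row values.
OBJECT_STORE = {
    "sales/part-000": [120, 340, 95],
    "sales/part-001": [80, 410],
    "sales/part-002": [55, 300, 225, 60],
}

def scan_partition(key):
    # A stateless worker task: read one object and compute a partial aggregate.
    return sum(OBJECT_STORE[key])

def run_query(num_workers):
    # More workers means faster scans over the same storage; storage is untouched.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = pool.map(scan_partition, OBJECT_STORE)
    return sum(partials)

print(run_query(num_workers=2))   # 1685
print(run_query(num_workers=8))   # same result, only the compute pool changed
```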

Published Date : Jul 31 2019



Lynn Lucas, Cohesity | Cisco Live EU 2019


 

>> Live from Barcelona, Spain. It's theCUBE covering Cisco Live Europe, brought to you by Cisco and its ecosystem partners.
>> Welcome back to Barcelona, everybody. You're watching theCUBE, the leader in live tech coverage. This is the first day of three days of coverage of Cisco Live Europe. Lynn Lucas is here. She's the chief marketing officer for Cohesity. Lynn, great to see you again. Thanks for coming on.
>> Great to see you here in Europe.
>> We were just saying it's the first time that we've done this on the continent. So another first?
>> Yeah, another first. I've been so pleased to be with you guys in the U.S. at multiple shows, and now here we are in Barcelona.
>> It's a great venue. We've actually done a number of shows here. Then again, it's a pleasure having you on. Let's get right to it. What's going on with you guys and Cisco? You've got some news, let's talk about it.
>> Absolutely. As you know, we don't stop innovating, there's continuous innovation at Cohesity, and a number of new things. So last week we announced a new Cisco Validated Design with HyperFlex and Cohesity, integrating snapshots for backup and, of course, instant recovery of that critical data center infrastructure. And we're calling it hyper squared. So you get full hyperconvergence for your primary and, of course, your backup and other secondary applications.
>> And those guys just want to talk about HyperFlex Anywhere still, so it's like infinite HyperFlex,
>> Hyper squared,
>> So hyper squared, love it. So how does that work? Obviously you want to be the data protection provider for multi-cloud. That's a huge opportunity. So how do you do that? You'll plug into whatever framework the customer wants, and presumably a lot of customers want the Cisco framework. Is that right?
>> Oh, absolutely, you hit the nail on the head. I mean, Cisco is obviously one of the most respected leaders in the world; tens of thousands of customers globally depend on them. I'm a Cisco alum and love being back here at the old stomping grounds, and Cisco's been an investor in Cohesity now since our Series C. So they really saw the promise and the benefit of what Cohesity offers with hyperconverged solutions for modern backup and recovery, and, to your point, to the cloud. You know, Cisco's talking a lot about multi-cloud here, and Cohesity, with our native cloud integration, helps customers protect those backups, or those applications, on HyperFlex and then instantly move them to a cloud of choice. And then, as you've mentioned, Cisco has so many fantastic relationships that they're a very strong go-to-market partner with us, and when customers want to buy a solution, they can get the whole solution from Cisco, including Cohesity.
>> Lynn, we're glad we have you on, because connecting the dots between something like hyperconverged, which we've been talking about for a number of years now, and how that fits into multi-cloud is a little clunky to some. It sometimes sounds like, I've got my data center, or am I just doing backup to the cloud? Because what we know is, as Cisco says, customers' data is kind of decentered. It's no longer in the data center, it's all over the place. Companies like Cohesity can give you that centralized data protection no matter where your environment is. Walk us through what you're hearing from your customers.
How do they look at their data center versus the multi-cloud environment and data protection?
>> Yeah, so I think customers are now understanding that it's not either/or, right? There was a time when people thought, wow, I'm going to move everything to the cloud, and I really think there's a maturing of the understanding of what's going to work well for me in this cloud-first world: what do I want to put there, and what am I going to keep on premises? So that's one of the things that Cohesity innovated in our core technology, a distributed, web-scale spanning file system, which spans the data center and the cloud world seamlessly. And what we're seeing is customers really using the cloud for archiving, getting off of tape, because then they get that search capability very easily when they need it, tiering, and then, most importantly, disaster recovery. You know, in the event of something man-made or natural, many, many organizations are moving to the cloud for their second site, and with Cohesity it's very easy to make that transfer happen in a very seamless way with our capability set. So I think what we're seeing is this real maturing of how customers look at it as a holistic environment. And so Cisco is calling it data-centered, but we call this, you know, mass data fragmentation, and with our spanning file system we're able to really consolidate that now.
>> Yeah, another thing that needs that kind of holistic view is security. I know it's something that's in your product; there was a ransomware announcement that you made last week. Tell us how security fits into this world.
>> Yeah, well, you know, I think we all hate to say it, but you know that old phrase, the new normal: unfortunately ransomware and malware have become the new normal for organizations of all sizes. You know, here in Europe we had the situation with the NHS in the UK last year, and it's happening everywhere. So you know, one element these attackers are going after is looking at how to disable backups, and so it's really important that, as part of a holistic security strategy, organizations take a look at that attack vector. So what Cohesity has introduced is really unique. It's three steps: detect, prevent and then recover. So detect, in terms of capabilities to see if there are nefarious changes happening to the file system; then prevent, with Helios automatically detecting and with our smart assistant providing that notification; and then, if need be, recover with our instant mass restore capability, going back to any point in time with no performance issue. This is not taking time for rehydration; the spanning file system does this instantly and allows an organization to basically say, sorry, not today, attackers, we don't need to pay you, because we can instantly restore back to a safe point in time.
>> So let's unpack those a little bit, if we could, the detect piece. I presume there's an analytics component to that. You're observing the behavior of the backup corpus, is that right? Which is a logical place, because it's got all the corporate data in there.
>> That's correct. So last year we introduced Helios, which is our global SaaS-based management system, and it has machine learning capability in it.
And that's providing machine-learning-based monitoring to see what kinds of anomalies may be happening, which is then proactively alerted to the team.
>> And then the recovery piece, as well. Like you said, it's got to be fast, you've got to have high-performance data movement, and that's fundamental to your file system. Is that what I'm hearing?
>> In the architecture, that's correct. That's one of the differences of our modern backup solution versus some of the non-hyperconverged architectures: the distributed web-scale file system. Our CEO, Mohit Aron, who was formerly at Google and helped develop their file system, built it, and it has what's called the instant ability to go back to any point in time and recover not just one VM; actually, at VMworld a couple of years ago we demonstrated thousands of VMs at a time, and the reason for that is this web-scale file system, which is really unique to Cohesity. And that's what allows an IT organization to not be held hostage, because they do not have to potentially spend not just hours but even days with the old legacy systems trying to rehydrate, you know, these backups if they have to go back potentially many months in time. Because you don't know that that ransomware may have been introduced, not, say, yesterday, but maybe several months ago, and that's one of the key advantages of this instant mass restore.
>> I mean, this is super important, right? Because we're talking about very granular levels of being able to dial up and dial down. You can tune it by application, for high-value applications, and you have much greater granularity for some of the applications that are maybe not as important. So flexibility is key there. How about customers, any new customers that you can talk about?
>> Absolutely. So one of the ones, since we're here, that's just gone live: Cisco, along with Cohesity, has been working with one of the largest global manufacturers of semiconductors and other electronic equipment, Tokyo Electron, based in Tokyo but also here in the U.K. and on the continent. And they had one of those older backup solutions and were challenged with the time it was taking them to back up, and with restores not being predictable. So they've gone with Cohesity running on Cisco UCS, because we're a software-defined platform; we offer our software on our customers' choice of certified solutions, including Cisco UCS. And so they've started with backup, but they're now moving very quickly into archiving to the cloud, helping reduce their costs and get off of tape, and ultimately to disaster recovery. So we're super excited that together with Cisco we could help this customer modernize their data center and, you know, accelerate their hybrid cloud strategy at the same time.
>> Awesome. And then you guys are also protecting the Cisco Live network here. Tell us about that.
>> Yes. Oh, you know, Cisco builds an amazing network here. I mean, you've seen the operations center, a huge team of people. But as we all know, things could potentially go wrong, and so we are protecting the critical services that Cisco's providing to all of the Cisco Live attendees here. So should something happen, which I'm sure won't, Cohesity will be used to instantly recover and bring back up critical services like DNA and other areas that they're depending on to serve all of the thousands of showgoers here.
>> So, super hot space. We talked about this at VMworld, actually, the last couple of years. Just how much activity and interest there is, and the whole parlance is changing. It used to be just backup, when the world was tape. Now you're talking about data protection and data management, which could mean a lot of things to a lot of people; to storage folks it's, you know, pretty specific. But you're seeing a massive evolution of the space. Cloud clearly is the underpinning, or the tailwind, and it requires you guys to respond as an industry, and Cohesity specifically as a company. So I wanted to talk about some of those major trends and how you guys are responding and leading.
>> Yeah, I think, you know, folks have been a little bit surprised, like, wait a minute, what's this kind of sleepy industry? Why is it getting all this funding? I mean, our own Series D funding, in the middle of last year, was two hundred fifty million dollars, SoftBank along with Sequoia, of course. But really, the trend, as is being talked about at Cisco Live, is that data, I don't want to say it's the new oil, but it's the water of the world, right? I mean, it's absolutely crucial to any business these days; other than your talent, it's your most important business asset.
>> And the pressure on the board and the CEO and the CIO in turn to be agile, to do more with that data, to know what you have, because here we are in Europe and GDPR and increasing regulations are super important. And so, you know, this has really brought forward the need to create holistic ways to organize and manage and have visibility into all of that data, and it's massively fragmented. We put out that research last year, on mass data fragmentation, and most of that data has been kind of under the water line in most people's minds. You know, you think about your primary applications and data, that's really only twenty percent, and the other eighty percent, in test/dev, analytics and backup, has been pretty fragmented and siloed, and it hadn't yet had that vision of how we could consolidate that and move it into a modern space until folks like Mohit Aron, you know, founded Cohesity and applied those same hyperconverged techniques that he did at Nutanix. So I think that this investment just further validates the fact that data is the most important business asset, and people are really in need of new solutions to manage it, protect it, and then ultimately do more with it and gain insights out of it.
>> You know, just a couple of comments on that. One is, you know, data, we always joke that data's the new oil. It's even more valuable, because you can use data in multiple places; you can only put oil in your car once. And so companies are beginning to realize how valuable it is, trying to understand that value and how to protect it. And the GDPR piece is interesting. The fines went into effect in Europe last May, but it's become a template, a framework, globally. People in the U.S. said, all right, we've got to prepare for GDPR, and then local jurisdictions announced things, and well, that's a decent starting point. And so it's not just confined to Europe; it's really on everybody's mind.
>> It is, and you brought up the cloud before. And you know, the cloud is a new way for people to be agile, and they're getting a lot of value out of it, but it also continues to fragment their data and the visibility. In talking to a CIO of a large Fortune 100 organization, he actually has less visibility in many ways in the cloud because of the ease of proliferation of test/dev. And that is creating more, you know, stress, I would say, in the system, and a need for solutions that both provide enhanced agility, move data to the cloud, easily move it out when you need to, but also, with regulation, be able to identify and delete, as you know, with GDPR, if needed, the information that your customer may ask you to remove from your systems.
>> Yeah, well, I love this conversation, and we love following Cohesity, because you guys are up-leveling the entire game. I've been following the data protection space for decades now, and the problem with data protection is it has always been a bolt-on, and companies like Cohesity, both with the funding and your vision, are really forcing the industry to rethink data protection, not as a bolt-on but as a fundamental component of digital strategies and data strategy. So it's fun watching you guys. Congratulations on all the growth. I know you've got more to go. So thanks so much for coming on theCUBE, and always a pleasure to see you.
>> Always a pleasure to be here with you guys. Thanks very much.
>> You're very welcome. All right, keep it right there, everybody. Stu Miniman and Dave Vellante from Cisco Live, Barcelona. You're watching theCUBE.

Published Date : Jan 30 2019



Miha Kralj, Accenture | Microsoft Ignite 2018


 

(rhythmic music) >> Live from Orlando, Florida. It's theCUBE covering Microsoft Ignite. Brought to you by Cohesity and theCUBE's ecosystem partners. >> Welcome back everyone to theCUBE's live coverage of Microsoft Ignite here at the Orange County Convention Center. I'm your host, Rebecca Knight along with my cohost Stu Miniman. We're joined by Miha Kralj. He is the cloud native architecture lead at Accenture and marathon runner I should say, too. >> That's true, yes. >> Thank you so much for coming on theCUBE, Miha. >> You're more than welcome. >> So I want to start the conversation by talking about the difference between cloud immigrant and cloud native. There's a big distinction. >> Yes, there is. Cloud became a new execution platform for a whole bunch of businesses and what we are going to see now is that lots of companies are using cloud in two different ways or two different forms. Even if you listen to analysts, they are talking about mode one, mode two so when we talk about cloud native we are mostly talking about both technologies and processes but also team organization that is very much inspired by cloud, that went through all of the transformations that we saw, for example, in companies like Netflix, like Uber, like very much how Amazon is organized internally, how Microsoft is organized internally so we are talking about very new approach. How to architect applications, how to actually have a process to develop publications and push them over into production, how to actually run the complete automation, a set of tools, but of course it's completely new enabling platform on top of that or underneath that that allows us to run those cloud native style of applications. >> An oversimplification I've heard is those born in the cloud companies will start out cloud native. The challenge you have for those that, the cloud immigrants, if you will, is there are so many different things that they need to change. Not just the way they architect things, the way that they run things. It's a real challenge and it's companies like yours I think, that help them do that immigration process, right? >> You are actually bringing up the really good point 'cause one thing is if you start from nothing. If you are in a green field and you build, you can say I'm going to take the best automation, I will buy the best people and I'm going to go full-on cloud native. That totally works. You can also be in the old world and you can say let me build cloud native like a separate IT organization and you hire some people in the old IT and some in the new IT and so on and so on and lots of our clients do that. We kind of create a bimodal type of IT organizations with two sets of technology stacks, two approaches. The thing that is really hard to do is to actually integrate those two into a very good hybrid cohesive schema so that you can have a system that one part of the system is traditional on-prem database that goes through its own rhythm of development and then you have systems that are cloud native, very rapidly developed, lots of the minimum viable products that are actually sourcing the data from the old world. So it goes from hard, harder, hardest. >> So do you have a schemata of how to make the decision? What strategy is right for which client or is it really just so dependent on the client's unique set of circumstances? I will try to reframe your question because it is not old or new 'cause it is always that dilemma. If you're looking every decade we go through the same rhythm of refreshing. 
We get a refreshed wave of architecture. If you remember 30 years ago when I was still young we had a traditional monolithic architecture which were refreshed into client server and into a service oriented architecture now into microservices and in the future we already know that we are going into reactive and driven architectures. Whenever we have a new architectural style we always get also new set of processes. Historically with the waterfall development then we refreshed into rational unified processes you'll remember that from ages ago and then the traditional right now we are doing agile and we are going towards lean development and so on so everything refreshes. So your question is very much asking when is the right time that you stop using the previous generation of architecture, process, tools and platform and jump to the next generation 'cause you can be too late. Obviously we are talking about companies that they need to modernize but you can also be too early 'cause lots of the companies are right now wondering should they go serverless which is also cloud native style but it's way ahead of typical containers, simple for natives. So when is it time to go from VM based traditional SOA into microservice containers versus reactive, event driven on let's say, azure functions. Those decisions are not easy to do but I can tell you most of my clients have kind of a spectrum of everything. They still have a mainframe, they have a client server, they have SOA architecture, they have microservices and they're already thinking about event driven serverless. >> Absolutely, and by the way, they can run that docker container on linux on the mainframe because everything in IT is always additive. So it's challenging. I've spent a lot of my career trying to help companies get out of their silos of infrastructure, of product group and in a multicloud world we feel like have we just created more silos? How are we making progress? What's good? How are you helping companies that maybe are stuck behind and are threatened of getting obsoleted from being able to move forward? What are some of the patterns and ways to get there? >> Our approach is very much trying to find what's really behind, what's the business reason behind? 'Cause until I realize why somebody wants to modernize it's very hard to give the answer to how do you modernize. Not to oversimplify but we typically see that value formula coming. We want to reduce specific detriments and we want to increase specific benefits and hat's why people need to go through those modernization waves. You can reduce cost and historically we were dramatically cutting costs just by automation, clonization, all of that. You can reduce risk. If you remember a few years ago everybody was talking that cloud is too risky, now everybody says oh, I'm reducing the risk and improving security by going to the cloud. You can increase speed and agility so you can suddenly do things much faster and enable more experiments. I personally find the number four most interesting which is you get better access to new software innovation. Here is the question. When is the last time that you remembered and a technology vendor would give you a DVD and say this is our latest software that you can use. >> Yeah, probably a Microsoft disc but you know, back in the day. >> Nobody is shipping software for on-premises anymore. 
Maybe, they do later in a cycle but all of the latest software innovation is cloud first or cloud only so it's only logical if we see the business that depends on business innovation, they need to start building their systems in a cloud native world 'cause they are going to source natural language processing, artificial intelligence recognition, all of the complex services, they have to source them from the could and therefore they want to build apps in the cloud native style. >> Yeah, it's an interesting challenge. Things are changing so fast. One of the things that I hear from certain companies is they, is that, well, I go and I make my strategy and then by the time I start implementing it I wonder if I made the wrong decision because some new tool is there. You mentioned Azure functions, wait, no, I was just getting on Kubernetes and getting comfortable with that as opposed to most companies, oh, I'm starting to look at that thing so these waves are coming faster and faster. >> You just exposed that you are an architect. Let me explain why. >> The technologist is charged, sorry. >> When I hire people into architecture roles, one of the common interviewing questions will be first, explain one of your previous solutions and then the question comes if you would start again today, what would you do different? Every single architect that I know are always dissatisfied with their previous choices and decisions because there were new wave of technologies that came in during the engagement. What you are expressing, whenever I get a person that says no, I did everything perfectly and I would not change anything, I might have a different role for these people. >> So I mentioned before that you are a marathon runner. I'm curious to hear how your job is similar to running a marathon because as Stu was just talking about, the pace of change, that is the one constant in this industry and to be a marathon runner, you got to keep a good pace. How do you sort of make sure that you are keeping your stamina up, keeping your eyes on the future to make sure you know what's coming ahead? >> That's a very interesting analogy and I was doing that comparison not that far back before. The first part isthat in order to have a good time at the end of the race you need to have good nutrition, you need to have a good preparation, you need to have all those things so the moment when I compared it back to my regular work, nutrition, we usually compare it with how do I keep my skills up which usually, at least in my case, it is between four to six hours every day either reading I usually say to people I try to make something, teach something and learn something every single day and you have to do that four to six hours every single day just like preparing for marathon and there is a whole bunch of those other activities that all need to be aligned then once you actually start running with the client, when you start doing engagement with the client, that even when you hit the wall, even when you get tired, first you know the reason why you are doing it, you know what the end goal means, what the finish line looks like and you know that you are prepared, that this is the best that you can get. Is it easy? No, it's not. We are kind of used in the IT industry to do that and reinvent ourselves every second year. >> When you look at the cloud navtive space what are some of the challenges and pitfalls? How do you manage that? What advice do you give at a high level? I understand there's a lot of diversity out there. 
>> Oh, where are the challenges and lessons learned? How much time do we have? So I would say the most obvious one would be jumping into that pool of cloud computing without preparation, without guidance, without help, without mentoring, tutoring or somebody to guide you. Get less than perfect experience and declare that is not for me therefore it's not for any of us ever. Right? I see lots of those generalizations where although it's clear that the whole industry is going in that consumerization direction and we are charging by consumption and all of that that we have clients that started it either early, they didn't have a fantastic experience, they got into specific roadblock and then for several years they don't want even to have a discussion anymore. The other problem is not enough upscaling so simply not enough thinking how different that knowledge is. A discussion with a CIO that says that IT's the same for last 30 years, you know, a machine is a machine is a machine. Coding is coding is coding. Nothing really changed ever. It is really hard to have a discussion to say the devil is in the details. Yes, technically we do the same thing for 30 years which is we make dreams come true in IT. We create something that was never done before but how we do that, and tools of the trade, an approach is dramatically different. Every decade brings a dramatically different result. Trying to explain that in supportive way is a challenge on its own. >> Miah, what about your team? How are you making sure that you have the right people in place to help execute these solutions? And this is they have the right skills, the right mindset, the right approach of the continual learning and the constant curiosity that you keep referencing? >> Well, you are asking a consultant how does consultant know that he's successful? When the client is happy. I'm serious, very simple here, right? How do we make sure that the client is happy which is very much corollary to your question. We really first need to make sure that we are educating our clients all the way through. The times of delivering something without a massive knowledge transfer, those times are over. The easiest way to explain that is that what we are telling is that every business needs to become software business. It doesn't matter is it bank, insurance, health provider, they need to learn to actually make critical competitive advantage solutions in-house. So how do we actually teach engineering to companies that historically were not engineering companies? All of my team are half coaches and half engineers or architects or whatever they are. Being a coach and being a mentor and kind of allowing our clients to do things independently instead of just depend on us is one of those major changes that we see how we actually ramp up and train and support people. >> Miha, we've seen and talked to Accenture at many cloud events. Accenture's got a very large presence. I've been watching the entire week. Activity in the booth, one of the four anchor booths here at the show. What's different about Microsoft, your view on Microsoft, what you're hearing from customers and also speak to how Accenture really lives in this Microsoft ecosystem. >> I think that I understand the question. Are you asking me about how Accenture and Microsoft cooperates together in that new world? >> Yeah, why does Accenture have such a large presence at a show like this? Accenture is at all the cloud events. 
>> So Accenture has specific targeted, strategic alliances with large technology vendors. The size of the alliance, the importance of the alliance is always directly reflected both from, of course, the size of the market but also our belief in how successful a long-term specific technology stack is going to be. We have a very strong, firm belief that with Microsoft we actually have an amazingly good alliance. Actually we call it alliance of three. We forgot to mention Avanade as well, right? Which is dedicated to creative entity to make sure that Microsoft solutions are built, designed and then ran correctly. We jointly invest obscene amount of money to make sure that right solutions are covered with right Microsoft technologies and developed in the right manner. >> Great, Miha, thank you so much for coming on theCUBE. It was a pleasure having you. >> You're more than welcome, anytime. >> I'm Rebecca Knight for Stu Miniman. That wraps up our coverage of Microsoft Ignite. We will see you next time on theCUBE. (rhythmic music)

Published Date : Sep 26 2018



VMware Day 2 Keynote | VMworld 2018


 

Okay, this presentation includes forward looking statements that are subject to risks and uncertainties. Actual results may differ materially as a result of various risk factors including those described in the 10 k's 10 q's and eight ks. Vm ware files with the SEC, ladies and gentlemen, Sunjay Buddha for the jazz mafia from Oakland, California. Good to be with you. Welcome to late night with Jimmy Fallon. I'm an early early morning with Sanjay Poonen and two are set. It's the first time we're doing a live band and jazz and blues is my favorite. You know, I prefer a career in music, playing with Eric Clapton and that abandoned software, but you know, life as a different way. I'll things. I'm delighted to have you all here. Wasn't yesterday's keynote. Just awesome. Off the charts. I mean pat and Ray, you just guys, I thought it was the best ever keynote and I'm not kissing up to the two of you. If you know pat, you can't kiss up to them because if you do, you'll get an action item list at 4:30 in the morning that sten long and you'll be having nails for breakfast with him but bad it was delightful and I was so inspired by your tattoo that I decided to Kinda fell asleep in batter ass tattoo parlor and I thought one wasn't enough so I was gonna one up with. I love Vm ware. Twenty years. Can you see that? What do you guys think? But thank you all of you for being here. It's a delight to have you folks at our conference. Twenty 5,000 of you here, 100,000 watching. Thank you to all of the vm ware employees who helped put this together. Robin Matlock, Linda, Brit, Clara. Can I have you guys stand up and just acknowledge those of you who are involved? Thank you for being involved. Linda. These ladies worked so hard to make this a great show. Everybody on their teams. It's the life to have you all here. I know that we're gonna have a fantastic time. The title of my talk is pioneers of the possible and we're going to go through over the course of the next 90 minutes or so, a conversation with customers, give you a little bit of perspective of why some of these folks are pioneers and then we're going to talk about somebody who's been a pioneer in the world but thought to start off with a story. I love stories and I was born in a family with four boys and my parents I grew up in India were immensely creative and naming that for boys. The eldest was named Sanjay. That's me. The next was named Santosh Sunday, so if you can get the drift here, it's s a n, s a n s a n and the final one. My parents got even more creative and colon suneel sun, so you could imagine my mother going south or Sunday do. I meant Sanjay you and it was always that confusion and then I come to the United States as an immigrant at age 18 and people see my name and most Americans hadn't seen many Sundays before, so they call me Sanjay. I mean, of course it of sounds like v San, so sanjay, so for all of your V, San Lovers. Then I come to California for years later work at apple and my Latino friends see my name and it sorta sounds like San Jose, so I get called sand. Hey, okay. Then I meet some Norwegian friends later on in my life, nordics. The J is a y, so I get called San Year. Your my Italian friend calls me son Joe. So the point of the matter is, whatever you call me, I respond, but there's certain things that are core to my DNA. Those that people know me know that whatever you call me, there's something that's core to me. Maybe I like music more than software. Maybe I want my tombstone to not be with. 
I was smart or stupid that I had a big heart. It's the same with vm ware. When you think about the engines that fuel us, you can call us the VM company. The virtualization company. Server virtualization. We seek to be now called the digital foundation company. Sometimes our competitors are not so kind to us. They call us the other things. That's okay. There's something that's core to this company that really, really stands out. They're sort of the engines that fuel vm ware, so like a plane with two engines, innovation and customer obsession. Innovation is what allows the engine to go faster, farther and constantly look at ways in which you can actually make the better and better customer obsession allows you to do it in concert with customers and my message to all of you here is that we want to both of those together with you. Imagine if 500,000 customers could see the benefit of vsphere San Nsx all above cloud foundation being your products. We've been very fortunate and blessed to innovate in everything starting with Sova virtualization, starting with software defined storage in 2009. We were a little later to kind of really on the hyperconverged infrastructure, but the first things that we innovate in storage, we're way back in 2009 when we acquired nicer and began the early works in software defined networking in 2012 when we put together desktop virtualization, mobile and identity the first time to form the digital workspace and as you heard in the last few days, the vision of a multi cloud or hybrid cloud in a virtual cloud networking. This is an amazing vision couple that innovation with an obsession and customer obsession and an NPS. Every engineer and sales rep and everybody in between is compensated on NPS. If something is not going well, you can send me an email. I know you can send pat an email. You can send the good emails to me and the bad emails to Scott Dot Beto said Bmr.com. No, I'm kidding. We want all of you to feel like you're plugged into us and we're very fortunate. This is your vote on nps. We've been very blessed to have the highest nps and that is our focus, but innovation done with customers. I shared this chart last year and it's sort of our sesame street simple chart. I tell our sales rep, this is probably the one shot that gets used the most by our sales organization. If you can't describe our story in one shot, you have 100 powerpoints, you probably have no power and very The fact of the matter is that the data center is sort of like a human body. little point. You've got your heart that's Compute, you've got the storage, maybe your lungs, you've got the nervous system that's networking and you've got the brains of management and what we're trying to do is help you make that journey to the cloud. That's the bottom part of the story. We call it the cloud foundation, the top part, and it's all serving apps. The top part of that story is the digital workspace, so very simply put that that's the desktop, moving edge and mobile. The digital workspace meets the cloud foundation. The combination is a digital foundation Where does, and we've begun this revolution with a company. That's what we end. focus on impact, not just make an impression making an impact, and there's three c's that all of us collectively have had an impact on cost very clearly. 
I'm going to walk you through some of that complexity and carbon and the carbon data was just fascinating to see some of that yesterday, uh, from Pat, these fierce guarded off this revolution when we started this off 20 years ago. These were stories I just picked up some of the period people would send us electricity bills of what it looked like before and after vsphere with a dramatic reduction in cost, uh, off the tune of 80 plus percent people would show us 10, sometimes 20 times a value creation from server consolidation ratios. I think of the story goes right. Intel initially sort of fought vm ware. I didn't want to have it happen. Dell was one of the first investors. Pat Michael, do I have that story? Right? Good. It's always a job fulfilling through agree with my boss and my chairman as opposed to disagree with them. Um, so that's how it got started. And true with over the, this has been an incredible story. This is kind of the revenue that you've helped us with over the 20 years of existence. Last year was about a billion but I pulled up one of the Roi Charts that somebody wrote in 2006. collectively over a year, $50 million, It might've been my esteemed colleague, Greg rug around that showed that every dollar spent on vm ware resulted in nine to $26 worth of economic value. This was in 2006. So I just said, let's say it's about 10 x of economic value, um, to you. And I think over the years it may have been bigger, but let's say conservative. It's then that $50 million has resulted in half a trillion worth of value to you if you were willing to be more generous and 20. It's 1 trillion worth of value over the that was the heart. years. Our second core product, This is one of my favorite products. How can you not like a product that has part of your name and it. We sent incredible. But the Roi here is incredible too. It's mostly coming from cap ex and op ex reduction, but mostly cap x. initially there was a little bit of tension between us and the hardware storage players. Now I think every hardware storage layer begins their presentation on hyperconverged infrastructure as the pathway to the private cloud. Dramatic reduction. We would like this 15,000 customers have we send. We want every one of the 500,000 customers. If you're going to invest in a private cloud to begin your journey with, with a a hyperconverged infrastructure v sound and sometimes we don't always get this right. This store products actually sort of the story of the of the movie seabiscuit where we sort of came from behind and vm ware sometimes does well. We've come from behind and now we're number one in this category. Incredible Roi. NSX, little not so obvious because there's a fair amount spent on hardware and the trucks would. It looks like this mostly, and this is on the lefthand side, a opex mostly driven by a little bit of server virtualization and a network driven architecture. What we're doing is not coming here saying you need to rip out your existing hardware, whether it's Cisco, juniper, Arista, you get more value out of that or more value potentially out of your Palo Alto or load balancing capabilities, but what we're saying is you can extend the life, optimize your underlay and invest more in your overlay and we're going to start doing more and software all the way from the l for the elephant seven stack firewalling application controllers and make that in networking stack, application aware, and we can dramatically help you reduce that. 
At the core of that is an investment hyperconverged infrastructure. We find often investments like v San could trigger the investments. In nsx we have roi tools that will help you make that even more dramatic, so once you've got compute storage and networking, you put it together. Then with a lot of other components, we're just getting started in this journey with Nsx, one of our top priorities, but you put that now with the brain. Okay, you got the heart, the lungs, the nervous system, and the brain where you do three a's, sort of like those three c's. You've got automation, you've got analytics and monitoring and of course the part that you saw yesterday, ai and all of the incredible capabilities that you have here. When you put that now in a place where you've got the full SDDC stack, you have a variety of deployment options. Number one is deploying it. A traditional hardware driven type of on premise environment. Okay, and here's the cost we we we accumulate over 2,500 pms. All you could deploy this in a private cloud with a software defined data center with the components I've talked about and the additional cost also for cloud bursting Dr because you're usually investing that sometimes your own data centers or you have the choice of now building an redoing some of those apps for public cloud this, but in many cases you're going to have to add on a cost for migration and refactoring those apps. So it is technically a little more expensive when you factor in that cost on any of the hyperscalers. We think the most economically attractive is this hybrid cloud option, like Vm ware cloud and where you have, for example, all of that Dr Capabilities built into it so that in essence folks is the core of that story. And what I've tried to show you over the last few minutes is the economic value can be extremely compelling. We think at least 10 to 20 x in terms of how we can generate value with them. So rather than me speak more than words, I'd like to welcome my first panel. Please join me in welcoming on stage. Are Our guests from brinks from sky and from National Commercial Bank of Jamaica. Gentlemen, join me on stage. Well, gentlemen, we've got a Indian American. We've got a kiwi who now lives in the UK and we've got a Jamaican. Maybe we should talk about cricket, which by the way is a very exciting sport. It lasts only five days, but nonetheless, I want to start with you Rohan. You, um, brings is an incredible story. Everyone knows the armored trucks and security. Have you driven in one of those? Have a great story and the stock price has doubled. You're a cio that brings business and it together. Maybe we can start there. How have you effectively being able to do that in bridging business and it. Thank you Sanjay. So let me start by describing who is the business, right? Who is brinks? Brinks is the number one secure logistics and cash management services company in the world. Our job is to protect our customers, most precious assets, their cash, precious metals, diamonds, jewelry, commodities and so on. You've seen our trucks in your neighborhoods, in your cities, even in countries across the world, right? But the world is going digital and so we have to ratchet up our use of digital technologies and tools in order to continue to serve our customers in a digital world. So we're building a digital network that extends all the way out to the edges and our edges. 
Our branches are our messengers and their handheld devices, our trucks and even our computer control safes that we place on our customer's premises all the way back to our monitoring centers are processing centers in our data centers so that we can receive events that are taking place in that cash ecosystem around our customers and react and be proactive in our service of them and at the heart of this digital business transformation is the vm ware product suite. We have been able to use the products to successfully architect of hybrid cloud data center in North America. Awesome. I'd like to get to your next, but before I do that, you made a tremendous sacrifice to be here because you just had a two month old baby. How is your sleep getting there? I've been there with twins and we have a nice little gift for you for you here. Why don't you open it and show everybody some side that something. I think your two month old will like once you get to the bottom of all that day. I've. I'm sure something's in there. Oh Geez. That's the better one. Open it up. There's a Vm, wear a little outfit for your two month. Alright guys, this is great. Thank you all. We appreciate your being here and making the sacrifice in the midst of that. But I was amazed listening to you. I mean, we think of Jamaica, it's a vacation spot. It's also an incredible place with athletes and Usain bolt, but when you, the not just the biggest bank in Jamaica, but also one of the innovators and picking areas like containers and so on. How did you build an innovation culture in the bank? Well, I think, uh, to what rughead said the world is going to dissolve and NCB. We have an aspiration to become the Caribbean's first digital bank. And what that meant for us is two things. One is to reinvent or core business processes and to, to ensure that our customers, when they interact with the bank across all channels have a, what we call the Amazon experience and to drive that, what we actually had to do was to work in two moons. Uh, the first movement we call mode one is And no two, which is stunning up a whole set of to keep the lights on, keep the bank running. agile labs to ensure that we could innovate and transform and grow our business. And the heart of that was on the [inaudible] platform. So pks rocks. You guys should try it. We're going to talk about. I'm sure that won't be the last hear from chatting, but uh, that's great. Hey, now I'd like to get a little deeper into the product with all of you folks and just understand how you've engineered that, that transformation. Maybe in sort of the order we covered in my earlier comments in speech. Rohan, you basically began the journey with the private cloud optimization going with, of course vsphere v San and the VX rail environment to optimize your private cloud. And then of course we'll get to the public cloud later. But how did that work out for you and why did you pick v San and how's it gone? So Sunday we started down this journey, the fourth quarter of 2016. And if you remember back then the BMC product was not yet a product, but we still had the vision even back then of bridging from a private data center into a public cloud. So we started with v San because it helped us tackle an important component of our data center stack. Right. And we could get on a common platform, common set of processes and tools so that when we were ready for the full stack, vmc would be there and it was, and then we could extend past that. So. Awesome. 
And I have to say, Dave, with a name like Dave Matthews, you must have all these musicians thinking you're the real Dave Matthews. What's your favorite Dave Matthews song? >> It has to be Crash Into Me. >> Right, good choice. But we'll get to music another time. NSX was obviously a big transformational capability for you. Everyone knows Sky — media and wireless and all of that — and networking is at the core of what you do. Why did you pick NSX and what have you been able to achieve with it? >> So, like I say, Sky is a media organization in an incredibly fast-moving industry, and it's very innovative. We've got really clever people in-house, and we need to make sure our product guys and our developers can move at pace. We've got really good network guys — they're great guys — but the problem is that traditional networking is just fundamentally slow, and there's not much you can do about it. Telling these agile teams to punch a ticket, file a request and wait — that's just not reality. We were able to turn that around so that the DevOps teams and the developers can just use Terraform and do everything. We've gone from days to seconds with an agile, software-driven approach, where it would have taken much longer when it was hardware-driven. >> Absolutely. And you're giving the toolset to the developers — within boundaries, you have set the boundaries — so they can basically do it all themselves. So you empower the developers in a very, very important way. Did you use our Insight tools on top of that, too? >> Yes, for a slightly different use case. We're in the EU, where you've got the General Data Protection Regulation coming through, and that's a big deal — and the reality is that an organization's compliance burden isn't getting any less. So what we've been able to do, using vRNI and NSX, is essentially micro-segment off a lot of our environments which have much higher compliance requirements. >> And you've got, in your case, a lot of storage that you're managing with vSAN and tens of thousands of VMs on NSX. This is something at scale that both of you have been able to achieve with NSX and vSAN — pretty incredible. And what I also like about the Sky story is that it's very centered around DevOps and the DevOps use case. Okay, let's come to you, Ramon. When I was talking to our Kubernetes platform team, PKS, they told me one of the pioneering customers was National Commercial Bank of Jamaica, and I thought, wow, that's awesome, let's bring you in — and when we heard your story, it's incredible. Why did you pick Kubernetes as the container platform? You had many choices of what you could have done, and other companies to choose from. Why did you pick PKS? >> So, in our case, we first looked at PCF, which we thought was a very good platform as well. Then we looked at the integration you can get with PKS — the security, the overlay with NSX — and it made sense for us to go in that direction, because it offered a lot of flexibility on the automation that we could drive through to drive the business.
So that was the essence of the argument that we had to make. >> So the key part was the NSX integration and security with PKS. And while we've got a few more cheers from the heckler over there — I want you to know, Chad, I've got my PKS socks on, that's how much I care — and if he creates too much trouble, security can escort him out of the arena. Anyway. I wanted to put this chart up because it's very important for all of you in the audience to know that VMware is making a significant commitment to Kubernetes. We feel that this is, as Pat talked about before, something that's going to be integrated into everything we do — it's going to become like a dial tone — and this is just the first of many things; you're going to see VMware really take this on as a consistent theme. And I think we have an opportunity collectively, because a lot of people think containers are a threat to VMware. We actually think it's a headwind that's going to become a tailwind for us, just the same way public cloud has been. So thank you for being one of our pioneering, early customers. And are you using the Kubernetes platform in the context of running in a vSphere environment? >> Yes, we are — we're running on vSphere right now. Our first application will be a mobile banking app, which will be launched in September, and all our agile labs are going to be on PKS moving forward. So it's really been a good move for us. >> Dave, I know that you're not there yet, but you're looking at it — is one of the use cases of NSX for you containers, and how do you view NSX in that context? >> Absolutely. For us, the big thing about NSX-T when it rocked up was that it's not just SDN on vSphere, but SDN on OpenStack and SDN into the container platform. We got some early visibility of the Kubernetes integration on there, and it was done right from the start — and when we talked to the PKS guys, it was the same sort of thing: it's done right from the start. So certainly for us, NSX everywhere as a common control plane is a very attractive proposition. >> Good. Rohan, I'd like to talk to you a little bit about how you viewed the public cloud, because, as you mentioned, when we started off this journey we didn't have VMware Cloud on AWS. We approached you when we were very early on in that journey and you took a bet with us, but it was part of your data center reduction — you were trying to almost obliterate one data center as you went from three to one. Tell us that story and how the collaboration worked out on VMware Cloud. What's the use case? >> So, as I said, our vision was always to bridge to a public cloud, right? We wanted to be able to use public cloud environments to incubate new applications until they stabilize, to flex to the cloud, and ultimately to do disaster recovery in the cloud. That was the big use case for us. We ran a traditional data center environment across four regions in the world. Each region had two to three data centers: one was the primary, and then usually you had a disaster recovery center where you had all your data hosted and a certain amount of compute, but it was essentially a cold center, right? It sat idle, and you did your test once a year. That's the environment we were really looking to get out of.
Once VMC was available, we were able to create the same VMware environment that we currently have on-prem in the cloud — the same network and security stack in both places — and we were then able to decommission our disaster recovery data center, take it offline, and move. We've got all of our mission-critical data now in the AWS instance using VMC. We have a small amount of compute to keep it warm, but thanks to the VMware products we have the ability now to ratchet that up very quickly in a DR situation, run production in the cloud until we've stabilized, and then bring that workload back. >> Would it be fair to tell everybody here that if you are looking at DR or that type of bursting scenario, there's no reason to invest in an on-premise disaster recovery data center — that's really a perfect use case for this, right? >> Exactly, yeah. We will no longer have a physical DR center available anywhere. >> So you've optimized your one data center with the private cloud stack — VMware Cloud Foundation, effectively, starting off with vSAN — and you've optimized your hybrid cloud journey with VMware Cloud on AWS. I know we're early on in the journey with NSX in the branch, so we'll come back to that conversation maybe next year. We keep discovering new things about this guy — I just found out last night that he grew up in the same town as me, in Bangalore, and went to the same school. Dave, as you think about the future of where you want to take this use case of network security, what are some of the things that are on your radar over the next couple of months and quarters? >> So I think what we're really trying to do — and this is a funny thing to say at a technology conference — is that computers and networks are a bit boring, or rather, we want to make them boring. We want to basically sweep them away so that our people, our customers, our internal customers don't have to think about them, to the point that we can apply that compliance, that security, that whole framework around it, regardless of where that workload lives — on-premise, off-premise, everywhere, and even potentially out to the edge. >> Very quickly, as we wrap this up — how big are the teams you have working on this? What was amazing when I talked to you all was how nimble and agile you are with lean teams. How big was your team? >> The team running the SDDC stack is six people. >> Six. Wow — and obviously there's more around that core data center, but still. And your team? >> Between five and seven people, for both the infrastructure and the containers. >> And Rohan, on your side? >> It's about the same. >> Amazing. Well, very quickly, maybe 30 seconds each: where do you see the world going? Rohan? >> So at Brinks I pay attention to two things. One is IoT, and we've talked a little bit about that, but what I'm looking for there, as digital signals continue to grow, is injecting things like machine learning and artificial intelligence inline into that flow so we can make more decisions closer to the source. And the second thing is about cash. Even though cash volume is increasing — I mean, here we are in Vegas, the number one cash city in the US —
I can't ignore digital payments and cryptocurrency, and that relies on blockchain. So I'm focusing on what role blockchain plays in the global world as we go forward and how Brinks can continue to bring those services — blockchain and IoT. >> Very good. Well, gentlemen, thank you for being with us; it's a pleasure and an honor. Ladies and gentlemen, give it up for our three guests. Well, thank you very much. So, as you saw there, it's great to be able to see and learn from some of these pioneering customers, and hopefully the lesson you took away was that wherever you are in your journey, you can start with the private cloud and embark on the journey to the public cloud. And now comes the next part, which is pretty exciting: the journey of the desktop and mobile, the digital workspace. That's the second part of this that I want to explore with a couple of customers, but before I do that, I wanted to set the context of why what we're trying to do here also has economic value. Hopefully you saw in the first set of charts that the economic value of starting with the heart, the lungs, all of that software-defined data center, and moving to the ultimate hybrid cloud was compelling. We feel the same thing here, and it's because of a fundamental shift that started over the last seven to ten years, since the iPhone. The fact of the matter is that when you look at your fleet of devices across tablets, phones and laptops, today it is a heterogeneous world. Twenty years ago, when the company started, it was probably all Microsoft devices and laptops; now it's phones and tablets too. It's a mixture, and it's going to be a mixture for the foreseeable future, with very strong, almost trillion-dollar-market-cap companies in this world. Our job is to ensure that this heterogeneous digital workspace can be very easily managed and secured. I have a little soft corner for this business, because for the first three years of my five years here I ran it, so I know a thing or two about these products. But the fact of the matter is, I think the opportunity here is this: if you think about the 7 billion people in the world, a billion of them are working for some company or other — the others are children, or may not be employed, or are retired — and every one of them has a phone today, many of them phones and laptops, and they're mixed. Our job is to ensure that we bring simplicity to this space. You saw a little bit of that cacophony yesterday in Pat's chart, and unfortunately a lot of today's world of managing and securing those disparate devices is a mountain of morass. No offense to any of the vendors named in there, but it shouldn't be your job to be the glue at the top of that mountain putting it all together, which costs you potentially at least $50 per user per month. We can make that significantly cheaper — at least a 70 percent reduction — with a unified platform, Workspace ONE, that has all of those elements. So how have we done that? We've taken those fundamental principles of simplicity and security. A lot of the enterprise companies get security right, but we don't always get simplicity right. Many of the consumer companies, like Facebook, get simplicity right, but maybe security needs some help. We've taken both of those and said it is possible for you to actually like your user experience, as opposed to dreading it, in getting access to applications. And how we did this at VMware was:
We actually teamed with the Stanford Design School. We put many of our product managers through this concept of design thinking. It's a really useful concept, and I'd encourage every one of you to look at it — I'm not making a plug for the Stanford design school at all, but some very basic principles of viability, desirability and feasibility allow your product folks to think like a consumer, and that's the key goal. In doing that, we were able to design these products with that type of simplicity without compromising at all on security. There's a tremendous opportunity ahead of us, and it gives me great pleasure to bring on stage now two guests who are doing some pioneering work, one from a partner and one from a customer. Please join me in welcoming Maria Pardee from DXC and John Market from Adobe. Thank you, Maria and John, for being with us. Maria, I want to start with you. DXC is the coming together of two companies, CSC and HPE's enterprise services business, and on the surface of it, most skeptics may have said such a big merger was probably going to fail. But you're looking now at the end of that post-merger period, and most people would say it's been a success. What's made the coming together of those two very different cultures at DXC a success? >> Well, first of all, you have to credit a lot of very creative people in this space when the two companies came together, but mostly it is our customers who are making us successful. We are choosing to take our customers to the next-generation digital platform. The message is resonating; the cultures have come together, the individuals have come together, the offers have come together, and it's resonating in the marketplace and with our customers and with our partners. So you shouldn't have doubted it. >> I wasn't one of the skeptics — maybe others were. And my understanding is the D and the C in DXC are digital and customer? >> Yes. If you look at the logo, it's more of an infinity, so digital transformation for customers. But truthfully, we wanted a new start for some very powerful companies in the industry, and it really was, instead of CSC and HPE, a new logo and a new start. >> And I think that resonates very well with how I started off my keynote, talking about innovation and customers focused on digital. And Adobe — obviously not just a household name; John, many of the folks here use your products — but you folks have also written the playbook on a transformation from on-premise to the cloud and SaaS products, and now you've got an incredible valuation to show for it. How has that affected the way you think in IT, in terms of a cloud-first philosophy and how you implement? >> From an IT perspective, we're really focused on the employee experience, and so as we transition our products to the cloud, that's what we're working towards as well. From IT, it's all about innovation and fostering that ability for employees to create and do some amazing products. >> So many of those things I talked about, like design thinking, are right out of the playbook of what Adobe does every day. Does it affect the way in which you build — sorry, deploy — products? >> Yeah, fundamentally it comes down to those basics of viability and the employee experience. And we believe that by giving employees choice, we're enabling them to do amazing work.
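The per-user cost claim from a few paragraphs back — at least $50 per user per month for the stitched-together tool stack, and at least a 70 percent reduction with a unified platform — turns into a simple savings estimate. In the hedged sketch below, those two figures are the ones used on stage, while the 20,000-user population (roughly the Adobe employee count mentioned in this conversation) is only an illustrative assumption.

```python
# Back-of-the-envelope savings from consolidating endpoint tooling.
# $50/user/month and the 70% reduction are the figures quoted on stage;
# the 20,000-user population is an assumption for illustration only.

current_per_user_month = 50.0   # disparate point tools, per user per month
reduction = 0.70                # claimed reduction with a unified platform
users = 20_000

unified_per_user_month = current_per_user_month * (1 - reduction)
monthly_savings = (current_per_user_month - unified_per_user_month) * users
annual_savings = monthly_savings * 12

print(f"unified cost per user/month: ${unified_per_user_month:.2f}")
print(f"estimated annual savings:    ${annual_savings:,.0f}")
```

The arithmetic is trivial by design; the point is that at fleet scale the per-user delta dominates any platform licensing difference, which is the argument being made on stage.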
Maria, you obviously are in the process of rolling out some of our technology inside DXC, so I want to focus less on the internal implementation and more on what you see from other clients. I shared that mountain of morass — so many different, disparate tools. Is that what you hear from clients, and how are you messaging to them what you think the future of the digital workspace is, in our joint partnership? >> Well, Sanjay, your picture was perfect, because if you look at the way end-user compute infrastructure has worked for years — decades — in the past, it's exactly what we're doing with VMware in terms of automation and driving that infrastructure to the cloud. Companies like yours and mine have to have the courage to say that the old way of on-prem is the way we made our license fees and our professional services in the past, and now we have to quickly take our customers to a new way of working, a fast-paced digital cloud transformation. We see it in every customer that we're dealing with, every day of the week. >> What are some of the key verticals? >> Every vertical — we're seeing a lot in the healthcare industry, and in a variety of verticals. One of the compelling things that we're seeing in the marketplace right now is the next-gen worker and the gig economy. Employees might work for one company at 10:00 in the morning and another company at 2:00 in the afternoon. We have to be able to stand those 1099 employees up very quickly — contract workers from around the world — and do it securely, with governance, risk and compliance. And we see that driving a lot of the next-generation infrastructure needs. So the users are going from a company like DXC with 160,000 employees to what we think in the future will be another 200,000 to 300,000 partners and contract workers that we still have to treat with the same security, sensitivity and governance as our W-2 employees. >> Awesome. John, you were one of the pioneering customers that we worked with on this notion of unified endpoint management, because you have a similar employee base to VMware — 20,000-odd employees — and you've got a mixture of devices in your fleet. Maybe you can give us a little bit of a sense: what percentage do you have of Windows and Mac? >> Depending on the geography, we're approximately 50/50 Windows and Mac — somewhat similar to how VMware operates. >> And what does your fleet of mobile phones look like — primarily iOS? >> We have maybe 80/20 or 70/30 Apple. >> And iOS tablets? >> Yes, it's primarily iOS tablets. >> So you probably have something in the order of — I'm guessing, adding that up — forty or fifty thousand devices in total across laptops, tablets and phones? >> Sixty thousand plus. >> Sixty thousand plus, okay, and a mixture of those, so heterogeneity is the reality. And you had point tools for many of those in terms of managing and securing them. Why did you decide to go with Workspace ONE to simplify that management and security experience? >> Well, you nailed it — it's all about simplification. We wanted to take our tools and provide a consistent experience from an IT perspective in how we manage those endpoints, but also for our employee population, so they can have a consistent experience across all of their devices. In the past it was very disconnected.
If you had an iOS device, the experience might look like this; if you had a Windows device, it would look like that. And so the journey we started to go down about a year ago is to bring that together again — to simplicity. We want to get to a place where an employee can self-provision their desktop just like they do their mobile device today. >> And as you go down that journey, what's your expectation of how quick the onboarding time should be for an employee? >> It should be within 15 or 20 minutes; we need to get it very rapid. The new-hire orientation process really needs to be modified — the old way is no longer acceptable, everything from the IT side to the other recruiting aspects. An employee wants to come in and start immediately; they want to be productive, they want to make contributions. So what we want to do from an IT perspective is get out of the way and enable employees to be productive as quickly as possible. >> And the onboarding then could be: one way you latch them on is they get Workspace ONE. >> Absolutely. >> Great. Let's talk a little bit, as we wrap up in the next few minutes, about where you see the world going in terms of other areas that are synergistic with Workspace ONE — collaboration. What are some of the things that you hear from clients? What's the future of collaboration? >> We're actually looking towards a future where we're less dependent on email — so, say yes to real-time collaboration. DXC is doing a lot with Skype for Business and Yammer, and still a lot with Citrix; our tech teams and our development teams use Slack, and our clients are using everything. So, as an integrator in this space, we see less dependence on the asynchronous world and a lot more dependence on the synchronous world and whatever tools you can have to create real-time collaboration. Now, you and I spoke a little last night about what that means for work-life balance when there's always a demand for real-time collaboration, but we're seeing an uptick in that and, hopefully, over the next few years a slight downtick in email, because that is not necessarily the most direct way to communicate all the time. >> And in that process, some of that legacy environment starts to get replaced with newer tools, whether it's Slack or Zoom or a similar experience? >> All of the above. >> All of the above. Are you finding the same thing, John, in your environment? >> Yeah, we're moving away from it. I think what you're going to see is email transitioning to more of the reporting aspect, the notification, while the day-to-day collaboration has moved to products like Slack or Teams at Adobe. We're very video-focused, so even though we may be a global team spread around the world, we will typically communicate over some form of video, whether it be BlueJeans or Jabber or whatnot. >> BlueJeans for your video collaboration, yeah. Internally we use Webex and Zoom, and also a lot of Slack, and we're happy to announce — I think in the breakouts you'll hear about it — the integration of Workspace ONE with Slack; we're doing a lot with them. Maria, I want to end with a final question for you. Obviously you're very passionate about a cause that we also love, that I'm passionate about, and that we're going to hear more about from Malala: more women in technology, diversity and inclusion. You are obviously a role model in doing that.
What would you say to some of the women here, and to others who might be mentors to women in technology, about how they can shape that career? >> I think the women here are probably already rocking it and doing what they need to do. Mentoring has been a huge part of my career in terms of people mentoring me, and if not for the support and the real acceptance of the differences that I brought to the workplace, I wouldn't be sitting here today. So I think I might have more advice for the men than the women in the room. You all have daughters, you have sisters, you have mothers, and you have women that you work with every day. Whether you know it or not, there is an unconscious bias out there. So when you hear things from your sons or from your daughters — she's loud, she's a little odd, she's unique — how about saying: how wonderful is that? Let's celebrate that, from the little ones all the way to the top. That would be my advice. >> I fully endorse that; all of us men need to hear it. We have put everyone at VMware through unconscious bias training, but it's not enough — we've got to keep doing it, because it's something we have to see through. I want my daughter to be in a place where the tech world looks like society, which is not 25 or 30 percent but more like 50 percent. Thank you for being a role model, and thank you to both of you for being here at our conference. >> It's my pleasure. >> Thank you very much, Maria and John. So remember some of these things that I shared with you. I've got a couple of shirts here with that wonderful little chart on them, and I'm not going to throw them to the VMware crowd — raise your hand if you're a customer. Okay, good. Let's see how good my arm is. There we go. There's a couple more here, and hopefully this will give you a sense of what we are trying to get done in the hybrid cloud. Let's see — that one goes there; make sure it doesn't hit anybody. Anybody here in the middle? There we go. Boom. I've got two more. Anybody here? I decided not to bring an air gun in — that one fell flat, sorry. There we go, one more. Thank you very much. But this is what we're trying to get done, and that diagram, once again: the bottom part is the cloud foundation, very simply, and the top part of the diagram is the digital workspace. I'd love a world one day where the digital workspace plus the cloud foundation equals VMware as your digital foundation company. That's what we're trying to get done, and it ties absolutely synchronously to what you heard from Pat, because everything starts with that any-app perspective of things; below it are these four types of clouds — the hybrid cloud, the telco cloud, the private cloud and the public cloud — and of course on top of it is any device. I hope this not only inspired you in terms of picking up a few nuggets from our pioneers of the possible, but that every one of the 25,000 of you here, and the 100,000 of you who are watching — and we'll take the show on the road, where there will probably be another 100,000 people we meet at the VMworld and vForum events — we want every one of you to be a pioneer. It is absolutely possible for that to happen, because that pioneering capability starts with every one of you. Can we give a hand once again for the five customers that were on stage with us? That's great.

Published Date : Aug 28 2018



Gabe Chapman & Nancy Hart, NetApp | VMworld 2018


 

>> Live from Las Vegas, it's theCUBE covering VMworld 2018. Brought to you by VMware and its ecosystem partners. >> Welcome to theCUBE. I'm Lisa Martin with Justin Warren on day one of VMworld 2018. This is the twentieth anniversary of VMware. Lots of momentum this morning kicking things off. Justin and I are happy to be joined by some folks from NetApp. We have Nancy Hart, the Head of Marketing for Cloud Infrastructure. >> Good afternoon. >> Welcome to theCUBE. >> Thank you Julie, it's so great to be here. And an alumni, Gabe Chapman. I love your Twitter handle @bacon_is_king. Senior Manager of NetApp HCI. Hey, Gabe. >> Hi, how are you doing? >> Good. Guys, lots of momentum. Pat Gelsinger was probably one of my favorite keynotes cause he's really energetic. He even went full-in with his faux tap this morning. I was impressed. >> Impressive. >> You guys have some news. >> Yes. >> Tell us about what's new with NetApp and VMware today. >> Fantastic, exciting times at NetApp these days. NetApp is really focused on becoming the data authority for hybrid cloud. Part of that is what we're excited to announce today here at VMworld, is a NetApp-verified architecture for VMware private cloud for HCI. What you heard today in Pat's keynote was a lot about connection on-premises private clouds with hyperscalers public clouds. That's what we're doing in our partnership with VMware here and this validated architecture for private clouds. Exciting news for us. In addition, we're also really be thrilled to be announcing new storage nodes for our NetApp HCI product and SolidFire product, as well. Lots going on today. >> Wow, that's really cool. >> Gabe, you've been in the field a lot. What are some of the things that you're hearing? Some of the signage around here is about VMware's making things possible, making momentum possible. What are some of the things that you're seeing in the field in terms of customer's momentum? Leveraging HCI from NetApp to drive new business models, new revenue streams. >> I think one of the things I see commonly is that the hyperconverge as a platform has been around for about six, seven years now. Customers are seeing that some of the first generational approaches have got them to a certain level in terms of addressing simplicity and kind of that turnkey infrastructure stack, but where they would like to go next is more cloud integrated, more scalability, more enterprise class or enterprise scale technology. Therefore, they're kind of looking at the NetApp HCI product and the architecture that we've brought to market, and seeing the potential to not only do things on-premise that they'd normally do in terms of a infrastructure platform but also move in to new services. How do we integrate with existing investments that they've had? How do we become connected into the hybrid cloud model with the hyperscalers themselves and really push towards a all-encompassing cloud infrastructure platform other than just a box. >> Yeah, one of the things I noticed in the keynote today that, I think, relates to that, and I'm interested to hear, Nancy and Gabe, a little bit more about what customers are doing here, because it seems that the idea of it must be all cloud or all on-site, that's gone away now. It's very much hybrid cloud world, multi cloud world, where customers have choice. Are you hearing that from customers? Clearly, there seems to be some demand here because we've seen the change in messaging. 
>> Absolutely, and I think what you're seeing is customers want the option to take advantage of all the resources. Regardless if those resources are on-premises or in public clouds, and that's what we're doing here at NetApp with our own HCI solution. As the market evolves under our feet, Gabe talked about those first generation vendors weren't quite enough, that our customers are choosing NetApp cause they want more then what they can get from those first generation vendors. What you really want to see is that convergence continues to march on and that there is more to collapse into this stack, particularly that connection up into the public cloud. Customers are definitely looking today, they're making buying decisions today based on that option. >> Right, and clearly, there's lots of customers who have substantial investment already in NetApp so being able to use what've you already got and extend it with a vendor that you're already familiar with and you know how it works. There's a lot of value there. >> We're a trusted vendor. NetApp is a trusted enterprise vendor with the reliability and customers can come to us with confidence and choose NetApp with confidence. >> We were with you guys at SAP just a couple months ago at the beginning of the summer and #datadriven was everywhere, I'm seeing it in Twitter. We often hear many things about data is power, data is currency, data is fuel. Data is all of those things if it can be harnessed and acted upon in real time. How does NetApp HCI, what are some of the differentiators? Obviously, we talked about the trusted partnership, but how does NetApp help customers actually live a data-driven life within their organization? >> I think a lot of times it starts with understanding where your data lives. How you manage it, manipulate it, and secure it. We have things like GPDR that comes (mumbles). All the sudden, everybody's scrambling to come up with a solution or a reference architecture or some way that integrates with it. I think, naturally, NetApp being the product technology company that it's been and it's lived and breathed data all its life. We understand our customer's unique requirements around governance, around security, around mobility, and we've built technologies that don't lock you into any one mode of consumption. If you bought a filer, if you bought an HCI system, if you bought an object store platform, the data fabric piece is the glue that binds and allows data mobility and portability across multiple platforms. Not only from the edge to the core, but also to the cloud and kind of gives you that larger, bigger picture. We believe that as we start to see this transition, especially edge computing, especially as we look at things like NVMe over fabrics and getting in to new levels and also services that we are delivering across the hyperscalers. A cohesive picture and story around where your data lives, how you manage it, and who can access it is empowering customers to make their transition into the multi cloud space. >> Right, clearly that transition, I think, is what people weren't really understanding three or four years ago. It was like enterprises aren't going to be there in one spot. You can't just turn it on in five seconds, these things take time. >> (mumbles) flipped, yeah. >> With our data fabric we're able to cover the entire NetApp portfolio from edge to core to cloud. 
As you say, enterprises and different departments in those enterprises will make their own transition and go down their own journey of digital transformation in their own time. NetApp can really be that trusted partner for all these enterprises. >> With so much choice comes, I think, inherently a lot of complexity. I thought they did a great job this morning in the keynote, Pat Gelsinger and team, of really talking about their announcements, what VMware has done in their history pretty clearly. I can imagine from a customer's perspective, if it's an enterprise organization who doesn't want to get Uber-ized, they probably don't know where to start. Talk to us about sort of the business-level conversations that NetApp has with not just your existing customers who know they can come to NetApp to trust you but also some of those maybe newer businesses or newer enterprise businesses to NetApp. How can they come to you and say help us understand? We probably have, what did they say this morning? The average customer's eight clouds. How do you help them to sort of digest that, embrace it, and be able to maximize it so that their data can be available as soon as they need it? >> What it is is data's at the heart of the enterprise and how people help customers change their world with data, but there has to be a direct business outcome for that. When enterprise customers learn to mine the value of their data they can really build new revenue streams, they can create new touchpoints with their own customers to drive their businesses. For example, one of our early NetApp HCI customers was down in Australia. A company called Consatel, a service provider down in Australia. They were really struggling to set up new businesses and new services to their own customer base. When the conversation, when they worked with NetApp what they were able to do was deploy new services three times faster over their last vendor. Think about what that did for their top line. If this company Consatel could deploy new services, new revenue opportunities three times faster. >> Blowing their competitors out of the water. >> Blowing their competitors out of the water. That's a business-level conversation. This is not a conversation about technology. Yes, under the covers, there's some amazing, fantastic technology, but it has to serve the business. Consatel has now been so successful with NetApp HCI that they now are expanding into brand new geo and geo regions and bringing new services to a whole new set of customers and a whole new customer base working with us. >> That's what I'm hearing in the conversations that I have with customers. I'm interested to hear from yourself and Gabe as to whether you're hearing this across the board. You've got one example here of customers who are concerned more with additional revenues. New revenue streams, new ways of making money top line and not so much about cost savings. That was something that was being, we were concentrating on that maybe three or four years ago. That seems to have been de-emphasized now and people are much more interested in seeking out new ways to use things. New sources of revenue and focusing on top line. Is that something that you're seeing across the board or is that only leading edge companies that are looking at that? >> We see it across the board, I think, with a lot of customers across many different verticals. For instance, Children's Mercy Hospital bought our NetApp HCI product for a virtual desktop implementation and they did so for a lot of reasons. 
One of them being the traditional TCO/ROI discussion. But also allows them to provide a platform that isn't just a silo of resources because of the unique aspects and differentiation that we have on our platform. We're able to go and do mixed workloads and do consolidation so they're realizing savings and gains across collapsing silos, bringing multiple applications on the same, common infrastructure. The same way they would've gone and swiped their credit card at Amazon. When you do that, you don't care if you're putting a SQL database, an Oracle or what not. They're going to give you the resource that you need. We want to mimic that locally on-prem for customers. Then, also have that integration with cloud services. If we're building a cloud service that runs on Amazon or Google, or if we're integrating with VMware as it runs on AWS or whatever, we want to be able to extend those services from local on-premises environments into the cloud and back based on that. I think that's really where the value is. There's no turnkey public cloud in a hybrid cloud integration piece. It's a journey and you have to analyze all the applications and the way you've done business. NetApp, having been working in the enterprise space as a trusted advisor for such a long time, we understand the customer's needs. We've been in the cloud space for a number of years already and we kind of understand that space. We're bridging the gap at the data level and helping to expand that more at the infrastructure level as well and as we branch into new services as time goes on. >> You've got that challenge of every customer being different but there's also trends that are common across the industry and NetApp being the size and having the history that it does, you've seen all of these things before and you know that yes, this is unique to you as a customer, but also we've seen this in other customers. This would be of value to you and you can bring that to those customers. >> Not only that, we have this product called Active IQ and it tends to be a service and support and monitoring application but, like you said, we have a very large customer base and using features and functionalities in AI we're able to use the data that we get from Active IQ as a community wisdom in effect and then make suggestions to those users as well. NetApp does have a very large install base. What can we learn from that install base, how can we help existing customers run their operations better with that community wisdom? >> We've always referred to it as actionable intelligence for your data. We've all played Tetris as a kid, it's playing Tetris with your data, Tetris with your workloads, and making sure that they all line up so that you get all four blocks break at the same time and get the high score. It's really taking and really, truly mining your infrastructure, mining your workloads and your information, and making sure that you're getting the most effective resource utilization that you possibly can. Across not just virtual machine workloads but also data workloads and understanding what you have on the floor versus what you need six months from now to one year from now. That Active IQ platform is really an integral part to really understanding customer's data resource utilization, etc. >> As someone who has played storage Tetris, any help that you can do is very, very welcome. >> I got to bring that back. That's the second reference I've heard to that in the last couple days. 
One of the things that Pat Gelsinger and team talked about this morning during the general session was superpowers and the need to enable enterprises to be able to harness their superpowers and maximize AI, machine learning, IoT, the edge. How was NetApp and VMware uniquely positioned to help your customers be able to take that actionable intelligence, Gabe, that you mentioned on that data to drive the new business models and revenue streams? >> I think our superpower would be, information is power, so that's our superpower is being data-driven and understanding how we take the customer's data, leverage it to its most effective use, and allocate it and protect it properly. There's a whole bunch of different areas around what we're doing there. Ours would be understanding data, understanding how customers want to use it, and what kind of information they want to extract from it. I'll have to come up with a fancy term for, maybe data thrivers is my superpower. That could be definitely one part of it. >> You could make a logo out of that. >> That sounds pretty good. >> The Thriver. >> The Thriver, I like it. >> We're data thrivers. >> I like it. >> I think so. >> NetApp has been a partner of VMware's for a very long time. You have a large ecosystem of partners, as well. What you guys announced today, talk to us about some of the benefits or really the opportunities that's going to give to NetApp's channel partners. >> There's a lot of opportunity here for our channel partners. As our customers take this journey, they're going to turn to their trusted advisors, their partners, to help them take that journey as well. What we've done here with what we announced with the VMware private cloud for HCI, this is a significant opportunity for our channel partners to work with their customers and take them down that path to be that data thriver. To harness that superpower. New opportunities for all. Customers need someone to help them show the way and channel partners are really the community to do that. >> For those channel partners who are keen to go and do this, how should they engage with you? How should they start talking to NetApp about helping their customers to go down this journey? >> Honestly, we're making the announcement this week. That's the first step is come by our booths. >> It's a thing, yeah. >> If they're here, obviously. We have a very large channel organization. We have outreach, we'll have training, we'll have, the path to hybrid clouds starts with turnKey private cloud and that's kind of what we've done here. We're working on that turnkey private cloud with our partner VMware and NetApp together to kind of facilitate that first step. Then we go out and work with our channel partner organizations to find the customers that want to go down that path. Then they can bring their additional add-on to it. There's a lot of opportunity to go out and really push and help customers make this transition between the two different worlds and obviously we can go to netapp.com and come and take a look. We have plenty of information there, too. >> Just as we wrap up here, I'm curious, Nancy, to get your perspective, from a cloud infrastructure perspective or vision, the announcements that VMware made today. Big news with AWS. Launched that last year. Talked about a lot of expansion going to apache. A lot of work in Australia. >> Yep. >> What does that as well as some other product enhancements they announced today, what does that mean to NetApp? 
>> I think for NetApp and for our customers, cause really let's stay focused on NetApp's customers, some of the announcements you saw Pat make today provides new options, new opportunities for NetApp's customers globally. As there's these new features, new functionalities to that turnkey solution for private cloud, what you saw is VMware expanding that relationship with AWS just gives new options and new opportunities. >> Hopefully, people can go and maybe by tomorrow get a data thriver pin or sticker. >> Going to have to run out to Kinko's real quick and make some stickers. >> Maybe print it on some bacon. >> Actually, I think we have pretzel necklaces in our booth to go for the beer crawl. >> Oh wow. What time is that? >> Soon, not soon enough. >> Nancy and Gabe, thanks so much for stopping by theCUBE and chatting with Justin and me. Very exciting to hear NetApp's continued transformation and what you're helping customers achieve. >> Thank you for your time. >> Thank you. >> Thank you very much. >> We want to thank you for watching theCUBE. For Justin Warren, I'm Lisa Martin. We're at VMworld, day one, stick around we'll be right back. (electronic tones)

Published Date : Aug 27 2018


Infrastructure For Big Data Workloads


 

>> From the SiliconANGLE media office in Boston, Massachusetts, it's theCUBE! Now, here's your host, Dave Vellante. >> Hi, everybody, welcome to this special CUBE Conversation. You know, big data workloads have evolved, and the infrastructure that runs big data workloads is also evolving. Big data, AI, other emerging workloads need infrastructure that can keep up. Welcome to this special CUBE Conversation with Patrick Osborne, who's the vice president and GM of big data and secondary storage at Hewlett Packard Enterprise, @patrick_osborne. Great to see you again, thanks for coming on. >> Great, love to be back here. >> As I said up front, big data's changing. It's evolving, and the infrastructure has to also evolve. What are you seeing, Patrick, and what's HPE seeing in terms of the market forces right now driving big data and analytics? >> Well, some of the things that we see in the data center, there is a continuous move to move from bare metal to virtualized. Everyone's on that train. To containerization of existing apps, your apps of record, business, mission-critical apps. But really, what a lot of folks are doing right now is adding additional services to those applications, those data sets, so, new ways to interact, new apps. A lot of those are being developed with a lot of techniques that revolve around big data and analytics. We're definitely seeing the pressure to modernize what you have on-prem today, but you know, you can't sit there and be static. You gotta provide new services around what you're doing for your customers. A lot of those are coming in the form of this Mode 2 type of application development. >> One of the things that we're seeing, everybody talks about digital transformation. It's the hot buzzword of the day. To us, digital means data first. Presumably, you're seeing that. Are organizations organizing around their data, and what does that mean for infrastructure? >> Yeah, absolutely. We see a lot of folks employing not only technology to do that. They're doing organizational techniques, so, peak teams. You know, bringing together a lot of different functions. Also, too, organizing around the data has become very different right now, that you've got data out on the edge, right? It's coming into the core. A lot of folks are moving some of their edge to the cloud, or even their core to the cloud. You gotta make a lot of decisions and be able to organize around a pretty complex set of places, physical and virtual, where your data's gonna lie. >> There's a lot of talk, too, about the data pipeline. The data pipeline used to be, you had an enterprise data warehouse, and the pipeline was, you'd go through a few people that would build some cubes and then they'd hand off a bunch of reports. The data pipeline, it's getting much more complex. You've got the edge coming in, you've got, you know, core. You've got the cloud, which can be on-prem or public cloud. Talk about the evolution of the data pipeline and what that means for infrastructure and big data workloads. >> For a lot of our customers, and we've got a pretty interesting business here at HPE. We do a lot with the Intelligent Edge, so, our Edgeline servers in Aruba, where a a lot of the data is sitting outside of the traditional data center. 
Then we have what's going on in the core, which, for a lot of customers, they are moving from either traditional EDW, right, or even Hadoop 1.0 if they started that transformation five to seven years ago, to, a lot of things are happening now in real time, or a combination thereof. The data types are pretty dynamic. Some of that is always getting processed out on the edge. Results are getting sent back to the core. We're also seeing a lot of folks move to real-time data analytics, or some people call it fast data. That sits in your core data center, so utilizing things like Kafka and Spark. A lot of the techniques for persistent storage are brand new. What it boils down to is, it's an opportunity, but it's also very complex for our customers. >> What about some of the technical trends behind what's going on with big data? I mean, you've got sprawl, with both data sprawl, you've got workload sprawl. You got developers that are dealing with a lot of complex tooling. What are you guys seeing there, in terms of the big mega-trends? >> We have, as you know, HPE has quite a few customers in the mid-range in enterprise segments. We have some customers that are very tech-forward. A lot of those customers are moving from this, you know, Hadoop 1.0, Hadoop 2.0 system to a set of essentially mixed workloads that are very multi-tenant. We see customers that have, essentially, a mix of batch-oriented workloads. Now they're introducing these streaming type of workloads to folks who are bringing in things like TensorFlow and GPGPUs, and they're trying to apply some of the techniques of AI and ML into those clusters. What we're seeing right now is that that is causing a lot of complexity, not only in the way you do your apps, but the number of applications and the number of tenants who use that data. It's getting used all day long for various different, so now what we're seeing is it's grown up. It started as an opportunity, a science project, the POC. Now it's business-critical. Becoming, now, it's very mission-critical for a lot of the services that drives. >> Am I correct that those diverse workloads used to require a bespoke set of infrastructure that was very siloed? I'm inferring that technology today will allow you to bring those workloads together on a single platform. Is that correct? >> A couple of things that we offer, and we've been helping customers to get off the complexity train, but provide them flexibility and elasticity is, a lot of the workloads that we did in the past were either very vertically-focused and integrated. One app server, networking, storage, to, you know, the beginning of the analytics phase was really around symmetrical clusters and scaling them out. Now we've got a very rich and diverse set of components and infrastructure that can essentially allow a customer to make a data lake that's very scalable. Compute, storage-oriented nodes, GPU-oriented nodes, so it's very flexible and helps us, helps the customers take complexity out of their environment. >> In thinking about, when you talk to customers, what are they struggling with, specifically as it relates to infrastructure? Again, we talked about tooling. I mean, Hadoop is well-known for the complexity of the tooling. But specifically from an infrastructure standpoint, what are the big complaints that you hear? >> A couple things that we hear is that my budget's flat for the next year or couple years, right? 
We talked earlier in the conversation about, I have to modernize, virtualize, containerizing my existing apps, that means I have to introduce new services as well with a very different type of DevOps, you know, mode of operations. That's all with the existing staff, right? That's the number one issue that we hear from the customers. Anything that we can do to help increase the velocity of deployment through automation. We hear now, frankly, the battle is for whether I'm gonna run these type of workloads on-prem versus off-prem. We have a set of technology as well as services, enabling services with Pointnext. You remember the acquisition we made around cloud technology partners to right-place where those workloads are gonna go and become like a broker in that conversation and assist customers to make that transition and then, ultimately, give them an elastic platform that's gonna scale for the diverse set of workloads that's well-known, sized, easy to deploy. >> As you get all this data, and the data's, you know, Hadoop, it sorta blew up the data model. Said, "Okay, we'll leave the data where it is, "we'll bring the compute there." You had a lot of skunk works projects growing. What about governance, security, compliance? As you have data sprawl, how are customers handling that challenge? Is it a challenge? >> Yeah, it certainly is a challenge. I mean, we've gone through it just recently with, you know, GDPR is implemented. You gotta think about how that's gonna fit into your workflow, and certainly security. The big thing that we see, certainly, is around if the data's residing outside of your traditional data center, that's a big issue. For us, when we have Edgeline servers, certainly a lot of things are coming in over wireless, there's a big buildout in advent of 5G coming out. That certainly is an area that customers are very concerned about in terms of who has their data, who has access to it, how can you tag it, how can you make sure it's secure. That's a big part of what we're trying to provide here at HPE. >> What specifically is HPE doing to address these problems? Products, services, partnerships, maybe you could talk about that a little bit. Maybe even start with, you know, what's your philosophy on infrastructure for big data and AI workloads? >> I mean, for us, we've over the last two years have really concentrated on essentially two areas. We have the Intelligent Edge, which is, certainly, it's been enabled by fantastic growth with our Aruba products in the networks in space and our Edgeline systems, so, being able to take that type of compute and get it as far out to the edge as possible. The other piece of it is around making hybrid IT simple, right? In that area, we wanna provide a very flexible, yet easy-to-deploy set of infrastructure for big data and AI workloads. We have this concept of the Elastic Platform for Analytics. It helps customers deploy that for a whole myriad of requirements. Very compute-oriented, storage-oriented, GPUs, cold and warm data lakes, for that matter. And the third area, what we've really focused on is the ecosystem that we bring to our customers as a portfolio company is evolving rapidly. As you know, in this big data and analytics workload space, the software development portion of it is super dynamic. If we can bring a vetted, well-known ecosystem to our customers as part of a solution with advisory services, that's definitely one of the key pieces that our customers love to come to HP for. 
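Earlier in the conversation Patrick points to real-time or fast data in the core built on Kafka and Spark. Purely as a hedged sketch of what that kind of streaming path can look like, and not HPE's actual pipeline, here is a minimal Spark Structured Streaming job that reads edge telemetry from a Kafka topic and keeps one-minute rolling aggregates; the broker address, topic name, and schema are invented for the example.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("edge-telemetry-fastdata").getOrCreate()

# Telemetry records shipped from edge gateways into a Kafka topic
# (all names below are illustrative, not an HPE product interface).
schema = StructType([
    StructField("device_id", StringType()),
    StructField("metric", StringType()),
    StructField("value", DoubleType()),
    StructField("ts", TimestampType()),
])

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker address
       .option("subscribe", "edge-telemetry")               # placeholder topic name
       .load())

events = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(F.from_json("json", schema).alias("e"))
          .select("e.*"))

# One-minute rolling aggregates per device and metric: the "fast data" view
# that sits alongside the batch data lake.
agg = (events
       .withWatermark("ts", "5 minutes")
       .groupBy(F.window("ts", "1 minute"), "device_id", "metric")
       .agg(F.avg("value").alias("avg_value"), F.max("value").alias("max_value")))

query = (agg.writeStream
         .outputMode("update")
         .format("console")   # in practice this would land in a data lake or serving store
         .start())
query.awaitTermination()
```

The same aggregates could just as well land on the compute-, storage-, or GPU-oriented nodes he describes; the console sink here is only to keep the sketch self-contained.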
>> What about partnerships around things like containers and simplifying the developer experience? >> I mean, we've been pretty public about some of our efforts in this area around OneSphere, and some of these, the models around, certainly, advisory services in this area with some recent acquisitions. For us, it's all about automation, and then we wanna be able to provide that experience to the customers, whether they want to develop those apps and deploy on-prem. You know, we love that. I think you guys tag it as true private cloud. But we know that the reality is, most people are embracing very quickly a hybrid cloud model. Given the ability to take those apps, develop them, put them on-prem, run them off-prem is pretty key for OneSphere. >> I remember Antonio Neri, when you guys announced Apollo, and you had the astronaut there. Antonio was just a lowly GM and VP at the time, and now he's, of course, CEO. Who knows what's in the future? But Apollo, generally at the time, it was like, okay, this is a high-performance computing system. We've talked about those worlds, HPC and big data coming together. Where does a system like Apollo fit in this world of big data workloads? >> Yeah, so we have a very wide product line for Apollo that helps, you know, some of them are very tailored to specific workloads. If you take a look at the way that people are deploying these infrastructures now, multi-tenant with many different workloads. We allow for some compute-focused systems, like the Apollo 2000. We have very balanced systems, the Apollo 4200, that allow a very good mix of CPU, memory, and now customers are certainly moving to flash and storage-class memory for these type of workloads. And then, the Apollo 6500 are some of the newer systems that we have. Big memory footprint, NVIDIA GPUs allowing you to do very high calculation rates for AI and ML workloads. We take that and we aggregate that together. We've made some recent acquisitions, like Plexxi, for example. A big part of this is around simplification of the networking experience. You can probably see into the future of automation of the networking level, automation of the compute and storage level, and then having a very large and scalable data lake for customers' data repositories. Object, file, HDFS, some pretty interesting trends in that space. >> Yeah, I'm actually really super excited about the Plexxi acquisition. I think it's because flash, it used to be the bottleneck was the spinning disk, flash pushes the bottleneck largely to the network. Plexxi's gonna allow you guys to scale, and I think actually leapfrog some of the other hyperconverged players that are out there. So, super excited to see what you guys do with that acquisition. It sounds like your focus is on optimizing the design for I/O. I'm sure flash fits in there as well.
What we've done is, we have, it's a product/a concept, and what we do is we have this, it's called the Elastic Platform for Analytics. It allows, with all those different components that I rattled off, all great systems in of their own, but when it comes to very complex multi-tenant workloads, what we do is try to take the mystery out of that for our customers, to be able to deploy that cookie-cutter module. We're even gonna get to a place pretty soon where we're able to offer that as a consumption-based service so you don't have to choose for an elastic type of acquisition experience between on-prem and off-prem. We're gonna provide that as well. It's not only a set of products. It's reference architectures. We do a lot of sizing with our partners. The Hortonworks, CloudEra's, MapR's, and a lot of the things that are out in the open source world. It's pretty good. >> We've been covering big data, as you know, for a long, long time. The early days of big data was like, "Oh, this is great, "we're just gonna put white boxes out there "and off the shelf storage!" Well, that changed as big data got, workloads became more enterprise, mainstream, they needed to be enterprise-ready. But my question to you is, okay, I hear you. You got products, you got services, you got perspectives, a philosophy. Obviously, you wanna sell some stuff. What has HPE done internally with regard to big data? How have you transformed your own business? >> For us, we wanna provide a really rich experience, not just products. To do that, you need to provide a set of services and automation, and what we've done is, with products and solutions like InfoSight, we've been able to, we call it AI for the Data Center, or certainly, the tagline of predictive analytics is something that Nimble's brought to the table for a long time. To provide that level of services, InfoSight, predictive analytics, AI for the Data Center, we're running our own big data infrastructure. It started a number of years ago even on our 3PAR platforms and other products, where we had scale-up databases. We moved and transitioned to batch-oriented Hadoop. Now we're fully embedded with real-time streaming analytics that come in every day, all day long, from our customers and telemetry. We're using AI and ML techniques to not only improve on what we've done that's certainly automating for the support experience, and making it easy to manage the platforms, but now introducing things like learning, automation engines, the recommendation engines for various things for our customers to take, essentially, the hands-on approach of managing the products and automate it and put into the products. So, for us, we've gone through a multi-phase, multi-year transition that's brought in things like Kafka and Spark and Elasticsearch. We're using all these techniques in our system to provide new services for our customers as well. >> Okay, great. You're practitioners, you got some street cred. >> Absolutely. >> Can I come back on InfoSight for a minute? It came through an acquisition of Nimble. It seems to us that you're a little bit ahead, and maybe you say a lot a bit ahead of the competition with regard to that capability. How do you see it? Where do you see InfoSight being applied across the portfolio, and how much of a lead do you think you have on competitors? >> I'm paranoid, so I don't think we ever have a good enough lead, right? You always gotta stay grinding on that front. But we think we have a really good product. You know, it speaks for itself. 
A lot of the customers love it. We've applied it to 3PAR, for example, so we came out with some, we have VMVision for a 3PAR that's based on InfoSight. We've got some things in the works for other product lines that are imminent pretty soon. You can think about what we've done for Nimble and 3PAR, we can apply similar type of logic to Elastic Platform for Analytics, like running at that type of cluster scale to automate a number of items that are pretty pedantic for the customers to manage. There's a lot of work going on within HPE to scale that as a service that we provide with most of our products. >> Okay, so where can I get more information on your big data offerings and what you guys are doing in that space? >> Yeah, so, we have, you can always go to hp.com/bigdata. We've got some really great information out there. We're in our run-up to our big end user event that we do every June in Las Vegas. It's HPE Discover. We have about 15,000 of our customers and trusted partners there, and we'll be doing a number of talks. I'm doing some work there with a British telecom. We'll give some great talks. Those'll be available online virtually, so you'll hear about not only what we're doing with our own InfoSight and big data services, but how other customers like BTE and 21st Century Fox and other folks are applying some of these techniques and making a big difference for their business as well. >> That's June 19th to the 21st. It's at the Sands Convention Center in between the Palazzo and the Venetian, so it's a good conference. Definitely check that out live if you can, or if not, you can all watch online. Excellent, Patrick, thanks so much for coming on and sharing with us this big data evolution. We'll be watching. >> Yeah, absolutely. >> And thank you for watching, everybody. We'll see you next time. This is Dave Vellante for theCUBE. (fast techno music)
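Patrick describes InfoSight-style predictive analytics running against streaming customer telemetry. As a hedged illustration of the simplest version of that idea, and not InfoSight's actual models, the sketch below fits a linear trend to daily used-capacity samples from a volume and estimates when it will fill; the numbers are synthetic.

```python
import numpy as np

def days_until_full(used_gb_history, capacity_gb, horizon_days=90):
    """Fit a linear trend to daily used-capacity samples and estimate when the
    volume hits its ceiling. Illustrative only, not any vendor's model."""
    days = np.arange(len(used_gb_history), dtype=float)
    slope, intercept = np.polyfit(days, np.asarray(used_gb_history, dtype=float), 1)
    if slope <= 0:
        return None  # usage flat or shrinking; no exhaustion forecast in this window
    crossing = (capacity_gb - intercept) / slope           # day index where the trend hits capacity
    remaining = crossing - (len(used_gb_history) - 1)
    return remaining if remaining <= horizon_days else None

# Example: 14 days of telemetry from one volume (synthetic numbers).
history = [610, 618, 629, 640, 652, 660, 671, 684, 690, 702, 715, 723, 736, 748]
eta = days_until_full(history, capacity_gb=1024)
if eta is not None:
    print(f"Volume projected to fill in ~{eta:.0f} days, open a proactive case.")
```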

Published Date : Jun 12 2018


Arun Garg, NetApp | Cisco Live 2018


 

>> Live from Orlando, Florida it's theCUBE covering Cisco Live 2018. Brought to you by Cisco, NetApp and theCUBE's ecosystem partners. >> Hey, welcome back everyone. This is theCUBE's coverage here in Orlando, Florida at Cisco Live 2018. Our first year here at Cisco Live. We were in Barcelona this past year. Again, Cisco transforming to a next generation set of networking capabilities while maintaining all the existing networks and all the security. I'm John Furrier your host with Stu Miniman my co-host for the next three days. Our next guest is Arun Garg. Welcome to theCUBE. You are the Director of Product Management Converged Infrastructure Group at NetApp. >> Correct, thank you very much for having me on your show and it's a pleasure to meet with you. >> One of the things that we've been covering a lot lately is the NetApp's really rise in the cloud. I mean NetApp's been doing a lot of work on the cloud. I mean I've wrote stories back when Tom Georges was the CEO when Amazon just came on the scene. NetApp has been really into the cloud and from the customer's standpoint but now with storage and elastic resources and server lists, the customers are now startin' to be mindful. >> Absolutely. >> Of how to maximize the scale and with All Flash kind of a perfect storm. What are you guys up to? What's your core thing that you guys are talking about here at Cisco Live? >> So absolutely, thank you. So George Kurian, our CEO at NetApp, is very much in taking us to the next generation and the cloud. Within that I take care of some of the expansion plans we have on FlexPod with Cisco and in that we have got two new things that we are announcing right now. One is the FlexPod for Healthcare which is in FlexPod we've been doing horizontal application so far which are like the data bases, tier one database, as well as applications from Microsoft and virtual desktops. Now we are going vertical. Within the vertical our application, the first one we're looking in the vertical is healthcare. And so it's FlexPod for Healthcare. That's the first piece that we are addressing. >> What's the big thing with update on FlexPod? Obviously FlexPod's been very successful. What's the modernization aspect of it because Cisco's CEO was onstage today talking about Cisco's value proposition, about the old ways now transitioning to a new network architecture in the modern era. What's the update on FlexPod? Take a minute to explain what are the cool, new things going on with FlexPod. >> Correct, so the All Flash FAS, which is the underlying technology, which is driving the FlexPod, has really picked up over the last year as customers keep wanting to improve their infrastructure with better latencies and better performance the All Flash FAS has driven even the FlexPod into the next generation. So that's the place where we are seeing double-digit growth over the last five quarters consistently in FlexPod. So that's a very important development for us. We've also done more of the standard CVDs that we do on SAP and a few other are coming out. So those are all out there. Now we are going to make sure that all these assets can be consumed by the vertical industry in healthcare. And there's another solution we'll talk about, the managed private cloud on FlexPod. >> Yeah, Arun, I'd love to talk about the private cloud. So I think back to when Cisco launched UCS it was the storage partners that really helped drive that modernization for virtualization. NetApp with FlexPod, very successful over the years doing that. 
As we know, virtualization isn't enough to really be a private cloud. All the things that Chuck Robbins is talking about onstage, how do I modernize, how do I get you know, automation in there? So help us connect the dots as to how we got from you know, a good virtualized platform to this is, I think you said managed private cloud, FlexPod in Cisco. >> Absolutely. So everybody likes to consume a cloud. It's easy to consume a cloud. You go and you click on I need a VM, small, medium, large, and I just want to see a dashboard with how my VMs are doing. But in reality it's more difficult to just build your own cloud. There's complexity associated with it. You need a service platform where you can give a ticket, then you need an orchestration platform where you can set up the infrastructure, then you need a monitoring platform which will show you all of the ways your infrastructure's working. You need a capacity planning tool. There's tens of tools that need to be integrated. So what we have done is we have partnered with some of the premium partners and some DSIs who have already built this. So the risk of a customer using their private cloud infrastructure is minimized and therefore these partners also have a managed service. So when you combine the fact that you have a private cloud infrastructure in the software domain as well as a managed service and you put it on the on-prem FlexPod that are already sold then the customer benefits from having the best of both worlds, a cloud-like experience on their own premise. And that is what we are delivering with this FlexPod managed private cloud solution. >> Talk about the relationship with Cisco. So we're here at Cisco Live you guys have a good relationship with Cisco. What should customers understand about the relationship? What are the top bullet points and value opportunities and what does it mean to the impact for the customer? >> So we, all these solutions we work very closely with the Cisco business unit and we jointly develop these solutions. So within that what we do is there's the BU to BU interaction where the solution is developed and defined. There is a marketing to marketing interaction where the collateral gets created and reviewed by both parties. So you will not put a FlexPod brand unless the two companies agree. >> So it's tightly integrated. >> It's tightly integrated. The sales teams are aligned, the marketing, the communications team, the channel partner team. That's the whole value that the end customer gets because when a partner goes to a high-end enterprise customer he knows that both Cisco and NetApp teams can be brought to the table for the customer to showcase the value as well as help them through it all. >> Yeah, over in one of the other areas that's been talked about this show we talk about modernization. You talk about things like microservices. >> Yes. >> Containers are pretty important. How does that story of containerization fit into FlexPod? >> Absolutely. So containerization helps you get workloads, the cloud-native workloads, or the Type 2 workloads as Gartner calls them. So our Mode 2. What we do is we work with the Cisco teams and we already had a CVD design with a hybrid cloud with the Cisco CloudCenter platform, which is the CliQr acquisition. And we showed a design with that. What we are now bringing to the table is the ability for our customers to benefit with a managed service on top of it. So that's the piece we are dealing with the cloud teams.
With the Cisco team the ACI fabric is very important to them. So that ACI fabric is visible and shown in our designs whether you do SAP, you do Oracle, you do VDI and you do basic infrastructure or you do the managed private cloud or FlexPod on Healthcare. All of these have the core networking technologies from Cisco, as well as the cloud technologies from Cisco in a form factor or in a manner that easily consumable by our customers. >> Arun, talk about the customer use cases. So say you've got a customer, obviously you guys have a lot of customers together with Cisco, they're doing some complex things with the technology, but for the customer out there that has not yet kinda went down the NetApp Cisco route, what do they do? 'Cause a lot of storage guys are lookin' at All Flash, so check, you guys have that. They want great performance, check. But then they gotta integrate. So what do you say to the folks watching that aren't yet customers about what they should look at and evaluate vis-a-vis your opportunity with them and say the competition? >> So yes, there are customers who are doing all this as separate silos, but the advantage of taking a converged infrastructure approach is that you benefit from the years of man experience or person experience that we have put behind in our labs to architect this, make sure that everything is working correctly and therefore is reduces their deployment time and reduces the risk. And if you want to be agile and faster even in the traditional infrastructure, while you're being asked to go to the cloud you can do it with our FlexPod design guides. If you want the cloud-like experience then you can do it with a managed private cloud solution on your premise. >> So they got options and they got flexibility on migrating to the cloud or architecting that. >> Yes. >> Okay, great, now I'm gonna ask you another question. This comes up a lot on theCUBE and certainly we see it in the industry. One of the trends is verticalization. >> Yes. >> So verticalization is not a new thing. Vertical industry, people go to market that way, they build products that are custom to verticals. But with cloud one of the benefits of cloud and kind of a cloud operations is you have a horizontally scalable capability. So how do you guys look at that, because these verticals, they gotta get closer to the front lines and have apps that are customized. I mean data that's fastly delivered to the app. How should verticals think about architecting storage to maintain the scale of horizontally scalable but yet provide customization into the applications that might be unique to the vertical? >> Okay, so let me give a trend first and then I'll get to the specific. So in the vertical industry, the next trend is industry clouds. For example, you have healthcare clouds and you'll have clouds to specific industries. And the reason is because these industries have to keep their data on-prem. So the data gravity plays a lot of impact in all of these decisions. And the security of their data. So that is getting into industry-specific clouds. The second pieces are analytics. So customers now are finding that data is valuable and the insight you can get from the data are actually more valuable. So what they want is the data on their premise, they want the ability all in their control so to say, they want the ability to not only run their production applications but also the ability to run analytics on top of that. 
In the specific example for health care what it does is when you have All Flash FAS it provides you a faster response for the patient because the physician is able to get the diagnostics done better if he has some kind of analytics helping him. >> Yeah. >> Plus the first piece I talked about, the rapid deployment is very important because you want to get your infrastructure set up so I can give an example on that too. >> Well before we get to the example, this is an important point because I think this is really the big megatrend. It's not really kinda talked much about but it's pretty happening is that what you just pointed out was it's not just about speeds and feeds and IOPs, the performance criteria to the industry cloud has other new things like data, the role of data, what they're using for the application. >> Correct. >> So it's just you've gotta have table stakes of great, fast storage. >> Yes. >> But it's gotta be integrated into what is becoming a use case for the verticals. Did I get that right? >> Yes, absolutely. So I'll give two examples. One I can name the customer. So they'll come at our booth tomorrow, in a minute here. So LCMC Health, part of UMC, and they have the UMC Medical Center. So when New Orleans had this Katrina disaster in Louisiana, so they came up with they need a hospital, fast. And they decided on FlexPod because within three months with the wire-once architecture and application they could scale their whole IT data center for health care. So that has helped them tremendously to get it up and running. Second is with the All Flash FAS they're able to provide faster response to their customer. So that's a typical example that we see in these kind of industries. >> Arun, thanks for coming on theCUBE. We really appreciate it. You guys are doing a great job. In following NetApp's recent success lately, as always, NetApp's always goin' the next level. Quick question for you to end the segment. What's your take on Cisco Live this year? What's some of the vibe of the show? So I know it's day one, there's a lot more to come and you're just getting a sense of it. What's the vibe? What's coming out of the show this year? What's the big ah-ha? >> So I attended the keynote today and it was very interesting because Cisco has taken networking to the next level with intent-based networking, its data and analytics where you can put a subscription model on all the pieces of the infrastructure networking. And that's exactly the same thing which NetApp is doing, where we are going up in the cloud with this subscription base. And when you add the two subscription bases then for us, at least in the managed private cloud solution we can provide the subscription base through the managed private cloud through our managed service provider. So knowing where the industry was going, knowing where Cisco was going and knowing where we want to go, we have come up with this solution which matches both these trends of Cisco as well as NetApp. >> And the number of connected devices going up every day. >> Yes. >> More network connections, more geo domains, it's complicated. >> It is complicated, but if you do it correctly we can help you find a way through it. >> Arun, thank you for coming on theCUBE. I'm John Furrier here on theCUBE with Stu Miniman here with NetApp at Cisco Live 2018. Back with more live coverage after this short break. (upbeat music)
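Arun's description of the managed private cloud boils down to stitching a ticketing front end, an orchestration layer, and monitoring into one self-service request flow. The sketch below is only an illustration of that flow with stand-in functions; none of it is NetApp's, Cisco's, or a partner's actual API, and the sizes and names are invented.

```python
# A deliberately simplified sketch of the "cloud-like experience on-prem" flow:
# a service request passes through ticketing, orchestration, and monitoring.
# Every function here is a stub standing in for a real tool in the stack.

T_SHIRT_SIZES = {"small": (2, 8), "medium": (4, 16), "large": (8, 32)}  # (vCPU, GB RAM)

def open_ticket(user, size):
    print(f"[ticketing] request from {user}: {size} VM")
    return {"id": "REQ-0001", "user": user, "size": size}

def orchestrate(ticket):
    vcpu, ram = T_SHIRT_SIZES[ticket["size"]]
    print(f"[orchestration] carving {vcpu} vCPU / {ram} GB from the converged pool")
    return {"vm": f"vm-{ticket['id'].lower()}", "vcpu": vcpu, "ram_gb": ram}

def register_monitoring(vm):
    print(f"[monitoring] watching {vm['vm']} for capacity and health")

def provision(user, size="medium"):
    ticket = open_ticket(user, size)
    vm = orchestrate(ticket)
    register_monitoring(vm)
    return vm

provision("lisa", "small")
```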

Published Date : Jun 11 2018


Dave Tokic, Algolux | Autotech Council 2018


 

>> Announcer: From Milpitas, California, at the edge of Silicon Valley, it's the Cube, covering autonomous vehicles. Brought to you by Western Digital. >> Hey, welcome back everybody, Jeff Frick here with the Cube. We're at Western Digital's office in Milpitas, California at the Autotech Council Autonomous Vehicle event. About 300 people talking about all the various problems that have to be overcome to make this thing kind of reach the vision that we all have in mind and get beyond the cute. Way more cars driving around and actually get to production fleet, so a lot of problems, a lot of opportunity, a lot of startups, and we're excited to have our next guest. He's Dave Tokic, the VP of Marketing and Strategic Partnerships from Algolux. Dave, great to see you. >> Great, thank you very much, glad to be here. >> Absolutely, so you guys are really focused on a very specific area, and that's about imaging and all the processing of imaging and the intelligence out of imaging and getting so much more out of those cameras that we see around all these autonomous vehicles. So, give us a little bit of the background. >> Absolutely, so, Algolux, we're totally focused on driving safety and autonomous vision. It's really about addressing the limitations today in imaging and computer vision systems for perceiving much more effectively and robustly the surrounding environment and the objects as well as enabling cameras to see more clearly. >> Right, and we've all seen the demo in our twitter feeds of the chihuahua and the blueberry muffin, right? This is not a simple equation, and somebody like Google and those types of companies have the benefit of everybody uploading their images, and they can run massive amounts of modeling around that. How do you guys do it in an autonomous vehicle, it's a dynamic situation, it's changing all the time, there's lots of different streets, different situations. So, what are some of the unique challenges, and how are you guys addressing those? >> Great, so, today, for both ADAS systems and autonomous driving, the companies out there are focusing on really the simpler problems of being able to properly recognize an object or an obstacle in good conditions, fair weather in Arizona, or Mountain View or Tel Aviv, et cetera. But really, we live in the real world. There's bad weather, there is low light, there's lens issues, lens dirty, and so on. Being able to address those difficult issues is not really being done well today. There's difficulties in today's system architectures to be able to do that. We take a very different, novel approach to how we process and learn through deep learning the ability to do that much more robustly and much more accurately than today's systems. >> How much of that's done kind of in the car, how much of it's done where you're building your algorithms offline and then feeding them back into the car, how does that loop kind of work? >> Great question, so the objective for this, we're deploying on, is the intent to deploy on systems that are in the car, embedded, right? We're not looking to the cloud-based system where it's going to be processed in the cloud and the latency issues and so on that are a problem. Right now, it's focused on the embedded platform in the car, and we do training of the datasets, but we take a novel approach with training as well.
We don't need as much training data because we augmented it with very specific synthetic data that understands the camera itself as well as taking in the difficult critical cases like low light and so on. >> Do you have your own dedicated camera or is it more of a software solution that you can use for lots of different types of inbound sensors? >> Yeah, what we have today is, we call it, CANA. It is a full end-to-end stack that starts from the sensor output, so say, an imaging sensor or a path to fusion like LIDAR, radar, et cetera, all the way up to the perception output that would then be used by the car to make a decision like emergency braking or turning or so on. So, we provided that full stack. >> So perception is a really interesting word to use in the context of a car, car visioning and computer vision cause it really implies a much higher level of understanding as to what's going, it really implies context, so how do you help it get beyond just identifying to starting to get perception so that you can make some decisions about actions. >> Got it, so yeah, it's all about intelligent decisions and being able to do that robustly across all types of operating conditions is paramount, it's mission critical. We've seen recent cases, Uber and Tesla and others, where they did not recognize the problem. That's where we start first with is to make sure that the information that goes up into the stack is as robust and accurate as possible and from there, it's about learning and sharing that information upstream to the control stacks of the car. >> It's weird cause we all saw the video from the Uber accident with the fatality of the gal unfortunately, and what was weird to me on that video is she came into the visible light, at least on the video we saw, very, very late. But ya got to think, right, visible light is a human eye thing, that's not a computer, that's not, ya know, there are so many other types of sensors, so when you think of vision, is it just visible light, or you guys work within that whole spectrum? >> Fantastic question, really the challenge with camera-based systems today, starting with cameras, is that the way the images are processed is meant to create a nice displayed image for you to view. There are definite limitations to that. The processing chain removes noise, removes, does deblurring, things of that nature, which removes data from that incoming image stream. We actually do perception prior to that image processing. We actually learn how to process for the particular task like seeing a pedestrian or bicyclist et cetera, and so that's from a camera perspective. It gives up quite the advantage of being able to see more that couldn't be perceived before. We're also doing the same for other sensing modalities such as LIDAR or radar and other sensing modalities. That allows us to take in different disparate sort of sensor streams and be able to learn the proper way of processing and integrating that information for higher perception accuracy using those multiple systems for sensor fusion. 
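Dave describes learning how to process and integrate camera, lidar, and radar streams for higher perception accuracy. As a deliberately simple, hedged sketch of the late-fusion idea only, and not Algolux's learned approach, the example below matches camera detections to lidar clusters by bearing angle and boosts confidence when the two modalities agree; all numbers and thresholds are invented.

```python
import numpy as np

# Toy late fusion: pair 2D camera detections with lidar clusters by comparing
# bearing angles, attach a range when they line up, and raise confidence on agreement.

camera_dets = [  # (bearing_deg, confidence, label)
    (-12.0, 0.55, "pedestrian"),
    (  8.5, 0.80, "vehicle"),
]
lidar_clusters = [  # (bearing_deg, range_m)
    (-11.4, 22.0),
    ( 30.0, 45.0),
]

def fuse(camera_dets, lidar_clusters, max_bearing_gap_deg=2.0):
    fused = []
    for bearing_c, conf, label in camera_dets:
        gaps = [abs(bearing_c - bearing_l) for bearing_l, _ in lidar_clusters]
        best = int(np.argmin(gaps)) if gaps else None
        if best is not None and gaps[best] <= max_bearing_gap_deg:
            rng = lidar_clusters[best][1]
            fused.append({"label": label, "bearing": bearing_c, "range_m": rng,
                          "confidence": min(1.0, conf + 0.3)})  # agreement boost
        else:
            fused.append({"label": label, "bearing": bearing_c, "range_m": None,
                          "confidence": conf})
    return fused

for obj in fuse(camera_dets, lidar_clusters):
    print(obj)
```

A learned fusion system would replace the hand-set threshold and confidence boost with weights trained end to end, which is closer to the approach described in the interview.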
>> Right, I want to follow up on kind of what is sensor fusion because we hear and we see all these startups with their self-driving cars running around Menlo Park and Palo Alto all the time, and some people say we've got LIDAR, LIDAR's great, LIDAR's expensive, we're trying to do it with just cameras, cameras have limitations, but at the end of the day, then there's also all this data that comes off the cars are pretty complex data receiving vehicles as well, so in pulling it all together that must give you tremendous advantages in terms of relying on one or two or a more singular-type of input system. >> Absolutely, I think cameras will be ubiquitous, right? We know that OEMs and Tier-1s are focused heavily on camera-based systems with a tremendous amount of focus on other sensing modalities such as LIDARs as an example. Being able to kit out a car in a production fashion effectively and commercially, economically, is a challenge, but that'll, with volume, will reduce over time, but doing that integration of that today is a very manually intensive process. Each sensing mode has its own way of processing information and stitching that together, integrating, fusing that together is very difficult, so taking an approach where you learn through deep learning how to do that is a way of much more quickly getting that capability into the car and also providing higher accuracy as the merged data is combined for the particular task that you're trying to do. >> But will you system, at some point, kind of check in kind of like the Teslas, they check in at night, get the download, so that you can leverage some of the offline capabilities to do more learning, better learning, aggregate from multiple sources, those types of things? >> Right, so for us, the type of data that would be most interesting is really the escapes. The things where the car did not detect something or told the driver to pay attention or take the wheel and so on. Those are the corner cases where the system failed. Being able to accumulate those particular, I'll call it, snips of information, send that back and integrate that into the overall training process will continue to improve robustness. There's definitely a deployed model that goes out that's much more robust than what we've seen in the market today, and then there's the ongoing learning to then continue to improve the accuracy and robustness of the system. >> I think people so underestimate the amount of data that these cars are collecting in terms of just the way streets operate, the way pedestrians operate, but whether there's a incident or not, they're still gathering all that data and making judgements and identifying pedestrians, identifying bicyclists and capturing what they do, so hopefully, the predictiveness will be significantly better down the road. >> That's the expectation, but like numerous studies have said, there's a lot of data that's collected that's just sort of redundant data, so it's really about those corner cases where there was a struggle by the system to actually understand what was going on. >> So, just give us kind of where you are with Algolux, state of the company, number of people, where are ya on your lifespan? >> Algolux is the startup based in Montreal with offices in Palo Alto and Munich. We have about 26 people worldwide, most of them in Montreal, very engineering heavy these days, and we will continue to do so. We have some interesting forthcoming news that please keep an eye out for of accelerating what we're doing. 
I'll just hint it that way. The intent really is to expand the team to continue to productize what we've built and start to scale out, to engage more of the automotive companies we're working with. We are engaged today at the Tier-2, Tier-1, and OEM levels in automotive, and the technology is scalable across other markets as well. >> Pretty exciting, we look forward to watching, and you're giving it the challenges of real weather unlike the Mountain View guys who we don't really deal with real weather here. (laughing) >> There ya go. (laughing) Fair enough. >> All right Dave, well, thanks for taking a few minutes out of your day, and we, again, look forward to watching the story unfold. >> Excellent, thank you, Jeff. >> All right. >> All right, appreciate it. >> He's Dave, I'm Jeff, you're watching the Cube. We're at Western Digital in Milpitas at the Autotech Council Autonomous Vehicle event. Thanks for watching, we'll catch ya next time.

Published Date : Apr 14 2018


Muddu Sudhakar, Stealth Mode Startup Company | CUBEConversation, April 2018


 

(upbeat music) >> Hi, I'm Peter Burris. Welcome to another CUBE Conversation from beautiful Palo Alto. Here today, we are with Muddu Sudhakar, who's a CEO investor, and a long-time friend of theCUBE. Muddu, welcome to theCUBE. >> Thank you, Peter. Thanks for having me. >> So, one of the things we're going to talk about, there's a lot of things we could talk about, I mean, you've been around, you've invested in a number of companies. You've got a great pedigree, a great track record. ServiceNow, and some other companies, I'll let you talk a bit more about that. But, one of the things we want to talk about is some of the big changes that are happening in the way that IT gets delivered within enterprises. The whole notion of IT operations management is on the forefront of everyone's mind. We've been talking about dev ops for a long time. It hasn't been universally adopted, it clearly needs some help; it's working really well in some places, not so well in other places. We're trying to bring that cloud-operating model into the enterprise. What are some of the things, based on your experience, talk a little bit about yourself, and then use that as a lever into, what are some of the things that the IT organization, business overall, has to think about as they think about modernizing IT operations management, or ITOM? >> Great topic, it's very lengthy. We can go on for hours on this, right? As we are talking earlier, Peter, so I think IT operations management has been around for what, 20, 30, years? It started with, I guess, at the time of mainframes, to client server. But, as you rightfully said, we are in the age of cloud. How are cloud, AI machine learning, and SaaS services going to impact ITOM, or IT operations management? I think that's, it's going to evolve, the question is how it's going to evolve. And, the one area that you are always passionate about talking about is cloud infrastructure itself, and the word that you use is called plastic infrastructure. The underlying infrastructure is changing so much. We are moving from virtual machines to server-less architectures, to containers. So this whole server-less architecture presents such a new concept, that ITOM itself should evolve to something new. I actually, I mean, the industry word for this is called AI operations. AI is just one piece. But how do you take hybrid cloud, how do you take the actual cloud substrate, and evolve IT operation management is such a big topic, on multiple areas, and how it is going to change industry. >> So, let's break it down a little bit. So, you mentioned the term plastic infrastructure. We've written a bunch about that here at Wikibon, and the basic notion of plastic infrastructure is that we can look at three generations of infrastructure, what we call static infrastructure, which might be like a brick, you add load to it, it might fall apart, but it was bound into the application. And in the world, or the era of elastic infrastructure is really where the cloud started, and the idea that you no longer had to purchase to your peak. That the elastic infrastructure would allow you to peak up, and peak down, but it would snap back into place, it was almost like a rubber brick. But this notion of plastic infrastructure, how do we add new workloads faster, but how do we do so in a way that we don't have to manually go in and adjust the infrastructure. That the infrastructure just responds to the new workloads in a plastic way. And snaps into a new form.
Now, we are going to need to be able to do that. If we're going to add AI, and we're going to add, you know, ML, machine learning, and all these other new application-oriented technologies to this. Can't imagine how we're going to add all that complexity to the application level, if we don't dramatically automate and simplify the operating load. And that's the basis of plastic infrastructure. What do you think? >> No, I completely, I think you kind of touched all the good points, but the areas that I can add on top of what you mentioned, is if you look at the plastic infrastructure, the one area is, so far IT operations management is built around a human being, around a dev ops, and around an IT admin. In the new world, 80 to 90% of it will be done in an automated manner. Your trading is algorithmic, you're in a self-driving car age, but IT operations management is still built around an IT admin and a dev ops. That's got to change. I think cloud guys, the Amazon, Azure, Google, they're going to disrupt this because they have to do this in an automated manner, right? So that means, the plastic infrastructure will be able to run workloads, it should be malleable. It should be able to change shape and form. And that's where the server-less really comes in. I don't want to pick a computer, and rent it for so many hours, that's still a silly concept; I think this whole virtualization, and virtual machines, has gone to the point of server-less. So, all these things. How do you manage the workloads? How do you manage your apps? To your point, apps have to be mapped downstream. I call them service maps. How do you build these dynamic service maps for your application? How do I know which component is failing at what point in time? That is what I call the root cause analysis. Do you expect a human being to identify that MongoDB, or a SQL server is down, because of this hardware issue? That has to be detected in an automated manner, right? At least, root cause and triage it to the point where a human being can come and say, I agree or don't agree, and be able to take action. Then, the final thing is, the infrastructure has to be able to take actions. Allow it to be at the point where, once you detect a problem, the infrastructure should be able to act algorithmically, programmatically, through an API. I should be able to effect the change. The problem is, changing infrastructure today is very much driven through scripts and through admins. Can I do that in a programmatic manner? It hasn't happened yet. >> Then it should be, I mean when you stop and think about it AI for example, using AI as a general umbrella for a lot of different technologies that are based on you know, pattern-recognition, and anomaly detection. And all the other stuff that is associated with AI. But we have pretty good data sources in the infrastructure. We know how these tools operate, they are programmable, so they get, you know, a range of particular behaviors. But there are discernible patterns associated with those behaviors, so you'd think that infrastructure itself would be a great source, to start building out some of these AI platforms, some of these new, what we call data-first, types of applications. What do you think? >> Absolutely, you nailed it. I think, if you remember my previous company Caspida, which was acquired by Splunk. We did that for security. We created this whole area called user behavior analytics. Right?
For security, understand the behavior of the users, understand the behavior of the attackers, the actual insiders. Same thing needs to happen. >> But all represented through a device. >> Through a device. >> That had known characteristics. So we weren't saying, we're making big claims necessarily about people. Which are, you know, unbelievably complex, but when you start with, what is a person doing with a device? That set of behaviors is now constrained, which makes it a great source. >> Absolutely. So I think, like, given the sources in the IT operations area, if you were to think about, for example, looking at the patterns and the behavior of the application, the storage, I call it like, think of like the four layers. You have apps, you have compute, your network, and storage. There are different patterns and behaviors you can work with. You can do anomalies, and you can understand the various workflows of the patterns. But I call it the three P's problem with AI machine learning. The P's are, you actually said it, five P's. The three P's that I usually talk about are the proactive, the predictive, and prescriptive nature. If I can take these data sources, whether they come from logs, events, or alerts, and am able to do this for those, I can do planning. I can implement what changes I can do as a workflow and full actions. 'Cause detecting is no good, if I can't take an action. That's where the prescriptiveness comes in. And I think that whole area of IT operations management, what needs to happen is, what is mundane for a human being will be automated. And then the question comes in: do you do this in batch mode, or real-time? >> You want to do it in real-time. But let me get those straight. So the three P's that you mentioned were, proactive, prescriptive? >> And predictive. >> And predictive. So, that's proactive, predictive, and prescriptive. And just, you know, to level it out, I noted that all this is based on patterns. >> Yes. >> That come out of some of these infrastructure technologies. So, as we think about where ITOM is going, you mentioned earlier AI, systems management, AI services management. When we think about kind of some of the next steps, who do you anticipate are going to be kind of at the leading, or leading the charge, as we move forward here? >> I think there'll be a new sheriff in town. Maybe not one; to your point earlier, there may be many sheriffs in town in this area. The great opportunity here is, when a fundamental change like this happens, there will be new players who win this market. Definitely the cloud guys have the right substrate. The Amazon's, the Azure's, and the Google's of the world. They have the right infrastructure, they are all moving towards the plastic infrastructure. They just have to do more on workload management. They need to do more on the AI operations.
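Muddu's point about learning the patterns and behavior across the four layers, and then acting on what you find, can be made concrete with a very small sketch. This is only an illustration of the proactive-to-prescriptive loop, not any vendor's product: baseline a metric, flag deviations, and hand each flag to a stand-in remediation hook; the telemetry and the runbook name are invented.

```python
import numpy as np

def detect_anomalies(values, window=20, z_threshold=3.0):
    """Flag samples that deviate from a rolling baseline by more than z_threshold sigmas."""
    values = np.asarray(values, dtype=float)
    flags = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu, sigma = baseline.mean(), baseline.std()
        if sigma == 0:
            continue
        z = (values[i] - mu) / sigma
        if abs(z) > z_threshold:
            flags.append((i, values[i], round(z, 1)))
    return flags

def prescribe(flag, metric="db_query_latency_ms"):
    idx, value, z = flag
    # Stand-in for a workflow or runbook call, e.g. open an incident or trigger a scale-out.
    print(f"[action] {metric} sample {idx} = {value} (z={z}); triggering runbook 'scale-db-tier'")

latency = list(np.random.normal(40, 3, 60)) + [41, 95, 40]  # synthetic telemetry with one spike
for flag in detect_anomalies(latency):
    prescribe(flag)
```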
One, do you think the big boys, the HPE's, the Oracle's, the Cisco's, the IBM's, are going to be able to change their stripes enough, so that they can do both? We're tryin' to keep our stall base and upgrade, enhance it, and try to introduce this new cloud operating model? And we'll talk about the start-ups in a second. What do you think? Are the big boys going to be able to make this transition? >> I think they have to, their hand is already dealt. I call it, the cloud is a runaway train, the cloud today is 30, 40 billion dollars. If you are those mega-vendors, you don't, if you're not making on this, something is wrong with you. Right? I mean, in this day and age, if you're not making money on the cloud, with this, with what we're talking about. So what they do is, how can they, either they, have to offer a cloud services, public or a PERT. If you are not doing that, might as well get into this game of AIOps, so that you are actually making money on the apps, and on the infrastructure. So, all those big, large vendors that you mentioned, about the Cisco's of the world, the Oracle's of the, they have a genuine interest to make this happen. >> Got it. So in many respects, to kind of summarize that point, it's like, look, the cloud experience is being defined elsewhere. It's being defined by Azure. AWS, Google, GCP, and these vendors are going to have to articulate very, very clearly for their customer base, the role that they're going to play. And that could include bringing the cloud experience on premise, when and if, data is required on premise. >> Absolutely, and I actually call this cloud should be the aircraft carriers, right? As a world when it settles, eventually it won't have hundred aircraft carriers. You'll have this three or four large cloud vendors. On top of them, the people who manage the apps and services will be few. You don't need 20 vendors managing your infrastructure. So there'll be a huge consolidation game. The questions is, when that happens, the winners doesn't have to be the like c-Vendors. >> Right. >> The history always show the legacy always loses out. So that's where the start-ups have an opportunity. >> Alright. So let's talk about the start-ups. Are there any particular class of start-ups out there. Is this going to, or are some of the security guys who manage services going to be able to do a better job, because they can make claims about your data? Or some of the guys, some of the companies coming from middleware? Where do you think the start-up kind of epicenter is going to be as we see new companies introduced in this space? >> That's a good question, I don't have any one particular vendor in mind. But I think that definitely the vendors that will come into play will be people who can do log management better. We already know the IS Splunk's of the world. People who can do events and alerts management. People who can do incident problem change management, right? All those things, if you look at the whole area. And people who can do the whole application management, as earlier you were talking about the workload management. So I think each of these functions, there'll be winners coming in. Eventually all of them will be offered by one single person, as a full-stack solution for the cloud, on the cloud. The key problem that I keep noticing is, most vendors are keep still tied to the old infrastructure, which is mainframe, or physical servers. Nobody is building this thing for the cloud, in the cloud again. 
So somebody who has the right substrate to build this, as a playbook, will end up winning this game. >> Yeah, it's going to be an interesting period of time. Now, when we stop and think about, I made an assertion earlier, that for us to build more complex applications, which is where everybody is heading, it's essential, in our opinion, that we find ways to simplify and bring more automation to the infrastructure. If we think about servers, storage, network, those types of things, is there a particular part of the infrastructure that you think is going to receive treatment earlier, and therefore is going to kind of lead the way for the rest of this stuff? Is storage going to show CPU and network the way? Or is network going to step up, because of some of the changes that are happening? What do you think? >> That's a very good question. I think, look, the key pain points for most people today, if you look at where the complex questions are: if there's a problem in the network infrastructure, it's very hard to triage that, so that area has to be automated. I mean, you can't expect a human being to understand why my switch or network is not performing. >> It's just happening too fast. >> Why, like, why WiFi is not working on the sixth floor and seventh floor. It's a very, so network will be one area, it's highly visible. The second will be in the database and storage area. Just because my storage disk is full, I don't want my database to be down. It's such an old behavior pattern, people will catch those things in an automated manner. Right? So storage and network first. Where you see the higher level items is when an application is not performing well. Is it a performance problem? Or, why is this component tied to that component, right? This application is built on a load balancer, and the load balancer is talking to the database. Building that map of who's-connected-to-who, that's a new graph, the graph algorithms you need. That doesn't exist today. So I think what'll happen is how do you manage an application, given a problem, and mapping that. That is, I think, the number one thing that will start happening first. Everything else will happen over a period of time. But the apps that are visible, where a user and a customer can see the impact, will happen first. >> Yeah, actually we have a prediction here at Wikibon, what we call networks of data. Where the idea is that the next round of network formation is going to be data assets explicitly connecting with each other. And then using that as a way of zoning data assets. And saying, this application requires data from these places, and then all the technology that allows you to either move it in. >> Right. >> Or keep pointers, or whatever else it might be. So this notion, you would agree then. You know, a graph of data is going to drive a lot of the change forward. >> And to actually take you to that, I actually talk about how it doesn't require a single class of algorithm. I call it an ensemble of machine learning algorithms that you need. You need some statistical, some probabilistic, some Markovian algorithms, some Bayesian, and mainly graph algorithms. This data has to capture the behaviors and patterns that you want to put in a larger graph, that you should be able to mine on. That doesn't exist today. So most often, when people talk about their dynamic thresholding, the statistical piece, that actually already exists in IT operations management. 
The next level is how do you build that graph; the "too big to fail" approach, in my opinion, fails. What is it relying on? Like if I come to Peter's house: what does your house look like, the area, one bedroom, you have two kitchens. You know what I'm saying. >> It looks like a network of data right now. >> Exactly, right. (laughing) >> Okay, so, I got one more question for you, Muddu. And that is, you work with us a lot, and some of the crowd chats you do. You're a great research partner for us. As you think about kind of the story that needs to be told to the CxO about some of these changes, how's it different from the story that needs to be told to the IT team leader? I can imagine what some of the differences are, but you're talking to both sides. What would you, what would your advice and counsel be to companies that are trying to talk to the CEO about this, or the board, what do you think? What would you say to 'em? >> I think you kind of got it yesterday in the crowd chat. I think the key thing that the CIO or CxO or CEO needs to have is this: we used to call it the Chief Data Officer, where the data is the key, and that role was applied to the overall business. That same role needs to happen within the CIO's organization now. How do I use my data to make my IT better? So that, maybe call it a CDO for the CIO, is a big role that needs to happen, but the goal of that person and that entity should be, can I run my operation in a lights-out manner? I call it IT as a service. People talk about IT and services. But IT as a service, to me, is a bigger concept. >> Let me make sure I got this, 'cause this is a crucially important point. So in many respects, we should be saying to the CEO, your data is an asset, you have to take steps to appreciate, dramatically and rapidly, appreciate the value of data as an asset, and that requires looking at the CIO with the CDO, the data officer, and saying, your job, independent of any technology or any particular set of ITOM processes, your job is to dramatically accelerate how fast we're able to generate >> Driving decisions. >> Value out of our data, being able to utilize these technology investments. >> Absolutely. Because once that person has the data as an asset, what will happen is, you'll still use the existing process, but it gets you the new insight: what can I automate? What can I do more with less people, right? That has to happen. Like if I'm a CEO, I should wake up and say, 90% of my things should be able to be automated today, right? >> Okay, so let's talk about the last question. You've led a lot of organizations through a lot of change. We're talking about a lot of change within the IT organization, when we talk about these things. What's one bit of advice that you have for that CIO or leader of IT, to help them take their people through the types of changes that we're talking about? >> Make bets. Don't be afraid of making bets; unless you make a bet you're never going to win. So every year, every quarter, make a new bet. Some bets you are going to fail, some you're going to succeed. Unless you make a bet, you will not innovate. >> Peter: And understand the portfolio, and sustain those bets. And then, when you've lost, don't keep putting money out. >> Exactly, yeah, keep moving on. >> Great. Alright, so, Muddu, thank you very much for being here. >> Peter, always a pleasure. >> Alright, Muddu Sudhakar, investor, CEO, once again, this has been a CUBEConversation, thank you very much for being here. >> Thank you, Peter. 
>> And we'll talk to you soon. >> Muddu: Thank you always, and John too. (upbeat music)
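The ensemble Sudhakar describes, pairing simple statistical detection with a dependency graph that can be mined for likely root causes, can be pictured with a minimal sketch. The sketch below is purely illustrative: the component names, latency values, and thresholds are invented, it uses only the Python standard library, and it is not drawn from any vendor's implementation.

```python
import statistics

# Hypothetical dependency map: each component -> what it relies on
# (the "who's-connected-to-who" graph described above).
DEPENDS_ON = {
    "checkout-app": ["load-balancer", "payments-db"],
    "load-balancer": ["web-tier"],
    "payments-db": ["storage-array"],
}

def zscore_anomalies(samples, threshold=3.0):
    """Flag samples sitting more than `threshold` standard deviations
    from the mean (the dynamic-thresholding, statistical piece)."""
    mean = statistics.mean(samples)
    stdev = statistics.pstdev(samples) or 1e-9
    return [i for i, x in enumerate(samples) if abs(x - mean) / stdev > threshold]

def root_cause_candidates(component, graph):
    """Walk the dependency graph from an unhealthy component to list
    everything it transitively depends on (the graph piece)."""
    seen, stack = [], [component]
    while stack:
        for dep in graph.get(stack.pop(), []):
            if dep not in seen:
                seen.append(dep)
                stack.append(dep)
    return seen

if __name__ == "__main__":
    # Invented latency series (ms); the spike at the end is the anomaly.
    latencies = [101, 99, 102, 98, 100, 103, 97, 101, 99, 100, 102, 98, 250]
    if zscore_anomalies(latencies):
        print("checkout-app anomaly; inspect:",
              root_cause_candidates("checkout-app", DEPENDS_ON))
```

In practice the dependency graph would be discovered from topology and flow data rather than hand-written, which is the part Sudhakar says does not exist off the shelf today.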

Published Date : Apr 12 2018



Flynn Maloy, HPE & John Treadway, Cloud Technology Partners | HPE Discover 2017 Madrid


 

>> Narrator: Live from Madrid, Spain it's theCube, covering HPE Discover Madrid 2017. Brought to you by Hewlitt Packard Enterprise. >> Welcome back to Madrid everybody. This is theCube, the leader in live tech coverage. My name is Dave Vellante and I'm here with my co-host for the week, Peter Burris, otherwise known as Mr. Universe. This is HPE Discover Madrid 2017. Flynn Maloy is here as the Vice President of Marketing the HP Point Next. >> Hi guys. >> And John Treadway is here as the Senior Vice President of Strategy and Portfolio at Cloud Technology Partners, an HPE company. Gentlemen, great to see you again. Welcome to theCube. >> Great to see you. >> It's been a good week. We were just talking about the clarity that's coming to light with HPE, the portfolio, some of the cool acquisitions. You and I, Flynn, were at this event last year in London. You had the Cheshire Cat smile on your face. You said something big is coming. I can't really tell you about it partly because I can't tell you about it. The other part is we're still shaping it. Then Point Next came out of it. How are you feeling? Give us the update. >> It's been a really exciting year for services. This time last year we knew as Antonio announced, we're going to be bringing our services together after we announced that we're spinning out our outsourcing business. We're bringing technology services at the time forward. We had a new brand coming. We purchased Cloud Cruiser in February so we're investing in the business. We also invested in services back in the engine room all year long to really build up to our announcement this week with Green Lake which takes our consumption services to the next level. Then of course in September we continue to invest and acquire Cloud Technology Partners and by the way brought on our new leadership team with Ana Pinczuk and Parvesh Sethi. For us here at HP it's really been a banner year for services. It's really been transformative for the company and we're excited to lead it going into FY '18. >> John, Cloud Technology Partners specializing, deep technology expertise. You've got an affinity for AWS, you've got a bunch of guys that reinvent this week in close partnership with them. Interesting acquisition from your perspective coming into HPE. What's it been like? What has HP brought you and what have you brought HP? >> That's a fantastic question. We have really found that everything about this experience has exceeded our expectations across the board. When you go into these things you're kind of hoping for the best outcome, which is we're here because we want to be able to grow our business and scale it and HP gives us that scale. We also think that we have a lot of value to add to the credibility around public cloud and the capabilities we bring. You hope that those things turn out to be true. The level of engagement that we're getting across the business with the sellers, with the customers, with the partners is way beyond expectations. I like to say that we're about six months ahead of where we thought we'd be in terms of integration, in terms of capability and expertise. Really bringing that public cloud expertise, not just to AWS, we do a lot of Azure work, we do a lot of Google work as well, really does allow the HPE teams to be able to go into their clients and have a new conversation that they couldn't have a year ago. >> What is that new conversation? >> The new conversation is really about, and we like to use the term "the right mix." I.T. is not just one mode. 
You're gonna have internal I.T., you're gonna have private clouds. Public cloud is a reality. AWS is the fastest growing company in tech history ever. If you think about that it's a reality for our clients, HPE clients, that public cloud is there. That new capability that we could bring, that credibility is that we have done this for the last seven years with large enterprises across all sorts of industries and domains: Toko, healthcare, financial services in particular. We bring that to the table, combine that with the scale and operational capability of HPE and now we have something that's actually pretty special. >> Just to add, it is about the customers at the end of the day. It's about where do those workloads want to land? Public cloud, private cloud, traditional, those are all tools in your toolbox. What customers want to know is what is the right mix? There are workloads that are ideal for going to the public cloud. There are workloads that are ideal for staying on prem. Finding that right mix, especially by bringing in the capabilities of what needs to go to public cloud that really rounds out our portfolio for hybrid I.T. >> I'm starting to buy the story. The upstarts, the fastest growing company in the world would say old guard trying to hang onto the past. I like the way you framed it as look, we know our customers want to go to the cloud. They want certain workloads to be on prem. We want them to succeed. We're open, we're giving them choice. Maybe two years ago it sounded like bromide. But you're actually putting it into action acquiring a company like CTP. It's interesting what you were saying, John, about well no not just AWS, it's Google, it's Azure. You've got independent perspective on what should go where or on prem. >> We always have so even as a company that derived most of our revenue from public cloud over the last few years, we've never, ever been the company that said everything should go to public cloud. Toss it all, go to Amazon, toss it all, go to Azure. Never been our perspective. We've had methodology for looking through the application portfolio and helping determine where things should go. Very often a large percentage of the portfolio we say it's good where it is, don't move it. Don't move it right away. >> But in the past that's where it ended. You said okay, hey, go figure out, go talk to HPE. >> That's actually a funny thing because we've had this conversation. Literally when we would say okay we'll take care of this part for the public cloud, but you're on your own for the private cloud stuff, in the past HP would do the reverse. We'll help you with the private cloud stuff, and we think this could go to public cloud. But you're kind of on your own with that. Not that there wasn't any capability, but it wasn't really well developed. Now we can say this should go to private cloud, this should go to public cloud and guess what? We can do both. >> Dave: So now you've got a lean-in strategy. >> Absolutely right, as John said the funnels and the response from our customers have been outstanding. As you can imagine, Mike, all of our top customers are saying fantastic, come talk to us, come talk to us. They're having to prioritize where they go over the last few months. We are well ahead of where we were. >> We strongly believe over the years that the goal is not to bring your business to the cloud. It's to bring the cloud to your business. 
That ultimately means that public cloud will be a subset of the total, although Amazon's done a wonderful job of putting forward the new mental model for the future of computing. Can you guys, reliably, through things like Green Lake and others, present yourselves as a cloud company that just doesn't have a public cloud component? >> Let me approach the response to that question in a slightly different way. When you look at our strategy around making hybrid I.T. simple, it's not necessarily about which cloud is the right cloud. It's not really about that. It's about where should the workloads land? We do believe that the pragmatic answer is you need to be a little bit above all of those choices. They're all in the toolbox. If you look at, for example, our announcement with One Sphere this week, that's a perfect example of what customers are asking the industry to do, which is to look across all of it. The reality is it's hybrid, it's multi-cloud, and we're speaking to that at length. >> But you're saying it's a super set of tools that each are chosen based on the characteristics of workloads, data, whatever it might be. >> That's right. >> So John look, as human beings we all get good at stuff. We say I know that person, I can stereotype him. I can stereotype that. What's the heuristic that your team is using to very quickly look at a workload? Give our audience, our clients a clue here so that they can walk away a little bit and say well, that workload naturally probably is going to go here. And that workload's naturally going to go there. What's the 30-second version where you're able to generally get it right 80% of the time? >> It really comes down to a set of factors, right? One factor is just technical fit. Will it work at all? We can knock out a lot of workloads because they're on old Unix or just kind of generally the technical fit isn't there, right? Second thing is the business case. Does it make sense? Is there gonna be any operational saving against the cost of doing the migration? Because migrating something isn't free, right? It's never free. Third is what is the security and governance constraint within which I'm living? If I have a data residency requirement in a country and there's no hyper-scale public cloud presence in that country, then that workload needs to stay in that country, right? With those types of high-level factors we can very quickly go from here's your entire list down to these are already candidates for further evaluation. Then we start to get into sort of deeper analysis. But the top level screen can happen very, very quickly. >> You do that across the, you take an application view, obviously. A workload view. Then how do you avoid sort of boiling the ocean? Or do you boil the ocean? You have tools to help do that. >> We do, I mean we've invested a lot in IP, both service IP and software IP, and Point Next also comes with some strong IP in this as well that we've been able to merge in with. Our application assessment methodology is backed by a tool called Aura. Aura is a tool for taking that data, collecting it, and helping provide visualization in reporting and decisioning at a high level on these items. Then every application that looks like a great candidate for something that I'm gonna invest in migrating, we need to do a deeper analysis. Because it isn't lifting and shifting. It doesn't work for 90% of the applications, or 80%, or 70. It's certainly not anywhere near 50% of the applications. 
They require a little bit of work, sometimes a lot of work, to be able to have operational scale in a public cloud environment because they're expecting a certain performance and operational characteristic of their internal infrastructure and it's not there. It's a different model in the public cloud. >> A lot of organizations like yours would have a challenge presenting that to a customer because they can't get the attention of the senior leaders. How is it that you guys are able to do that? You were talking I think, off-camera, talking about 20-plus years of experience on average for each of your professionals. Is that one of the secrets to how you've succeeded? >> This is a big thing and why this integration's working so well is that the people, the early team all the way through today of CTP are all seasoned I.T. professionals. We're not kids straight out of school that have only known how to do I.T. in an Amazon way. We have CIOs of banks that are in our executive team, or in our architecture team that have that empathy and understanding of what it means to be in the shoes. Not having this arrogant approach of everything must be a certain way because that's what we believe. That doesn't work. The clients are all different. Every application is a snowflake and needs to be treated as such, needs to be treated like an individual, like a human. You want to be treated like an individual, not like -- >> Stalker! (laughing) >> Gezunheit. (laughing) >> Okay, so now the challenge is how you scale that. How you replicate that globally and scale it and get the word out. Talk about that challenge. >> That's right and one of the big things we're really excited to see is the merger of the IP that comes from CTP along with everything that we have inside of Point Next and then rolling that out to the 5,000 plus consultants that we've got inside of HP and our partners. That's really where we're expecting a lot of the magic to come from is once we really expose the integrated set of what those capabilities are we think, and Ana has said it on stage. We had heard from a couple of analysts that we believe that together we have the largest cloud advisory in the industry today. >> It was interesting we actually had, we've had challenges in the past where we've gone into clients and were starting to get into some pretty serious level of work. We were a younger company, didn't have the scale, and scope, and capability of HPE. Now we're being brought in to these opportunities and the clients are saying HP, you're right here. We can do that. We have the scale to now start doing the larger transformation programs and projects with these clients that we didn't have before. Now we're being invited back in, right? In addition to that being invited in because now we have the cloud competency that we can bring to the table. >> You know what, I kind of want to go back to the point you made earlier about how it's all cloud. That resonates with me. I think it is all cloud depending on where you want to land the various pieces. If that's what you want to call that umbrella I think it makes a ton of sense. You know, a lot of what we've announced this week with Green Lake is about trying to bridge the benefits gap with public cloud as the benchmark for the experience today for what needs to stay on prem. When you sit down and for all those reasons you outlined, whether it's ready, whether it isn't ready, where the data has to sit, or whether or not. There's gonna be x-workloads that need to stay on prem. 
We've been working hard in the engine room to really build out an experience that can feel to the customer a lot like what you get from the public cloud. That's gonna continue to be an investment area for us. >> If the goal is success for the business then you don't measure success by whether you got to Amazon. >> That's correct. >> The goal of success is the business. You measure success by whether or not the business successfully adopts the technology where the data requires. What's interesting about the change we're experiencing is in many respects for the first time the way of thinking about problems in this industry is going through a radical transformation. Let's credit AWS for catalyzing a lot of that change. >> Absolutely, setting that benchmark. I mean it really is a catalyst. >> But you look at this show, HP has adopted the thought process, it's adopted it. It's no longer in our position to say fine, you want to think this way, we'll help. >> Imagine this, as One Sphere comes up and as we really can manage multi-clouds and as we'll eventually be able to move workloads between the various clouds, manage the whole estate, view the whole estate and everything under it whether it's off-prem or on-prem is all consumption. I mean, how does that change central I.T.? Central I.T. radically changes. If everything's consumed, wherever it is and you've got a visibility to the whole estate and you can move stuff depending on what the right mix is, that's a fundamental change and we're not there yet as an industry. But that's a fundamental change to the role of Central I.T. >> But your CIOs are thinking along those lines. We can verify they are thinking along those lines. >> Again the strategy's coming into focus for me personally. I think us generally. We talked to Ana about services-led, outcome-led. And if it's big chewy outcome like kind of IBM talks well you've got partners to help you do that. Deloitte, we had PWC on. They're big, world-class organizations with deep expertise in retail and manufacturing and oil and gas. You're happy to work with those guys. If it is service-led or outcome-led you can make money whether you're going to Amazon, whether you're staying on prem, whether you're doing some kind of hybrid in between and you're happy to do that as an agnostic, independent player. Now yeah, of course you'd like to sell HP hardware and software, why not? >> I think that's really an important point. When it comes to the infrastructure itself we do believe we have the best infrastructure in the industry, but we play well with others and we always said HPE plays well with others. When it comes to the app layer we are app agnostic. A lot of our biggest competitors are not. When you go out and talk to CIOs today that's really, this is my app, this is my baby. This is the one that I want. They're not really looking for alternatives for that in many cases. When you're thinking agnostic that's really where we think partner, being agnostic, working with all the ad vendors, working with all the SIs, we think that's where the future-- >> And it's a key thing. You guys are younger, but you remember Unix is snake oil. I mean-- >> Designing is a Russian Trump. >> Unix is snake oil and then two years later it's like here our Unix. >> Flynn: It's the best thing ever. >> So you now are in a position to say great, wherever you wanna go we'll take you there. That's powerful because it can be genuine and it can be lucrative. 
>> What's unlocking here is the ability to actually execute a digital transformation program within the enterprise. One of the big things the public cloud providers brought to us and that HPE's now bringing in through the internal infrastructure is that agility and speed of innovation of the users. Their ability to actually get things done very quickly and reduce the cycle time of innovation. That frankly has always been the core benefit of the public cloud model, that pay-as-you-go, start with what you need, use the platform services as they grow. That model has been there since the beginning and it's over 11 years of AWS at this point. Now with enterprise technology adopting similar models of pay for it when you consume it, we'll provision it in advance, we'll get things going for you, we're giving that model. It's about unlocking the ability for the enterprise to do innovation at scale. >> I wanna end if I can on met Jonathan Buma last night, J.P., J.B., sorry. You're J.T. >> It's confusing. >> But one of the things I learned, a small organization, 200-250 people roughly when you got acquired, but you've got this thing called Doppler, right? Is that what it's called, Doppler? Explain that, explain the thought leadership angles that you guys have. >> Actually from the very beginning. >> The marketing team loves this, it's fantastic. >> So follow up with how. >> From day one there's a few things that we said were core principles, the way that we were going to grow and run the business. I'll talk about one other thing first which was that we were gonna be technology-enabled, technology-enabled services company. That we were gonna invest in IP both at the service level but as the technology level to accelerate the delivery of what we do. The second thing as a core principle is that we were going to lead through thought leadership. So we have been the most prolific producers of independent cloud content as a services firm bar none. Yeah, there's newspapers, magazines, analyst firms like yourself producing a lot of content. The stuff that we're producing is based on direct experience of implementing these solutions in the cloud with our clients so we can bring best practices. We're not talking about our services. We're talking about what is the best practice for any enterprise that wants to get to the cloud. How do you do security? How do you do organizational change? That has a very large following of Doppler both online where we have an email newsletter. But we also do printed publication of our quarterly Dopplers that goes out to a lot of our clients, the CIOs and key partners. That kind of thought leadership has really set us apart from all of the rest of the, even the born in cloud consultancies who never put that investment in. >> Flynn, you're a content guy. >> Absolutely. >> So you've got to really appreciate this. >> That's a dream, it's an absolute dream. One of the things, another proof point as a way to end, services first strategy is what we're doing in the market community at HP more money, energy, content, time is going into how we're talking, thought leadership and services than anything else in the company. We've got not just branding for Point Next and Green Lake, but bringing Doppler forward, bringing those great case studies forward. Putting that kind of content at the tip of the HPE sphere. It's not something you've seen from our company in the past. I think keep your eyes out over the next year. 
We'll have this conversation in six months and you'll see a lot more from us on that topic. >> Great stuff, congratulations on the process, the exit, the future. Good luck, exciting. >> Thanks guys. >> Really appreciate it. Keep it right there everybody, we'll be back right after this short break. Dave Vallente for Peter Burris from HPE Discover Madrid. This is theCube. (upbeat instrumental music)
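The first-pass screen Treadway outlines, technical fit, then business case, then security and governance, reduces to a handful of rules. The sketch below is a hypothetical illustration of that screen, not CTP's Aura tooling or methodology; the field names, regions, three-year payback assumption, and cost figures are invented.

```python
from dataclasses import dataclass

# Hypothetical set of public-cloud regions this (invented) customer can use.
AVAILABLE_REGIONS = {"us-east", "eu-west"}

@dataclass
class Workload:
    name: str
    platform: str            # e.g. "linux", "windows", "legacy-unix"
    run_cost: float          # current annual run cost
    cloud_cost: float        # projected annual cost in the public cloud
    migration_cost: float    # one-time cost to migrate and refactor
    residency_region: str    # where the data must legally live

def screen(w: Workload) -> str:
    """First-pass screen: technical fit, then business case, then governance."""
    # 1. Technical fit: old proprietary Unix is knocked out immediately.
    if w.platform == "legacy-unix":
        return "stay on-prem (no technical fit)"
    # 2. Business case: three years of savings must cover the migration effort.
    if 3 * (w.run_cost - w.cloud_cost) <= w.migration_cost:
        return "stay on-prem (no business case)"
    # 3. Governance: the residency requirement must map to an available region.
    if w.residency_region not in AVAILABLE_REGIONS:
        return "stay on-prem (data residency)"
    return "candidate for deeper migration analysis"

if __name__ == "__main__":
    demo = Workload("billing", "linux", 120_000, 70_000, 90_000, "eu-west")
    print(demo.name, "->", screen(demo))
```

Anything that survives the screen still goes into the deeper per-application analysis he describes, since, as he notes, most applications cannot simply be lifted and shifted.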

Published Date : Nov 29 2017



Bob Picciano & Stefanie Chiras, IBM Cognitive Systems | Nutanix NEXT Nice 2017


 

>> Announcer: Live from Nice, France, it's The Cube covering Dot Next Conference 2017, Europe. Brought to you by Nutanix. (techno music) >> Welcome back, I'm Stu Miniman, happy to welcome back to our program, from the IBM Cognitive Systems Group, we have Bob Picciano and Stefanie Chiras. Bob, fresh off the keynote speech. Went a little bit long but glad we could get you in. Um, I think when the, when the IBM Power announcement with Nutanix got out there, a lot of people were trying to put the pieces together and understand. You know, we with The Cube we've, we've been tracking, you know, Power for quite a while, Open Power, all the things but, but I have to admit that even myself, it was like, okay, I understand cognitive systems. We've got all these AI things and everything, but on the stage this morning, you kind of talked a little bit about the chipset and the bandwidth. You know, things like GPUs and utilization, you know, explain to us, you know, what is resonating with customers and, you know, where, you know, what's different about this because a lot of the other ones it's like, oh well, you know, software runs a lot of places and it doesn't matter that much. What's important about cognitive systems for Nutanix? >> Yeah, so, first off, thanks Stu. And, as always, thanks for, you know, you for following us and understanding what we're doing. You mentioned not just Power but you mentioned Open Power, and I think that's important. It shows, actually, the deeper understanding. You know, we've come a long way in a very short amount of time with what we've done with Open Power. Open Power was very much at its core about really making Power a natural choice for industry standard Linux, right? The Linuxes that used to run on Power a couple of generations ago were more proprietary Linuxes. They were Big Endian Linux, but Open Power was about making all that industry standard software run on top of Power, where we knew our value proposition would shine based on how much optimization we put into our cores and how much optimization we put into IO bandwidth and memory bandwidth. And boy, you know, have we been right. In fact, when we take an industry standard workload like a NoSQL database, or EnterpriseDB, or MongoDB, or Hadoop, and put it on top of Linux, an industry standard Linux, on top of Power, we typically see that run about 2X to 3X better price performance on Linux on Power than it would on Linux on Intel. This is a repeating pattern. And so, what we're trying to do here is really enable that same efficiency and economics in the Nutanix hyper converged space. And remember, all these things about insight based applications, artificial intelligence, are all about data intensive workloads. Data intensive workloads, and that's what we do best. So we're bringing the best of what we do and the optionality now for these AI workloads and cognitive systems right into the heart of what Nutanix is pivoting to as well. Which is really at the, at the core of the enterprise for data intensive workloads. Not just, you know, edge related VDI based workloads. Stefanie, will you, you want to comment on that a little bit as well? >> Yeah, we are so focused on being prioritized in what space we go after in the Linux market around these data centric and AI workloads. And at the end of the day, you know, as Nutanix states, it's about invisible infrastructure, but the infrastructure underneath matters. 
And now with the simplicity of what Nutanix brings you can choose the best infrastructure for the workloads that you decide to run, all with single pane of glass management. So it allows us to bring our capabilities at the infrastructure levels for those workloads, into a very simplest, simple deployment model under a Nutanix private cloud. >> Yeah, I, I think back when, you know, we had things like, when Hadoop came out, you know, we got all these new modern databases, >> Right. >> You know, I wanted to change the infrastructure but simplicity sure wasn't there. >> Yep. >> Uh-huh. >> It was a couple of servers sitting under the desk, okay, but when you needed to scale, when you needed to manage the environment, um, it was challenging. We, we saw, when, you know, Wikibon for years was doing, you know, research on big data and it was like, ah, you know, half the deployments are failing because, you know, it wasn't what they expected. >> Right. >> The performance wasn't there, the cost was challenging. So it feels like we're kind of, you know, turn the corner on, you know, making, putting the pieces together to make these solutions workable. >> I think we are. I think Dheeraj and his team, Sunil, they've done a wonderful job on making the one click simplicity, ease of deployment, ease of manageability. We saw today, creation of availability zones. High availability infrastructure. Very very simplistic. So, you know, as, you know, I've had other segments with Dave and John in the past, we've always talked about, it's not about big data, it's about really creating the ability to get fast actionable insights. So it's a confluence of that date environment, the processed based workflow environment, and then making that all simple. And this feels like a very natural way to accomplish that. >> I want to understand, if I caught right, it's not Power or x86 but it's really putting the right workloads in the, in the right place. >> That's right. >> Did I get that right? >> That's right. >> What, what are the customer deployments, you know? >> Heterogeneity is key. >> How do I then manage those environments because, you know, I, I want kind of homogeneity of, of management, even if I have heterogeneity, you know, in, in my environment, you know. What, what are you hearing from your customers? >> I think how we've looked at Linux evolved. The set of workloads that are being run on Linux have evolved so dramatically from where they started to running companies and being much more aggressive on compute intensive. So it's about when you bring total cost of ownership which requires the ability to simply manage your operations in a data center. Now the best of Prism capabilities along with the Acropolis stack allows simplicity of single pane of glass management for you to run your Power node, set of nodes, side by side with your x86 set of nodes. So what you want to run on x86 or Windows can now be run seamlessly and compatible with your data centric workloads and data driven workloads, or AI workloads on your Power nodes. It really is about bringing total cost of ownership down. And that really requires accessibility and it requires simplicity of management. And that's what this partnership really brings. It's a new age for hyper converged. >> Yeah. >> What should we be looking for, for the partnership, kind of over the next 12 years, 12, 12 months. (laughs) >> 12 years? 
(laughs) (laughter) >> 12 years might be a little tough to predict, but over the next year, what, what should we be looking for the partnership? You know, I think back you talked about, Open Powered Google is, you know, a big partner there. Is there a connection? Am I drawing lines between, you know, Nutanix and Google and what you're doing? >> I won't comment on that yet but, you know, but, as you know we have a big rollout coming up as we're getting ready to launch Power Nine. So there'll be more news on some of those fronts as we go through the coming weeks. And I hope to see you down in Dallas at our Cloud or Cognitive event. Or at one of the other events we'll be jointly at where we do some of these announcements. But if you think about where this naturally takes us, Sunil talked about mode one and mode two applications. So what we want to see is increasing that catalog for mode one applications. So things that I'd like to see is an expanded set of relationships around what we both do in the SAP space. I'd like to see that catalog of support enriched for what's out there on top of the Linux on Power space, where we know our value proposition will continue to be demonstrated both in total cost of acquisition as well as total cost of ownership. >> Yeah. >> I mean, we're really, you know, seeing some great results on our Linux base. As you know, it's now about 20 percent of the power revenue base is from Linux. >> Uh-huh. >> And that's grown from a very small amount just a few years ago. So, I look to see that and then I would look at more heterogeneity in terms of the support of what we do, both in Linux and maybe, in the future, also what we do to support the AIX workloads, uh, with Nutanix as well. Because I do think our clients are asking about that optionality. They have big investments, mission critical workloads around AIX and the want to start to bring those worlds together. >> Alright and Stefanie, want to give you the final word, you know, anything kind of learnings that you've had, of the relationships as you've been getting out and getting into those customer environments. >> I have to say the excitement coming in from the sales team, from our clients, and from the business partners have been incredible. It really is about the coming together of, not only two spaces of simple, and absolutely the best infrastructure and being able to optimize from bottom to top, but it's about taking hyper converge to a new set of workloads. A new space. Um, so the excitement is just incredible. I am thrilled to be here at Dot Next and be able to talk to our clients and partners about it. >> Alright well Stefanie and Bob thank you so much for joining us. >> Thanks Stu. >> Thank you Stu. >> Sorry we had to do a short segment but we'll be catching ya up at many more. Alright so we'll be back with lots more coverage here from Nutanix Dot Next in Nice, France. I'm Stu Miniman, you're watching The Cube. (techno music)

Published Date : Nov 8 2017



Gabe Chapman, NetApp & Sidney Sonnier, 4TH and Bailey | NetApp Insight 2017


 

>> Narrator: Live, from Las Vegas its theCUBE. Covering NetApp Insight 2017. Brought to you by NetApp. >> Hello everyone, welcome back to our live coverage, exclusive coverage at NetApp Insight 2017, it's theCUBE's coverage. I'm John Furrier, co-host, theCUBE co-founder of SiliconANGLE Media, with my co-host, Keith Townsend at CTO Advisor. Our next two guests is Gabe Chapman, Senior Manager, NetApp HCI, and Sidney Sonnier, who's the IT consultant at 4th and Bailey, also a member of the A-Team, a highly regarded, top-credentialed expert. Welcome to theCUBE, guys. Good to see you. >> Hey >> Thanks for having us. >> Thank you, good to be here. >> So love the shirt, by the way, great logo, good font, good, comes up great on the camera. >> Thank you. >> We're talking about the rise of the cloud and everything in between, kind of the segment. As a NetApp, A-Team member, and customer. It's here, cloud's here. >> Sidney: Yes >> But it's not yet big in the minds of the Enterprise because they got, it's a path to get there. So, there's public cloud going on, >> Sidney: Right. >> Hybrid clouds, everyone gets that. >> Sidney: Right. >> There's a lot of work to do at home inside a data center. >> Yes, there is, there's an extreme amount of work. And, like you said, these are very exciting times, because we have a blend of all of the technologies and being at an event like this allows us to look at those technologies, look at that fabric, look at that platform, and how we can merge all of those things into an arena that can allow any customer to dynamically move on-prem, off-prem, public cloud, private cloud, but still be able to manage and securely keep all their data in one specific place. >> Gabe, I want to get your thoughts, as he brings up a good point. Architecture's king, it's the cloud architect. Devop has gone mainstream. Pretty much, we all kind of can look at that and say, okay QED, Don, and everyone else put their plans together, but the Enterprises and the folks doing cloud, cloud service providers and everyone else, they have issues, and their plates are full. They have an application development mandate. Get more developers, new kinds of developers, retrain, re-platforming, new onboarding, open source is booming. They have security departments that are unbundling from IT in a way and fully staffed, reporting to the board of directors, top security challenges, data coverage, and then over the top is IoT, industrial IoT. Man, their plate's full. >> Sidney: Right. >> So architecture's huge, and there's a lot of unknown things going on that need to be automated. So it's a real challenge for architects. What's your thoughts. >> So you know, my thoughts about that is, I like to make this joke that there's no book called, The Joy of Menial Tasks. And there are so many of those menial tasks that we do on a day-in and day-out basis, in terms of the Enterprise, whether it's storage, whether it's virtualization, whether it's, whatever it is, right? And I think we've seen this massive shift towards automation and orchestration, and fundamentally the technologies that we're provisioning in today. APIs are king, and they're going to be kind of the focal point, as we move forward. Everything has to have some form of API in it. We have to be making a shift in a transition towards infrastructure as code. At the end of the day the hardware has relevance. It still does, it always will. But the reality is to abstract away the need for that relevance and make it as simple as possible. 
That's where we have things like hyper converged infrastructure being so at the forefront for so many organizations, NetApp making a foray into this space as well, is to push to simplify, as much as possible, the day-to-day minutiae and the infrastructure provisioning. And then, transition those resources over towards getting those next-generation data center applications up, running, and functional. >> There's an old adage in the industry around making things simple; it's like an aircraft carrier. But when you go below the waterline, everyone's in little canoes paddling, bumping into each other. These silos, if you will. >> Gabe: Right. >> And this is really the dynamic around cloud architecture, is where the operating model's changing. So, you got to be prepared to handle things differently. And storage, in the old days, I won't say it was easy, but you guys made it easy. A lot of great customers. NetApp has a long history there, but it's not about the storage anymore. It's the data fabric as you guys are talking about. It's the developer enablement. It's getting these customers to drive for themselves. It's not about the engine anymore, although you've got to have a good engine, call it tech, hardware, software together. But the ultimate outcome is the people driving the solutions are app guys. The lines of business are just under huge pressure and huge need. >> I think you can look at it this way. It's like we're kind of data-driven. You'll see Gene talk about that as part of our messaging. We can no longer be just a storage company. We need to be a data company and a data management organization as we start to have those conversations. Yes, you're going to go in there and talk to the storage administrators and storage teams, but there are the other 95% of people inside of the Enterprise, inside information technology, within different lines of business. They're the ones that we have the most relevant discussions with. That's where our message probably resonates more strongly in the data-driven aspect, or the management, or analytics, and all those other spaces. And I think that's the white space and growth area potential for NetApp, is the fact that we can go in there and have very authoritative discussions with customers around their data needs, and understanding governance. You have things like GDPR in EMEA. That's a giant open ecosystem; it has so many requirements and restrictions around it, and everybody's just now starting to wrap their head around it. So building a program around something like that, as well. So there's challenges for everybody. And there's even challenges for vendors like ourselves, because we had, we were mode one. Now we're mode two. So it's kind of like making that transition. And the old speeds and feeds were always, hey, how fast can you go, what do the files look like, with replication, blah, blah, blah. Now you've got solid, solid state storage. You got SolidFire. Now people want outcomes as a service. It's not a cliché anymore; things are happening very dynamically. And last week at Big Data NYC, our event around the big data world, you couldn't get any more clear that there's no more room for hype. They want real solutions now. Realtime is critical. And, now watching the keynotes here at NetApp, it's not speed that's featured, although there's a lot of work going on under the hood, it's really about competitive advantage. You're hearing words like data as a competitive advantage. >> Sidney: Yes. 
>> Sidney, you're in the field, you're in the front lines. Make sense of this. >> The sense that we have to make is, we made some great points. >> Gabe: Yes. >> Getting the business engaged is one thing, because you still, with the cloud and the cloud architecture, you still have a lot of individuals who are not necessarily sold on it all the way. So even from a technical perspective. So those guys that are down in the bottom of the boat, so to speak, you still have to kind of convince them because they feel somewhat uncomfortable about it. They have not all the way accepted it. The business has kind of accepted it in pockets. So being, having been on a customer's side and then going to more of a consulting side of things, you understand those pain points. So by getting those businesses engaged and then also engaging those guys to say, listen, it's freeing; the relevance of cloud architecture is not to eliminate a position, it's more to take the mundane tasks that you were more accustomed to doing and move you closer to the business, so that you can be more effective, and feel more of a participant, and have more value in that business. So that's-- >> So it's creating a value role for the-- >> Right, Right. >> The nondifferentiated tasks >> Absolutely. >> That were the mundane tasks, as you called them. >> Yes. >> You can then put that person now on, whether analytics or ... >> All those IoT things like you were mentioning, on those advanced projects, and use and leverage the dynamic capability of the cloud being able to go off-prem or on-prem. >> Alright, so what's the guiding principle for a cloud architecture? We'll have to get your thoughts on this because we talked, in a segment earlier with Josh, about how a good devops person sees automation opportunities and jumps on them like a grenade. There it is, take care of that business and automate it. How do you know what to automate? How do you architect around the notion of we might be continually automating things to shift the people and the process to the value? >> I think what it boils down to is the good cloud architect looks and sees where there are redundancies, things that can be eliminated, things that can be minimized, and sees where complexity is, and focuses to simplify as much of it as possible, right? So my goal has always been to abstract away the complexity, understand that it's there and have the requirements and the teams that can functionally build those things, but then make it look to you as if it were your iPhone, right? I don't know how the app store works. I just download the apps and use them. A good cloud architect does the same thing for their customers. Internally and externally, as well. >> So where does NetApp fit in there, from a product perspective? As a cloud architect, you're always wondering what should I build versus what should I buy? When I look at the open source projects out there, I see a ton of them. Should I go out and dive in head first into one of these projects? Should I look towards a vendor like NetApp to bring to bear that simplified version? Where is the delineation for those? >> So the way we see it is traditionally, there are kind of four consumption models that exist. There's an as-a-service model, or just-in-time model. There's converged and hyper converged, which we see as a consumption continuum that people leverage and utilize. There are best-of-breed solutions. Because if I want an object store, I want an object store, and I want it to do exactly what it does. 
That's an engineering solution. But then there's the as-a-service, I mean, I'm sorry, there's a software-defined component as well. And those are kind of the four areas. If you look at the NetApp product lines, we have an ONTAP set of products, and we have an Element OS set of products, and we have solutions that fit into each one of those consumption continuums, based on what the customer's characteristics are like. You may have a customer that likes configurability, so they would look at a traditional FlexPod with a FAS and say that's a great fit for me in terms of provisioning infrastructure. You may get other customers that are looking at, I want the next-generation data center, I want to provide block storage as a service, so they would look at something like SolidFire. Or you have the generalist team that looks at simplicity as the key driving factor, and time-to-value, and they look at hyper converged infrastructure. So there's a whole set. For me, when I have a conversation with a customer around build versus buy, I want to understand why they would like to build it versus buy it. Because I think that a lot of times people think, oh, I just download the software and I put it on a box. I'm like, well, right, that's awesome. Now you're in the supply-chain management business. Is that your core competency? Because I don't think it is, right? And so there's a whole bunch of things, like firmware management and all these things. We abstract away all of that complexity. That's the reason we charge a premium for a product: the fact that we do all that heavy lifting for the customer. We provide them with an engineered solution. I saw a lot of that when we really focused significantly on the OpenStack space, where we would come up and compete against Ceph. And I'm like, well, how many engineers do you want to dedicate to keeping Ceph up and running? I could give you a turnkey solution for a price premium, but you will never have to dedicate any engineers to it. So that's the trade-off. >> So on that point, I just want to follow up. A followup to that is your vision for OpenStack, which we're big fans of; as you know, we love OpenStack. In the beginning, the challenge with Hadoop and OpenStack early on, although that's kind of solved and the industry's evolved, is that the early stage had a cost of ownership problem. Which means you had the early tire kickers, early pioneers doing the work, and they iterated through it. So the question around modernization, which came up as a theme here: what are some modernization practices that I could take as a potential customer, or a customer of NetApp, whether I'm an existing customer or a future customer? I want to modernize, but I want to manage cost of ownership. And I want to have an architecture that's going to allow me to manage my data for that competitive advantage. So I want the headroom of knowing that it's not just about putting a data lake out there; I've got to make data realtime, and I don't know when and where it's going to be available. So I need kind of like a fabric or a layer, but I've got to have a modern infrastructure. What do I do, what's the playbook? >> So that's where that data fabric, again, comes in. It's like one of the keynotes we heard earlier in the General Session yesterday. We have customers now who are interested in buying infrastructure like we buy electricity, or like we buy Internet service at home.
So by us having this fabric, and it being associated with a brand like NetApp, it's opening up to the point where, what do you really want to do? That's the question we come to you and ask. And if you're into the modernization, we can provide you all the modernization tools right within this fabric, and seamlessly transition from one provider to the next, or plug into one platform or the next, or even put it on-prem. Whatever you want to do. But this will allow the effective management of the entire platform in one location, where you don't have to worry about a big team. You can take your existing team, and that's where that internal support will come in and allow people to kind of concentrate and say, oh, this is some really interesting stuff. Coming from the engineering side of things, and being on that customer side, when you go into customers, you can connect with those guys and help them to leverage this knowledge that they already have, because they're familiar with the products. They know the brand. So that makes it more palatable for them to accept. >> So from the cloud architect's perspective, as you look at it, you look at the data-driven fabric or data fabric, and you're like, wow, this is a great idea. Practically, where's the starting point? Is this a set of products? Is it an architecture? Where do I start to bite into this apple? >> So ultimately, I think, you look at it, and I approach it the same way, I would say, like, I can't just go and buy devops. >> Right. >> Right, but data fabric is still, it's a concept, but it's enabled by a suite of technology products. And we look at NetApp across our portfolio and see all the different products that we have. They all have a data fabric element to them, right? Whether it's a FAS, and SnapMirror snapping to ONTAP Cloud running in AWS. Whether it's how we're going to integrate with Azure, now with our NFS service that we're providing in there. Whether it's hyper converged infrastructure and the ability to move data off there. Our friend Dave McCrory talked about data having gravity, right, he coined that term. And it does, it does have gravity, and you need to be able to understand where it sits. We have analytics in place that help us craft that. We have a product called OCI that customers use. And what it does, it gives them actionable intelligence about where their data sits, where things may be inefficient. We have to start making that transition to not just providing storage, but understanding what's in the storage, the value that it has, and using it more like currency. We heard George talk about data as currency; it really is kind of the currency, and information is power, right? >> Yeah, Gabe, I mean, Gabe, this is right on the money. I mean, cryptocurrency and blockchain is a telltale sign of what's coming around the corner. A decentralized and distributed environment that's coming. That wave is way out there, but it's coming fast. So I want you to take a minute to talk about the cloud component. >> Sidney: Sure. >> Because you mentioned cloud. Talk about your relationship to the clouds, because multi cloud is coming, too. It's not there yet, but having something in every cloud means multi cloud in the sense of moving stuff around. And then talk about the customer perspective.
Because if I'm a customer, I'm saying to myself, okay, I have NetApp, I've got files everywhere, I've got ONTAP, they understand the management game, they know how to manage data on-prem, but now I've got this cloud thing going on, and I've got this shiny new toy start-up over there that's promised me the moon. But I've got to make a decision. You're laughing, I know you're thinking about it. This is the dilemma. Do I stay with what I know? >> Right. >> And what I know, is that relevant for where I'm going? A lot of times start-ups will have that pitch. >> Oh, yeah. >> Right. >> So address the cloud, and then talk about the impact on the customer around that choice. >> Ultimately, to me, it boils down to this in many respects. When I have a conversation with a customer, if I'm going to go for the bright and shiny, right, there has to be a very compelling business interest to do so. If I've built a set of tools and processes around data governance, management, implementation, movement, et cetera, around a bunch of on-premises technologies, and I want that same effect or that same look and feel in the public cloud, then that's how we transition there. I want to make it look like I'm using it here locally, but it's not on my site, it's somewhere else. It's being managed by somebody else, from a physical standpoint. I'm just consuming that information. But I also know I'd have to go back and retool everything I've spent the last 15 and 20 years building because something new and neat comes along. If that new and neat thing comes along and it abstracts away, or it makes a significant cost reduction or something like that, then obviously you're going to validate that, or look at and vet that technology out. But the reality is that we kind of have these-- >> Well, they don't want to recode, they don't want to retool, they'll rewrite code, but if you look at the clouds, AWS, Azure, and Google, top three in my mind, >> Sidney: Right. >> They all implement everything differently. They've got S3 over there, they've got it over here, so, like, I've got it resting on-prem, but then I've got to hire a devops team that's trained for Azure. Sidney, this is the reality. I mean, evolution might take care of this, but right now, customers have to know that. >> We're at a point right now where, for the customers and businesses we go to, realtime is very important. Software as a service is the thing now. So if you have a customer who is just clicking on a button, and they can't see that website or whatever your business is, that's a problem. You're going to lose money. You're going to lose customers, you're going to lose revenue. So what you have to do is, as a business, discover what you have internally. And once you discover that and really understand it as a business, not just the tech team, but the business actually understands that, move that forward and then blend some cloud technology in with a data fabric, because you're leveraging what you already have. Most of the time, they usually have a NetApp appliance of some sort. And with some of the new appliances that we do have, you can, say, have a small one spin up next to an old appliance, or use some of the OCI, or something of that nature, to help you migrate to something more dynamic, and the thing about it is to just make it more of a fluid transition. That's what you're looking to do. Uptime is everything. >> Yeah. >> Totally. >> This fabric will allow you to have that uptime so that you can propel your business and sustain your business.
Because you want to be able to still use what you have, and still get that ROI out of that technology, but by the same token, you want to be more dynamic than the competition, so that you can increase that business and still grow the business, but not lose any business. >> Sidney, you bring up a good point. In fact, we should do a followup segment on this, because what I'm hearing you say, and I've heard this many times in theCUBE, but it's happening, and certainly we're doing our part on theCUBE to help, but the tech guys, whether they're ops or devs, they're becoming more business savvy. They've got to get closer to the business. >> Sidney: You have to. >> But they don't want to get an MBA, per se, but they have to become a street MBA. >> Sidney: Right. >> They've got to get that business degree through scar tissue. >> Yes. You can't just be the tech anymore; you have to understand why your business is making this effort, why it's investing in this technology, why they would look to go to the public cloud if you can't deliver a service, and try to emulate that. We've seen that time and time again, the concept of shadow IT, and a shift away from resources. And if you want to be relevant long term, and not just be the guy that sits in the closet and then plugs in the wires, start learning about your business. Learn about how the business is run and how it generates revenue, and see what you can do to affect that. >> Yeah, and the jobs aren't going away. This nonsense about automation killing jobs. >> No, it's not. >> And they use the mainframe as an example, not really relevant, but kind of, but there are other jobs. I mean, look at cyber security, huge data aspect, impact story. >> Sure, it's huge. >> That paradigm is changing realtime. So good stuff, a lot of good business conversations we should do a followup on. I'll give you guys a final word in this segment. If you could each weigh in on what cloud architects should be doing right now. I mean, besides watching theCUBE, and watching you guys here. They've got to have the 20-mile stare. They've got to understand the systems that are in place. It's almost like an operating system model. They've got to see the big picture. Architecting on paper seems easy, but right now it's hard. What's your advice for cloud architects? >> I mean, I say continue to follow the trends. Continue to expose yourself to new technologies. I mean, I'm really interested in things like serverless and those types of technologies, and how we integrate our platforms into those types of solutions. Because that's kind of the next wave of things that are coming along, as we become more of an API-driven ecosystem, right? So if it's infrastructure, if it's code, if everything is a just-in-time instance spin-up, how do I have the communications between those technologies? You've just got to stay well ahead of the curve and, you know ... >> John: Sidney, your thoughts? >> My thoughts are along those lines. Not only from a technical perspective, but also, like you were talking about, that business perspective. Understand your business needs, and be able to provide a portfolio, or a suite of tools, that will help that business take that next step. And that's where that value is. So it's kind of like a blend. You're more of a hybrid, where you're coming in not only as a technical person, but you're coming in to assist the business and develop it and help it take its next step. >> John: And IT is not a department anymore, it's everywhere. >> No, it's not.
>> It's integrated. >> It is the business. >> Yes. >> Guys, great conversation here on the future of the cloud architect, inside theCUBE at NetApp Insight 2017 at the Mandalay Bay in Las Vegas, theCUBE's coverage. We'll be right back with more after this short break. (techno music) (fast and furious music)
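A side note on the multi-cloud point John and Sidney raise above: the major clouds do implement object storage differently, and the usual way architects avoid retooling per cloud is a thin abstraction layer. The sketch below is a minimal, hypothetical illustration of that idea in Python, not NetApp's data fabric, using the standard boto3 and azure-storage-blob client libraries; the bucket, container, and connection-string values are placeholders.

```python
# Illustrative only: a thin facade over two object stores, so application code
# is written once instead of being retooled per cloud. The bucket, container,
# and connection-string values are placeholders.
from abc import ABC, abstractmethod

import boto3
from azure.storage.blob import BlobServiceClient


class ObjectStore(ABC):
    """One interface the application codes against, whichever cloud backs it."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class S3Store(ObjectStore):
    def __init__(self, bucket: str):
        self._s3 = boto3.client("s3")  # picks up ambient AWS credentials
        self._bucket = bucket

    def put(self, key: str, data: bytes) -> None:
        self._s3.put_object(Bucket=self._bucket, Key=key, Body=data)

    def get(self, key: str) -> bytes:
        return self._s3.get_object(Bucket=self._bucket, Key=key)["Body"].read()


class AzureBlobStore(ObjectStore):
    def __init__(self, conn_str: str, container: str):
        svc = BlobServiceClient.from_connection_string(conn_str)
        self._container = svc.get_container_client(container)

    def put(self, key: str, data: bytes) -> None:
        self._container.upload_blob(name=key, data=data, overwrite=True)

    def get(self, key: str) -> bytes:
        return self._container.download_blob(key).readall()


def archive_report(store: ObjectStore, report: bytes) -> None:
    # The application logic is identical whether the bytes land in S3 or Azure.
    store.put("reports/latest.bin", report)
```

With this shape, swapping S3 for Azure Blob becomes a configuration decision rather than a rewrite of application code, which is the point Gabe makes about abstracting complexity away from the consumer.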
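Gabe's closing advice also mentions serverless and an API-driven, just-in-time model. For readers who have not touched it, here is a minimal sketch of what that looks like in practice: the common shape of an AWS Lambda handler behind an API Gateway proxy integration. The event fields and the "name" parameter are illustrative assumptions, not anything discussed in the interview.

```python
# Illustrative only: the standard shape of an AWS Lambda handler behind an
# API Gateway proxy integration. The 'name' query parameter is a placeholder.
import json


def handler(event, context):
    # 'event' carries the trigger payload; for an API Gateway proxy call,
    # query parameters arrive under 'queryStringParameters'.
    params = (event or {}).get("queryStringParameters") or {}
    name = params.get("name", "world")

    # API Gateway proxy integrations expect this statusCode/headers/body shape.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```

There is no server to provision or patch; the platform spins up an instance just in time for each event, which is the "just-in-time instance spin-up" idea Gabe describes.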

Published Date : Oct 4 2017


Rob Young, Red Hat Product Management | VMworld 2017


 

>> Narrator: Live from Las Vegas. It's theCUBE. Covering VMworld 2017. Brought to you by VMware and its ecosystem partners. (bright pop music) >> Welcome back to theCUBE on day three of our continuing coverage of VMworld 2017. I'm Lisa Martin. My co-host for this segment is John Troyer, and we're excited to be joined by Rob Young, who is a CUBE alumnus and the manager of product and strategy at Red Hat. Welcome back to theCUBE, Rob. >> Thanks, Lisa. It's great to be here. >> So, Red Hat and VMware. You've got a lot of customers in common. I imagine you've been to many, many VMworlds. What are you hearing from some of the folks that you're talking to during the show this week? >> So, a lot of the interest that we're seeing is how Red Hat can help customers, VMware or otherwise, continue to maintain mode one applications, legacy applications, while planning for mode two, more cloud-based deployments. We're seeing a large interest in open-source technologies and how that model could work for them to lower cost, to innovate more quickly, to deliver things in a more agile way. So there's a mixture of messages that we're getting, but we're receiving them loud and clear. >> Excellent. You guys have a big investment in OpenStack. >> Yes, we do, and even back in the early days when OpenStack was struggling as a technology, we recognized that it was an enabler for customers, partners, large enterprises that wanted to create and maintain their own private clouds, or even to have a hybrid cloud environment where they maintained, managed, and controlled some aspect of it while having some of the workloads on a public cloud environment as well. So Red Hat has invested heavily in OpenStack to this point. We're now in our 11th version of Red Hat OpenStack Platform, and we continue to lead that market as far as OpenStack development, innovation, and contributions. >> Rob, we were with theCUBE at the last OpenStack Summit in Boston. Big Red Hat presence there, obviously. I was very impressed by the maturity of the OpenStack market and community. I mean, we're past the hype cycle now, right? We're down to real people, real uses, real people using it. A lot of people with a strong business-critical investment in OpenStack and many different use cases. Can you kind of give us a picture of the state of the OpenStack market and userbase now that we are past that hype cycle? >> So, I think what we're witnessing now in the market is that there's a thirst for OpenStack. One, because it's a very efficient architecture. It's very extensible. There's a tremendous ecosystem around the Red Hat distribution of OpenStack, and what we're seeing from enterprises, specifically the TelCo industry, is that they see OpenStack as a way to lower their cost and raise their margins in a very competitive environment. So anywhere you see an industry or a vertical where there's very heavy competition for customers and eyeballs, that type of thing, OpenStack is going to play a role, and if it's not already doing so, it's going to be there at some point, because of the simplification of what was once complex, but also the cost savings that can be realized by managing your own cloud within a hybrid cloud environment. >> You mention TelCo, and specifically OpenStack's value for companies that need to compete for customers. Besides TelCo, what other industries are really kind of primed for embracing OpenStack technologies?
So, we're seeing it across many industries: finance and banking, healthcare, public sector, anywhere there's an emphasis on the move to open source, to an open compute environment, open APIs. We're seeing tremendous growth in traction, and because Red Hat has been the leader in Linux, many of these same customers who trust us for Red Hat Enterprise Linux are now looking to us for the very same reason on the OpenStack platform, because much like we have done with Enterprise Linux, we have adopted an upstream, community-driven project. We have made it safe to use within an environment in an enterprise way, in a supported way as well, via the subscription. So, many industries, many verticals. We expect to see more, but the primary use cases, NFV and TelCo, healthcare, banking, public sector, are among the top dogs out there. >> Is there a customer story that kind of stands out in your mind as really a hallmark that showcases the success of working with Red Hat and OpenStack? >> Well, there are many customers, there are many partners that we have out there that we work with, but I would say that four out of five of the large TelCos - Orange, Ericsson, Nokia, others that we've recently done business with - would be really good examples of not only customer use cases but how they're using OpenStack to enable their customers to have a better experience with their cell networks, with their billing, with their availability, that type of thing. And we had two press announcements that came out in May. One is a consortium of educational institutions, very high-profile Northeast public learning institutions, that are now standardized on OpenStack and that are contributing, and we've also got Oak Ridge - forgive me, it escapes me, but there's a case study out there on the Red Hat website that was posted on May the eighth that depicts how they're using our product and how others can do the same. >> Rob, switching over a little bit to talking a little bit more about the tech and how the levers get pulled, right, we're talking about cloud, right, another term, "past the hype cycle," right? It's a reality. And when you're talking about cloud, you're talking about scale. >> Rob: Yes. >> We mentioned Linux, OpenStack, and Red Hat kind of built on a foundation of Linux; it's super solid, super huge community, super rich, super long history. But can you talk about scale up, scale out, data center, public cloud, private? How are you seeing enterprises of various sizes address the scale problem, using technologies like the Red Hat cloud stack to address that? >> So there's a couple of things; there are many aspects to that question, but what we have seen from OpenStack is, when we first got involved with the project, it was very much bounded by the number of servers that you needed to deploy an OpenStack infrastructure on. What Red Hat has done, or what we've done as a company, is we've looked at the components and we have unshackled them from each other, so that you can scale individual storage, individual network, individual high availability, on the number of servers that best fit your needs. So if you want to have a very large footprint with, you know, many nodes of storage, you can do that. If you want to scale that just when peak season hits, you can do that as well. But we have led the community efforts to de-shackle the dependencies between components, so from that aspect we have scaled the technology, and now we're scaling operational capabilities and skillsets as well.
We've also led the effort to create open APIs for management tools. We've created communities around the different components of OpenStack and other open source technologies - >> Automation's a big part of that as well, right? >> Automation as well. So if you look at Ansible as an example, Red Hat has a major stake in Ansible, and it is predominantly the management scripting language of choice, or the management platform of choice, so we have baked that into our products. We have made it very simple for customers to deploy not only things like OpenStack but OpenShift, CloudForms, and other management capabilities that we have, but we've also added APIs to these products so that even if you choose not to use a Red Hat solution, you can easily plug a third-party solution or a home-grown solution into our framework or our stack, so that you can use our toolset, a single pane of glass, to manage it all. >> So with that, can you tell us a little bit about the partner ecosystem that Red Hat has, and what it sounds like you've done to expand that to make your customers successful in OpenStack deployments? >> Absolutely. So as you're aware, with Red Hat Enterprise Linux we certified most of the hardware, or all of the hardware OEMs, on Red Hat Enterprise Linux. We have a tremendous ecosystem around Enterprise Linux. For OpenStack, this is probably one of the most exciting aspects of Red Hat right now. If you look at the ecosystem and the partners that are just around OpenStack on its own, we've got an entire catalog of hundreds of partners, some at a deeper level than others, integration-wise, business-wise, whatever, but the ecosystem is growing, and it's not just because of Red Hat's efforts. We have customers and partners that are coming to us saying, we need a storage solution, we're using, you know, NetApp as an example. You need to figure out a way to integrate with these guys, and certify it, make sure that it's something that we've already invested in, that it's going to work with your product as well as it works with our legacy stuff. So the ecosystem around OpenStack is growing. We're also looking at growing the ecosystem around OpenShift, around Red Hat Virtualization as well, so I think you'll see a tremendous amount of overlap in those ecosystems as well, which is a great thing for us. The synergies are there, and I just think it's only going to help us multiply our efforts in the market. >> Go ahead, John. >> Oh, Rob, talking again about partnerships, I've always been intrigued by the role of open source upstream, the open source community, and the role of the people that take that open source and then package it for customers and do the training and enablement. So can you talk maybe a little bit about some of the open source partners, and maybe the role of Red Hat in translating all that upstream code into a product that is integrated, has training, and is available for consumption from the IT side? >> Sure. So at Red Hat, we partner not only with open source community members and providers but also with proprietary providers. So I just want to make sure that everybody understands we're not exclusive in who we partner with. Upstream, we look for partners that have the open source spirit in mind, so for everything that they're doing that they're asking us to either consider as a component within our solution or to integrate with, we're going to make sure that they're, to the letter of the law, contributing their code back, and there's no hooks or strings attached.
Really, the value comes in whether they're providing value to their customers with the contribution, and also to our combined customers. And what we're seeing in our partnerships is that many of our partners, even proprietary partners like Microsoft as an example, are looking at open source in a different way. They're providing open source options for their customers, and subscription-based, consumption-based models as well, so we hope that we're having a positive impact in that way. Because if you look at our industry, it's really headed toward the open source, open API, open model, and the proprietary model still has its place and time, I believe, but I think it's going to diminish over time, and open source is going to be just the way people do business together. >> One of the things that you were talking about kind of reminded me of one of the things Michael Dell said yesterday during the keynote with Pat Gelsinger, and that was about innovation: that companies, to be successful, really need to be innovating with their customers, and it sounds like that's definitely one of the core elements of what you're doing with customers. You said customers and partners are bringing us together to really drive that innovation. >> Yeah, I couldn't agree more. It's an honor to be mentioned in the same breath as Michael Dell, by the way. But what we see is, because of the open source model, you can release early and often, and you can fail early, and what that does is encourage innovation. So it's not only corporations like Red Hat that are contributing to upstream projects, OpenStack as an example, or Linux as an example, or KVM as an example. There's also college students, there's people out there who work for Bank of America, across the fruited plain, all over the world. And the one thing that unites us is this ability to recognize the value of our contributions to an open source community, and we think that that really helps with agile development and agile delivery. And if you look at our project deliveries for OpenStack as an example: OpenStack releases a major version of its product every six months, and because of contributions that we get from our community, we're able to release ours - and testing, it's not just code, contributions come in many forms. Testing is a huge part of that. Because of the testing we get from a worldwide community, we're able to release shortly after a major version of upstream OpenStack, because of that innovation. In a pure waterfall model, it's not even possible. In an open source model, it's just the way of life.
The other thing I see is that there's a tremendous thirst within the VMware customer base to learn more about open source and learn more about how they can, you know, leverage some of this not only to lower their total cost of ownership and not to replace VMware, but how they can complement what they've already invested in with faster, more agile-based mode two development. And that's where we see the market from a Red Hat standpoint. >> Excellent. Well there's a great TEI study that you guys did recently, Total Economic Impact, on virtualization that folks can find on the website. And Rob, we thank you for sticking around and sharing some of your insights and innovations that Red Hat is pioneering and we look forward to having you back on the show. >> Great to be here. Thanks. >> Absolutely, and for my co-host John Troyer, I'm Lisa Martin and you're watching theCUBE's continuing coverage, day three, of VMworld 2017. Stick around, we'll be right back. (bright pop music)
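To make Rob's point about OpenStack's independently scalable components a bit more concrete, here is a minimal, hypothetical sketch using the openstacksdk Python library. The "mycloud" entry, image, flavor, and network names are placeholders, not a specific Red Hat OpenStack Platform deployment.

```python
# Illustrative only: provisioning against an OpenStack cloud with the
# openstacksdk library. The "mycloud" entry (from clouds.yaml) and the image,
# flavor, and network names are placeholders, not a specific deployment.
import openstack


def boot_worker(name: str) -> None:
    conn = openstack.connect(cloud="mycloud")  # credentials come from clouds.yaml

    server = conn.create_server(
        name=name,
        image="rhel-9",         # hypothetical image name
        flavor="m1.small",      # hypothetical flavor
        network="private-net",  # hypothetical tenant network
        wait=True,              # block until the instance is ACTIVE
    )
    print(f"{server.name} is {server.status}")

    # Compute, block storage, and networking are separate services behind
    # separate APIs, which is what allows each tier to scale on its own nodes.
    for vol in conn.block_storage.volumes():
        print(vol.name, vol.size)


if __name__ == "__main__":
    boot_worker("demo-worker-01")
```

Because compute, storage, and networking sit behind distinct service APIs, each tier can be grown on its own node count, which is the "unshackled" scaling Rob describes.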
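Rob also calls out Ansible as the management layer baked into Red Hat's products. As a rough illustration of that automation idea, and not Red Hat's actual product integration, the sketch below drives a playbook run from Python by shelling out to the standard ansible-playbook CLI; the playbook and inventory paths are hypothetical.

```python
# Illustrative only: kicking off an Ansible playbook run from Python by
# shelling out to the standard ansible-playbook CLI. The playbook and
# inventory paths are placeholders.
import subprocess
import sys


def run_playbook(playbook: str, inventory: str) -> int:
    cmd = ["ansible-playbook", "-i", inventory, playbook]
    result = subprocess.run(cmd, capture_output=True, text=True)
    sys.stdout.write(result.stdout)
    if result.returncode != 0:
        sys.stderr.write(result.stderr)
    return result.returncode


if __name__ == "__main__":
    # e.g. a playbook that installs and configures a service across an inventory
    raise SystemExit(run_playbook("site.yml", "inventory.ini"))
```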

Published Date : Sep 5 2017
