Manufacturing - Drive Transportation Efficiency and Sustainability with Big Data | Cloudera


 

>> Welcome to our industry drill-down; this one is for manufacturing. I'm here with Michael Ger, who is the managing director for automotive and manufacturing solutions at Cloudera, and in this first session we're going to discuss how to drive transportation efficiencies and improve sustainability with data. Connected trucks are fundamental to optimizing fleet performance and costs, and to delivering new services to fleet operators. Michael is going to present some data and information, and then we'll come back and have a conversation about what we just heard. Michael, great to see you! Over to you.

>> Thank you, Dave, and I appreciate having this conversation today. Connected trucks are an area where we have seen a lot of action here at Cloudera, and I think the reason is important. First of all, this change is happening very quickly: 150% growth is forecast by 2022. And the reason we're seeing so much action and growth, I think, is that there are a lot of benefits. We're talking about a B2B situation here: truck makers providing benefits to fleet operators. If you look at the top benefits that fleet operators expect, as you see in the graph here, almost 80% of them expect improved productivity: things like improved routing, so route efficiencies, improved customer service, and decreased fuel consumption. This isn't better technology for technology's sake; these connected trucks are coming onto the marketplace because they can provide tremendous value to the business, and in this case we're talking about fleet operators and fleet efficiencies.

Trucks are becoming connected because, at the end of the day, we want to provide fleet efficiencies through connected-truck analytics and machine learning. Let me explain a little of what we mean by that, because it happens by creating a connected-vehicle analytics and machine-learning lifecycle, and to do that you need a few different things. You start, of course, with connected trucks in the field, and you could have many of these trucks, because typically you're working at both a truck level and a fleet level, and you want to do analytics and machine learning to improve performance. The first thing you need is to connect to those trucks: you have to have an intelligent edge where you can collect information from them. And once you collect that information, you want to analyze the data in real time and take real-time actions. Now, the ability to take those real-time actions, as I'm going to show you, is actually the result of your machine-learning lifecycle; let me explain what I mean by that. We have these trucks, and we start to collect data from them. At the end of the day, we'd like to pull that data into your data center or into the cloud, where we can do more advanced analytics. We start by ingesting that data into the cloud, into the enterprise data lake, and we store it.
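A minimal sketch of this edge-collection step, in Python, might look like the following. The ingest endpoint, field names, and sampling interval are hypothetical stand-ins, not part of Cloudera's actual stack; in the architecture described here, an intelligent edge agent such as Apache MiNiFi plays this role.

```python
import json
import random
import time
import urllib.request

# Hypothetical cloud ingest endpoint; a real deployment would point at
# whatever ingestion interface the enterprise data lake exposes.
INGEST_URL = "https://example.com/fleet/ingest"

def read_sensors(truck_id: str) -> dict:
    """Simulate one reading from a truck's onboard sensors."""
    return {
        "truck_id": truck_id,
        "ts": time.time(),
        "engine_temp_c": random.gauss(90, 5),
        "oil_pressure_kpa": random.gauss(300, 20),
        "speed_kph": random.uniform(0, 110),
    }

def forward_batch(batch: list) -> None:
    """Ship a batch of readings to the cloud for storage and analytics."""
    req = urllib.request.Request(
        INGEST_URL,
        data=json.dumps(batch).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        urllib.request.urlopen(req)
    except OSError as exc:  # network errors; retry logic omitted for brevity
        print(f"ingest failed, dropping batch: {exc}")

def edge_loop(truck_id: str, batch_size: int = 10, samples: int = 50) -> None:
    """Collect readings at the edge and forward them in batches."""
    batch = []
    for _ in range(samples):
        batch.append(read_sensors(truck_id))
        if len(batch) >= batch_size:
            forward_batch(batch)
            batch = []
        time.sleep(1.0)  # sampling interval

if __name__ == "__main__":
    edge_loop("truck-0042")
```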
We want to enrich it with other data sources. For example, if you're doing truck predictive maintenance, you take the sensor data you've collected from those trucks and augment it with, say, your dealership service information. Now you have sensor data and the resulting repair orders, and you're equipped to do things like predict when maintenance will be needed: you've got all the data sets you need for that. So what do you do? As I said, you're ingesting and storing the data, and enriching it. You're processing it, aligning, say, the sensor data with the transactional data from your repair and maintenance systems, and bringing it together so that you can do two things. First, you can do self-service BI on that data, things like fleet analytics. But more importantly, as I was saying before, you now have the data sets to create machine-learning models: if you have the sensor values and, for example, the resulting dealership repairs, you can start to correlate which sensor values predicted the need for maintenance, and you can build out those machine-learning models. And then, as I mentioned, you can push those models back out to the edge, which is how you take the real-time actions I mentioned earlier: as data comes through in real time, you run it against that model and can take real-time actions on it. This analytics and machine-learning lifecycle is exactly what Cloudera enables: the end-to-end ability to ingest data, store it, put a query layer over it, create machine-learning models, and then run those models in real time. That's what we do as a business.

One example of a customer we have worked with to deliver these types of results is Navistar. Navistar was an early adopter of connected-truck analytics, and they provided these capabilities to their fleet operators. They started off by connecting 475,000 trucks, and they're up to well over a million now. The point is that they were centralizing data from their trucks' telematics service providers, bringing in things like weather data, and then they started to build out machine-learning models aimed at predictive maintenance. What's really interesting is that Navistar made tremendous strides in reducing the expense associated with maintenance. Rather than waiting for a truck to break and then fixing it, they would predict when the truck needed service, with condition-based monitoring, and service it before it broke down, which can be done in a much more cost-effective manner. And you can see the benefits: they reduced maintenance costs by 3 cents a mile, from the industry average of 15 cents a mile down to 12 cents a mile. This was a tremendous success for Navistar, and we're seeing it across many of our truck manufacturers.
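The enrichment-and-training step described here, correlating sensor values with subsequent repair orders, can be sketched roughly as follows. The column names and the tiny in-memory tables are invented for illustration; they are not an actual Navistar or Cloudera schema, and a real pipeline would train on fleet-scale history rather than four rows.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical per-truck sensor aggregates (the telemetry side).
sensors = pd.DataFrame({
    "truck_id": [1, 2, 3, 4],
    "avg_engine_temp_c": [92.1, 104.3, 88.7, 101.9],
    "oil_pressure_var": [4.2, 9.8, 3.1, 8.5],
})

# Hypothetical dealership repair orders (the transactional side).
repairs = pd.DataFrame({
    "truck_id": [2, 4],
    "repair_within_30d": [1, 1],
})

# Enrichment: align the sensor data with the resulting repair orders.
data = sensors.merge(repairs, on="truck_id", how="left")
data["repair_within_30d"] = data["repair_within_30d"].fillna(0).astype(int)

X = data[["avg_engine_temp_c", "oil_pressure_var"]]
y = data["repair_within_30d"]

# Correlate which sensor values predicted the need for maintenance.
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# The fitted model is what would be serialized and pushed back to the
# edge, where fresh readings are scored in real time.
print(model.predict_proba(X))
```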
We're working with many of the truck OEMs, and they are all working to achieve very similar benefits for their customers. So, that's a little bit about Navistar. Now we're going to turn to Q&A; Dave's got some questions for me in a second. But before we do that: if you want to learn more about how we work with connected vehicles and autonomous vehicles, please go to our website, at the URL you see up on the screen, cloudera.com/solutions/manufacturing. You'll see a whole slew of collateral and information, in much more detail, on how we connect trucks for fleet operators and provide analytics use cases that drive dramatically improved performance. With that being said, I'll turn it over to Dave for questions.

>> Thank you, Michael. That's a great example, and I love the lifecycle; you can visualize it very well. You've got an edge use case doing real-time inference at the edge, then you're blending that sensor data with other data sources to enrich your models, and you can push that back to the edge. That's the lifecycle, so I really appreciate that info. Let me ask you: when you think about connected-vehicle analytics and machine learning, what are the most common use cases you see customers really leaning into?

>> That's a great question, Dave, because everybody always thinks of machine learning as the first thing you go to. Actually, it's not. The first thing many of our customers do is simply connect their trucks, or their vehicles, or whatever their IoT asset is, and then do very simple things like performance monitoring of the piece of equipment. In the truck industry there's a lot of performance monitoring of the truck, but also of the driver: how is the driver performing? Is there a lot of idle time? What does route efficiency look like? By connecting the vehicles you get insights, as I said, into the truck and into the driver, and that's not even machine learning. But that monitoring piece is really, really important, so the first thing we see is monitoring types of use cases. Then you start to see companies move toward what I call the machine-learning and AI models, where you're using inference at the edge, and you start to see things like predictive maintenance and real-time route optimization: that evolution toward smarter, more intelligent, dynamic types of decision-making. But let's not minimize the value of good old-fashioned monitoring; it gives you that visibility first, and then you move to smarter use cases as you go forward.

>> You know, it's interesting: when you talked about monitoring, I was envisioning the bumper sticker, "How am I driving?" The only time somebody probably calls is when they get cut off. And people might think, "Oh, it's about Big Brother," but it's not; it's really about improvement, training, and continuous improvement. And then, of course, route optimization: that's bottom-line business value.
So I love those examples.

>> Great!

>> I wonder, what are the big hurdles people should think about when they want to jump into the use cases you just talked about? What are they going to run into, the blind spots they're going to get hit with?

>> There are a few different things. First of all, a lot of times your IT folks aren't familiar with the more operational, IoT types of data, so just connecting to that data can require a new skill set: there is very specialized hardware in the vehicle, and specialized protocols. That's number one, the classic IT/OT conundrum that many of our customers struggle with. More fundamentally, if you look at the way these connected-truck or IoT solutions started, the first generation were often very custom-built, so they were brittle, kind of hardwired. Then, as you moved toward more commercial solutions, you had what I call the silo problem: fragmentation, with this capability from one vendor and that capability from another; you get the idea. One of the things we really think needs to be brought to the table is, first of all, an end-to-end data management platform that is integrated and tested together, with data lineage across the entire stack. But also, to be realistic, we have to be able to integrate with industry best practices in terms of solution components in the vehicle, the hardware, and all those types of things. Stepping back for a second, I think there has been fragmentation and complexity in the past, and we're moving toward more standards and more standardized offerings. Our job as a software maker is to make that easier and connect those dots, so customers don't have to do it all on their own.

>> And you mentioned specialized hardware. One of the things we heard earlier on the main stage was your partnership with Nvidia: we're talking about new types of hardware coming in, and you're optimizing for that. We see the IT and OT worlds blending together, no question. And then that end-to-end management piece: this is different, you're right, from IT, where normally everything's controlled in the data center; this is rethinking how you manage metadata. So, in the spirit of what we talked about earlier today, are you working with other technology partners to accelerate these solutions and move them forward faster?

>> I'm really glad you're asking that, Dave, because we actually embarked on a project called Project Fusion. When you look at that connected-vehicle lifecycle, there are some core vendors out there providing very important capabilities, so we joined forces with them to build an end-to-end demonstration and reference architecture to enable the complete data management lifecycle. Cloudera's piece of this was ingesting the data, storing it, and the machine learning, all the things I talked about; we provide that end to end. But we wanted to work with some key partners, and one of the partners we integrated with was NXP.
NXP provides the service-oriented gateways in the car; that's the hardware in the vehicle. Wind River provides the in-car operating system: a Linux that's hardened and tested. We then ran our Apache MiNiFi agent, which is part of Cloudera DataFlow, in the vehicle, right on that operating system and that hardware. We pumped the data over into the cloud, where we did all the data analytics and machine learning and built out these very specialized models. And then, once we built those models, we used a company called Airbiquity, which specializes in automotive over-the-air updates, to push those updated models back out to the vehicle very rapidly. So what we said is: look, there's an established ecosystem, if you will, of leaders in this space, and we wanted to make sure Cloudera was part and parcel of that ecosystem. And by the way, you mentioned Nvidia as well; we're working closely with Nvidia now, so when we're doing the machine learning we can leverage some of their hardware to get still further acceleration on the machine-learning side of things. One of the things I always say about these types of use cases is that it takes a village, and what we've really tried to do is build out an ecosystem that provides that village, so we can speed that analytics and machine-learning lifecycle up just as fast as it can be.

>> This is, again, another great example of data-intensive workloads. It's not your grandfather's ERP running on traditional systems; these are really purpose-built, maybe customizable for certain edge use cases, low-cost, low-power, and they can't be bloated. And you're right, it does take an ecosystem: you've got to have APIs that connect, and that takes a lot of work and a lot of thought. So that leads me to the technologies underpinning this. We've talked a lot on theCUBE about semiconductor technology, how that's changing, and the advancements we're seeing there. What do you see as some of the key technology areas advancing this connected-vehicle machine learning?

>> You know, it's interesting; I'm seeing it in a few notable places. First of all, the vehicle itself is getting smarter. Look at the NXP type of gateway we talked about: that used to be a dumb gateway that really just pushed data up and down and provided isolation from the lower-level subsystems; it was security and basic communication. That gateway is now becoming what they call a service-oriented gateway: it's got disk, it's got memory, so now you can run serious compute in the car, things like machine-learning inference models; you have a lot more power in the vehicle. At the same time, 5G is making it possible to push data fast enough to make low-latency computing available even on the cloud. So now you've got incredible compute both at the edge, in the vehicle, and on the cloud. And on the cloud you've got partners like Nvidia, who are accelerating it still further through better GPU-based computing.
So across the whole stack, if you look at that machine-learning lifecycle we talked about, Dave, there are improvements at every step along the way; we're starting to see technology optimization pervasive throughout the cycle.

>> And then, real quick, it's not a quick topic, but you mentioned security. We've seen a whole new security model emerge; there is no perimeter anymore in a use case like this, is there?

>> No, there isn't. And remember, we're the data management platform, and one thing we have to provide is end-to-end lineage of where that data came from, who can see it, and how it changed. That's something we have integrated from the beginning: from when the data is ingested, through when it's stored, through when it's processed and people are doing machine learning, we provide that lineage, so security and governance are assured throughout the data lifecycle.

>> And federated, in this example, across the fleet. All right, Michael, that's all the time we have right now. Thank you so much for that great information; really appreciate it.

>> Dave, thank you. And thanks to the audience for listening in today.

>> Yes, thank you for watching. Keep it right there.
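Putting the in-vehicle pieces of the Project Fusion architecture together, an edge agent running an onboard model and periodically accepting over-the-air updates could be sketched like this. Everything here, the scoring rule, the threshold, and the update hook, is a hypothetical stand-in for the NXP / Wind River / MiNiFi / Airbiquity stack described in the interview.

```python
import random
import time

class EdgeModel:
    """Stand-in for a trained model artifact delivered over the air."""

    def __init__(self, version: int, threshold: float):
        self.version = version
        self.threshold = threshold

    def maintenance_risk(self, reading: dict) -> float:
        # Toy scoring rule standing in for a real trained model.
        return max(0.0, (reading["engine_temp_c"] - 90.0) / 30.0)

def check_for_ota_update(current: EdgeModel) -> EdgeModel:
    """Poll for a newer model version; a real agent would download,
    verify, and hot-swap the artifact (the Airbiquity role here)."""
    return current  # no-op in this sketch

def inference_loop(model: EdgeModel, iterations: int = 10) -> None:
    """Score each new reading in-vehicle and act on it in real time."""
    for _ in range(iterations):
        reading = {"engine_temp_c": random.gauss(95, 10)}
        risk = model.maintenance_risk(reading)
        if risk > model.threshold:
            # Real-time action: flag the truck for condition-based service.
            print(f"model v{model.version}: risk {risk:.2f}, alerting fleet ops")
        model = check_for_ota_update(model)
        time.sleep(0.1)

if __name__ == "__main__":
    inference_loop(EdgeModel(version=1, threshold=0.3))
```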

Published Date : Aug 3 2021



Rick Farnell, Protegrity | AWS Startup Showcase: The Next Big Thing in AI, Security, & Life Sciences


 

(gentle music) >> Welcome to today's session of the AWS Startup Showcase, The Next Big Thing in AI, Security, & Life Sciences. Today we're featuring Protegrity for the life sciences track. I'm your host for theCUBE, Natalie Erlich, and we're joined by our guest, Rick Farnell, the CEO of Protegrity. Thank you so much for being with us.

>> Great to be here. Thanks so much, Natalie; great to be on theCUBE.

>> Great. So we're going to talk today about the ransomware game and how it has changed with kinetic data protection. The title of today's video segment makes a bold claim: how are kinetic data and ransomware connected?

>> So first off, kinetic data is data in use: it's moving, it's not static, it's no longer sitting still, and your data protection has to adhere to those same standards. And if you look at what's happening in ransomware attacks, there are a couple of different things going on. Number one, bad actors are getting access to data in the clear, and they're holding that data ransom and threatening to release it. From a Protegrity standpoint, with our protection capabilities, that data would be rendered useless to them in that scenario. So there are lots of ways in which data protection and backup, mixed together, make a really wonderful solution to the threat of ransomware. And it's a serious issue, and it's not just targeting the most highly regulated industries and customers: we're seeing attacks on pipeline and ferry companies, and there is really no end to where some of these bad actors are focusing. The damages can be in the hundreds of millions of dollars and last for years afterward in brand reputation. If you look at how data is used today, there are opposing forces: the business wants to use data at the speed of light to produce more machine learning and more artificial intelligence, to predict where customers are going to be, and to have wonderful services at their fingertips. But at the same time, they really want to protect their data, and sometimes those architectures can be at odds. At Protegrity, we're really focused on solving that problem: free up your data to be used in artificial intelligence and machine learning, while making sure it is absolutely bulletproof against some of these ransomware attacks.

>> Yeah, you bring up a really fascinating point that's central to your business. Could you tell us more about how you actually make that data worthless? That sounds really revolutionary.

>> It sounds novel, right? To make your data worthless in the wrong hands. From a Protegrity perspective, our policy and protection capability follows the individual piece of data no matter where it lives in the architecture. We do a ton of work, as the world does, with Amazon Web Services, so helping customers blend their hybrid cloud strategies, their on-premises environments and their use of AWS, is something we thrive at. And we protect that data not just at rest or while it's in motion: it's a continuous protection policy, where we can preserve the privacy of the data but still keep it unique for use in downstream analytics and machine learning.

>> Right. Well, traditional security is rather stifling, so how can we fix this, and what are you doing to amend that?
>> Well, we certainly play a big role in the cybersecurity world, but like any industry, there are many layers. Traditional cybersecurity investment has been at the perimeter level and at the network level, keeping bad actors out; but once people do get through some of those fences, if your data is not protected at a fine-grained level, they have access to it. From our standpoint, yes, we're the last line of defense, but at the same time we partner with folks in the cybersecurity industry, with AWS, and with others in backup and recovery to give customers that level of protection while still allowing their kinetic data to be utilized in downstream analytics.

>> Right. Well, I'd love to hear more about the types of industries you're helping, and specifically healthcare, obviously a really big subject this year and probably for years to come. How is this industry using kinetic protection at the moment?

>> Certainly, as you mentioned, some of the most highly regulated industries are our sweet spot: financial services, insurance, online retail, and healthcare, or any industry that has sensitive data and sensitive customer data. Think first name, last name, credit card information, national ID number, social security number, blood type, cancer type. That's all sensitive information that you, as an organization, want to protect. In the healthcare space specifically, some of the largest healthcare organizations in the world rely on Protegrity to provide that level of protection, but at the same time give them the business flexibility to utilize that data. One of our customers, one of the leaders in online prescriptions and an AWS customer, delivers a wonderful service to all of their customers while maintaining protection. Or think about sharing data from your watch with your insurance provider: we have lots of customers that bridge that gap and have that personal data coming in to the insurance companies. All the way to a future use case around the pandemic: if you have to prove that you've been vaccinated, we're talking about sensitive information, so you want to be able to show that information but still have the confidence that it's not going to be used for nefarious purposes.

>> Right. And what is next for Protegrity?

>> Continuing on our journey. We've been around for 17 years now, and in the last couple there's been an absolute renaissance in fine-grained, kinetic data protection. Organizations are recognizing that protecting your perimeter, your firewalls, your access points, your points of vulnerability to keep bad actors out, is not going to go away anytime soon; but at the same time, they recognize that the data itself needs to be protected, with that balance of utilizing it downstream for analytic purposes, for machine learning, for artificial intelligence. Keeping the data of hundreds of millions, if not billions, of people safe: that's what we do. If you were to add up the customers of all of our customers, the largest banks, the largest insurance companies, the largest healthcare companies in the world, globally we're protecting the private data of billions of human beings. And it doesn't just stop there. You asked a great question about industries, and yes, insurance, healthcare, and retail, where there's a lot of sensitive data, can certainly be a focus point.
But in the IoT space, if you think about GPS location or geolocation, or about a device, what it does, the intelligence it has, and the decisions it makes on the fly, protecting that data and keeping it safe is not just a personal matter: we're stepping into intellectual property and some of the most valuable assets companies have, which is their decision-making on how they use data and how they deliver an experience. And I think that's why there's been such a renaissance, if you will, in the fine-grained data protection we provide.

>> Yeah. Well, what is Protegrity's role now in future-proofing businesses against cyber attacks? You mentioned the ramifications and the impact they can have on businesses, but also on governments; obviously, this is really critical.

>> There's a three-step approach, something we have felt for a long, long time and work on with our customers. One is having that fine-grained data protection: tokenizing your data, so that if someone were to get your data, it's worthless unless they have the ability to unlock every single individual piece of it. That's number one, and that's what Protegrity provides. Number two is having a strong backup capability with an active-active setup: AWS is one of the major clouds in the world where we deploy our software regularly and work with our customers, with multi-region, multi-capability active-active scenarios, where if something goes down, or something happens, you can bring that environment down and bring a new one up. And then third is malware detection across the rest of your cyber defenses, to make sure that you rinse your architecture of some of those agents. And when you look at ransomware: attackers take your data and encrypt it, so they can force you to give them Bitcoin or whatnot, or they'll release some of your data. If that data is rendered useless, that's one huge step in your discussions with these nefarious actors: you can say, you could release it, but there's nothing there; you're not going to see anything. And then second, if you have a strong backup capability, where you wind down the environment that has been infiltrated, prove that the new environment is safe, keep your production data rolling, and then wind it back up, you're back in business. You don't have to notify your customers, and you don't have to deal with the ransomware players. So it's really a three-step process, but ultimately it starts with protecting and tokenizing your data, and that's something Protegrity does really, really well.

>> So you're basically able to eliminate the financial impact of a breach?
Now, tokenizing data and moving that direction is something that it's not trivial, we are literally replacing production data with a token and then making sure that all downstream applications have the ability to utilize that, and make sure that the analytic systems and machine learning systems, and artificial intelligence applications that are built downstream on that data have the ability to execute, but that is something that from our patent portfolio and what we provide to our customers, again, some of the largest organizations in retail, in financial services, in banking, and in healthcare, we've been doing that for a long time. We're not just saying that we can do this and we're in version one of our product, we've been doing this for years, supporting the largest organizations with a 24 by seven capability. >> Right, and tell us a bit about the competitive landscape, where do you see your offering compared to your competitors? >> So, kind of historically back, let's call it an era ago maybe even before cloud even became a thing, and hybrid cloud, there were a handful of players that could acquire into much larger organizations, those organizations have been dusting off those acquired assets, and we're seeing them come back in. There's some new entrants into our space that have some protection mechanisms, whether it be encryption, or whether it be anonymization, but unless you're doing fine grain tokenization, you're not going to be able to allow that data to participate in the artificial intelligence world. So, we see kind of a range of competition there. And then I'd say probably the biggest competitor, Natalie, is customers not doing tokenization. They're saying, "No, we're okay, we'll continue protecting our firewall, we'll continue protecting our access points, we'll invest a little bit more in maybe some governance, but that fine grain data protection, maybe it's not for us." And that is the big shift that's happening. You look at kind of the beginning of this year with the solar winds attack, and the vulnerability that caused the very large and important organizations found themselves the last few weeks with all the ransomware attacks that are happening on meat processing plants and facilities, shutting down meat production, pipeline, stopping oil and gas and kind of that. So we're seeing a complete shift in the types of organizations and the industries that need to protect their data. It's not just the healthcare organizations, or the banks, or the credit card companies, it is every single industry, every single size company. >> Right, and I got to ask you this questioning, what is your defining contribution to the future of cloud scale? >> Well, ultimately we kind of have a charge here at Protegrity where we feel like we protect the world's most sensitive data. And when we come into work every day, that's what every single employee thinks at Protegrity. We are standing behind billions of individuals who are customers of our customers, and that's a cultural thing for us, and we take that very serious. We have maniacal customer support supporting our biggest customers with a fall of the sun 24 by seven global capability. So that's number one. So, I think our part in this is really helping to educate the world that there is a solution for this ransomware and for some of these things that don't have to happen. 
Now, naturally, with any solution there's going to be some investment and some architecture changes. But with partnerships like AWS, and our partnerships with pretty much every data provider, data storage provider, and data solution provider in the world, we want to provide fine-grained data protection: any data, in any system, on any platform. And that's our mission.

>> Well, Rick Farnell, this has been a really fascinating conversation; thank you so much. The CEO of Protegrity, really great to have you on this program for the AWS Startup Showcase, talking about how the ransomware game has changed with kinetic data protection. Really appreciate it. Again, I'm your host, Natalie Erlich; thank you very much for watching. (light music)
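To make the core idea of the interview concrete, here is a toy tokenization sketch. Real products, Protegrity's included, use vaultless, format-preserving tokenization under fine-grained policy control rather than an in-memory dictionary; this sketch only illustrates why exfiltrated tokens are worthless without the protection layer, while deterministic tokens still support analytics downstream.

```python
import secrets

class TokenVault:
    """Toy vault mapping sensitive values to random tokens."""

    def __init__(self):
        self._forward = {}  # clear value -> token
        self._reverse = {}  # token -> clear value

    def tokenize(self, value: str) -> str:
        # Deterministic: the same value always yields the same token,
        # which is what keeps tokenized data usable for analytics.
        if value not in self._forward:
            token = secrets.token_hex(8)
            self._forward[value] = token
            self._reverse[token] = value
        return self._forward[value]

    def detokenize(self, token: str) -> str:
        # Only systems authorized to hold the vault can recover values.
        return self._reverse[token]

vault = TokenVault()
record = {"name": "Jane Doe", "ssn": "123-45-6789", "blood_type": "O+"}

# What lands in the data lake: also what a ransomware actor would steal.
protected = {field: vault.tokenize(value) for field, value in record.items()}
print(protected)

# Joins, group-bys, and counts still work on tokens, so downstream
# analytics and ML pipelines never need to see the clear values.
assert vault.tokenize("Jane Doe") == protected["name"]
```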

Published Date : Jun 24 2021



Ariel Assaraf, Coralogix | AWS Startup Showcase: The Next Big Thing in AI, Security, & Life Sciences


 

(upbeat music) >> Hello, and welcome to today's session of the AWS Startup Showcase, The Next Big Thing in AI, Security, & Life Sciences, featuring Coralogix for the AI track. I'm your host, John Furrier, with theCUBE, and we're joined by Ariel Assaraf, CEO of Coralogix. Ariel, great to see you, videoing in remotely from Tel Aviv. Thanks for coming on theCUBE.

>> Thank you very much, John. Great to be here.

>> So you're featured here as a hot next-big-thing startup, and one of the things you do, which we've been covering for many years, is log analytics. From a data perspective, you decouple the analytics from the storage. This is a unique thing. Tell us about it; what's the story?

>> Yeah. What we've seen in the market, probably because of the great job that a lot of the earlier-generation products have done, is that more and more companies see the value in log data. What used to be a couple of rows you'd add whenever you had something very important to say became the standard way to document all communication between different components: infrastructure, network, monitoring, and of course the application layer. And what happens is that data grows extremely fast. All data grows fast, but log data grows even faster. What we always say is that data grows faster than revenue, for sure: as fast as a company grows, its data is going to outpace that. So we found ourselves thinking: how can we help companies still get the full coverage they want, without cherry-picking data or deciding exactly what they want to monitor and what they'll take risks with, while still giving them the real-time analysis they need to get the full insight suite for the entire data set, wherever it comes from? That's why we decided to decouple the analytics layer from storage. Instead of ingesting the data, then indexing and storing it, and then analyzing the stored data, we analyze everything, and then we store only what matters. We go from the insights backwards. That allowed us to reduce the amount of data and the digital exhaust it creates, and also to provide better insights. The idea is that as this world of data scales, the need for real-time streaming analytics is going to increase.

>> What's interesting is that we've seen decoupling storage from compute be a great success formula at cloud scale, for instance; that's a known best practice. You're taking it a little differently. I love how you're working backwards from the insights, almost doing some intelligence on the front end of the data, which probably saves a lot of storage cost. But I want to get back specifically to this real-time piece. How do you do that? And how did you come up with this? What's the vision? What was the magic light bulb that went off for Coralogix?

>> The Coralogix story is very interesting. Actually, there was no light bulb; it was a road of pain for years and years. We started by just doing the same thing, maybe faster, with a couple more features, and it didn't work out too well: for the first few years, the company was not very successful. We've grown tremendously in the past three years, almost 100X since we launched this, and it came from a pain.
Once we started scaling, we saw that the side effects of accessing storage for analytics, the latency it creates, the dependency on schema, and the price it poses on our customers, became unbearable. So we started thinking: how do we get the same level of insights without that? There's a perception in the world of storage, and now it has started to happen in analytics too, that talks about tiers: you want a great experience, you pay a lot; you want a less-than-great experience, you pay less, it's a lower tier. We decided we were looking for a way to give the same level of real-time analytics and the same level of insights, only without the dependency issues, decoupled from all the storage-schema problems and latency. So we built our real-time pipeline; we call it Streama. Streama is Coralogix's real-time analysis platform that analyzes everything in real time, including the stateful things. Stateless analytics in real time has been done in the past, and it always worked well. The issue is how you give a stateful insight on data you analyze in real time, without storing it. Let me explain: how can you tell that a certain issue happened that did not happen in the past three months, if you did not store the past three months? Or how can you tell that behavior is abnormal if you did not store what's normal, if you did not store the state? So we created what we call the state store, which holds the state of the system and the state of the data, a snapshot of that state, for the entire history. Then, instead of the storage being our state, when you ask me, "How does this compare to last week?", instead of going to the storage and comparing against last week, I go to the state store and, like flipping through a record bag, I scroll fast and find the one piece of state I need. And I can say: okay, this is how it looked last week; compared to this week, it changed in these ways. Once we started doing that, we onboarded more and more services onto that model. And our customers came to us and said: hey, you're doing everything in real time, and for most data we don't need more than that; there's only a small portion we actually need to store and frequently search. How about you fit our use cases, instead of just selling on quota? So we decided to let our customers choose the use cases they have and route the data through them, so that each log record stops at the relevant stops in our data pipeline based on its use case. Just like in a supermarket: you fill a bag, you go out, they weigh it and say it's two kilograms, you pay this amount, because different products have different costs and different meaning to you. In exactly the same way, we analyze the data in real time, so we know the importance of the data, and we allow you to route it based on your use case and pay a different amount per use case.

>> This is really interesting. So essentially you capture insights and store those, you call them states, and then you don't have to go back through the data. You're eliminating the old problem of going back to the index and recovering the data to get the insights. It's a round-trip query, if you will, that you're saving, all that data-mining cost and time.
>> We call it zero side effects. That round trip you described is exactly it: no side effects to an analysis done in real time. I don't need the latency from the storage, a bit of latency from the database that holds the model, a bit of latency from the cache; everything stays in memory, everything stays in the stream.

>> So basically, it's like the definition of insanity: doing the same thing over and over again and expecting a different result. That's kind of what the old model of insight is, go query the database and get something back, whereas you're doing the real-time filtering on the front end, capturing the insights, if you will, storing those, and replicating that per use case. Is that right?

>> Exactly. But then there's still the issue of customers saying: yeah, but I need that data; someday I need to frequently search it, the unknown unknowns, or I need it for compliance, an immutable record that stays in my compliance bucket forever. So we have a screen we call the TCO Optimizer that allows customers to define those use cases. And they can always access the data, by querying their remote storage from Coralogix, or by querying the hot data that is stored with Coralogix. So it's all about use cases, and about how you consume the data, because it doesn't make sense to pay the same amount, or give the same amount of attention, for a record that is nearly useless, there just for the record or for a compliance audit that may or may not happen in the future, as for the most critical exception in my application log that has immediate business impact.

>> What's really good, too, is that you can set up policy: for certain use cases, okay, store that data. It's not that you never want to store it; you want to store it for certain use cases. I can see that. So I've got to ask: how does this differ from the competition? How do you compete? Take us through a customer use case. How do you go to the customer and say, hey, we've got so much scar tissue from this, we learned the hard way, take it from us? How does it go? Take us through an example.

>> An interesting example is a company that is not your typical early adopter, let's call it that way: a very advanced, smart company, but a huge one, one of the largest telecommunications companies in India. They were cherry-picking about 100 gigs of data per day and sending it to one of the legacy providers, which has a great solution that does give value. But they weren't even thinking about sending their entire data set, because of cost, because of scale, and because of the clutter: whenever you search, you have to sift through millions of records, many of which are not that important. We helped them analyze their data, and worked with them to understand that they had over a terabyte of data holding incredible insights, a goldmine of insights; it just needed to be prioritized by use case. They went from 100 gigs with the legacy solution to a terabyte, at almost the same cost, with more advanced insights, within one week, which at that scale of organization is out of the ordinary: it had taken them four months to implement the other product.
But now, when you go from the insights backwards, you understand your data before you have to store it, before you have to analyze it, and before you have to manually sift through it. So if you ask about the difference, it's all about the architecture: we analyze and only then index, instead of indexing and then analyzing. It sounds simple, but of course, when you look at the stateful analytics, it's a lot more complex.

>> Take me through your growth story, because I'll get back to the secret sauce in the same way; I want to get back to how you got here. (indistinct) you had this problem, you kind of broke through, you hit the magic formula. Talk about the growth: where's the growth coming from, and what's the real impact? What's the situation with the company's growth?

>> We had a rough first three years, as I mentioned. I was not the CEO at the beginning; I'm one of the co-founders, more of the technical guy, and I was the product manager. I became CEO after the company was on the verge of closing at the end of 2017: the CTO left, the CEO left, the VP of R&D became the CTO, I became the CEO, and we were five people with $200,000 in the bank, which, you know, is not a long runway. And we changed attitudes. First we launched this product, and then we understood that we needed to go bottoms-up: you can't go to enterprises and try to sell something that is out of the ordinary, or that changes how they're used to working, or just sell anything, (indistinct) when you're five people with that little in the bank. So we started going bottoms-up, with the earlier adopters, and still today it's the more advanced companies and the more advanced teams; this is how Gartner frames Coralogix, a preferred solution for advanced DevOps and platform teams. They started adopting Coralogix, and then it grew to the larger organizations, pushed by champions within those organizations, and it has grown ever since. Until the beginning of 2018 we had raised about $2 million, and sales were marginal. Today we have over 1,500 paying accounts, and we've raised almost $100 million more.

>> Wow, what a great pivot. That's a great example of catching the right wave here, the cloud wave. You said in terms of customers you had the DevOps kind of (indistinct) initially, and now you've expanded out to a lot more traditional enterprise. Can you take me through the customer profile?

>> I'd say the core is still cloud-native and (indistinct) companies; those are the typical ones. We have very tight integration with AWS, all the services, all the integrations required; we know how to read from and write back to the different services and analysis platforms in AWS. Also Azure and GCP, but mostly AWS. And then we do have quite a few big enterprise accounts; actually, five of the largest 50 companies in the world use Coralogix today. It grew from those DevOps and platform evangelists up to the level of IT execs and even (indistinct). So today we have our security product, which already sells to some of the biggest companies in the world; that's a different profile.
And the idea for us is that once you solve that issue of data being too big, too expensive, not proactive enough, and too coupled with storage, you can expand from observability, logging and metrics, into tracing, then into security, and maybe even into other fields where cost and productivity are an issue for many companies.

>> So let me ask you this question then, Ariel, if you don't mind. If a customer has a need for Coralogix, is it because of the data volume? Have they just got data sprawled all over the place? Is it that storage costs are going up on S3? What's some of the signaling you would see that tells you, okay, here's the opportunity to come in and either clean house or fix the mess? Take us through what you see. What's the trend?

>> The typical customer (indistinct) for Coralogix will be someone using one of the legacy solutions and growing very fast. That's the easiest way for us to know.

>> What grows fast? The storage is growing fast?

>> The company is growing fast. And remember, data grows faster than revenue; we know that. So if I see a company that grew from 50 people to 500 in three years, specifically if it's a cloud-native or internet company, I know that their data grew not 10X but 100X. And I know that such a company might have started with a legacy solution at, say, $1,000 a month and been happy with it; for $1,000 a month, if you don't have a lot of data, those legacy solutions will do the trick. But now I know they're going to be asked to pay 50, 60, $70,000 a month, and this is exactly where we kick in. Because now it doesn't fit the economic model, it doesn't fit the unit economics, and it starts damaging the margins of those companies. Remember, for those internet and cloud companies these are not the classic costs you'll see in an enterprise: they're actually damaging your unit economics and the valuation of the business, which is a bigger deal. So when I see that type of organization, we come in and say: hey, better coverage, more advanced analytics, easier integration within your organization; we support all the common open-source syntaxes and dashboards, you can plug it into your entire environment, and the costs are going to be a quarter of whatever you're paying today. Once they see that, along with the dev-friendliness of the product, the ease of scale, and the stability of the product, it makes a lot more sense for them to engage in a POC, because at the end of the day, if you don't prove value, you can come with a 90% discount and it doesn't do anything. So it's a great door opener, but from then on it's a POC like any other.

>> Cloud is all about the POC, or pilot, as they say. So take me through the product today, and what's next for the product; take us through the vision of the product and the product strategy.

>> Today the product allows you to send any log data, metric data, or security information and analyze it a million ways. We have one of the most extensive alerting mechanisms in the market, automatic anomaly detection, data clustering, and all the real-time pipeline things that help companies make their data smarter and more readable: parsing, enriching, bringing in external sources to enrich the data, and so on.
Where we're stepping in now is to make the final step of decoupling the analytics from storage, what we call the dataless data platform, in which no data sits or resides within the Coralogix cloud. Everything is analyzed in real time and stored in a storage of the customer's choice, and then we allow our customers to query that remotely with incredible performance. That will give our customers the first ever true SaaS experience for observability. Think about it: no quota plans, no retention limits; you send whatever you want, you pay only for what you send, you retain it however long you want to retain it, and you get all the real-time insights much, much faster than any product that keeps the data on hot storage. That's our next step, to make sure we're not just reselling cloud storage. Because a lot of the time, when you are dependent on storage, and remember we're a cloud company, like I mentioned, you've got to keep your unit economics, so what do you do? You sell storage to the customer, you add your markup, and then you charge for it. That is exactly where we don't want to be. We want to sell the intelligence, the insights, and the real-time analysis that we know how to do, and let the customers enjoy the wealth of opportunities and choices their cloud providers offer for storage.

>> That's a great vision. In a way, the hyperscalers' early days showed that decoupling compute from storage, which I mentioned earlier, was a huge category creation; here you're doing it for data. Call it hyper data scale, or, like, maybe there's got to be a name for this. What do you see about five years from now? Take us through the trajectory of the next five years, because certainly observability is not going away. It's data management, monitoring, real-time, asynchronous, synchronous, linear; all that stuff is happening. What's the five-year vision?

>> Now add security to observability, which is something we've started preaching for, because no one can say they have observability of their environment when people come in and out and steal data; that's no observability. But the thing is, because data grows exponentially, because it grows faster than revenue, we believe that in five years there's not going to be a choice: everyone is going to have to analyze data in real time, extract the insights, and then decide whether to store it on a long-term archive or not store it at all. You still want the full coverage and insights. But when you think about observability, unlike many other things, the more data you have, many times the less observability you get. Think of log data, unlike statistics: if my system were only generating 10 records a day, I'd have full, incredible observability; I'd know everything it has done. What happens instead is that you pay more and get less observability and more uncertainty. So I think that, with time, we'll start seeing more and more real-time streaming analytics, and a lot fewer storage-based and index-based solutions.

>> You know, Ariel, I've always been saying to Dave Vellante on theCUBE, many times, that insights need to be the norm, not the exception, and that ultimately there would be a database of insights. At the end of the day, the insights become more plentiful.
You have the ability to actually store those insights, and refresh them, challenge them, update the models behind them, verify them, and either sunset them or add to them. You know, when you start getting more data into your organization, AI and machine learning prove that pattern recognition works. So why not grab those insights? >> And use them as your baseline to know what's important, and not have to start by putting everything in a bucket. >> So we're going to have new categories like insight-first software (indistinct). >> Go from insights backwards, that'll be my tagline if I have to, but I'm a terrible marketer (indistinct). >> Yeah, well, I mean, everyone's like cloud-first, data-driven, insight-driven. What you're basically doing is you're moving into the world of insights-driven analytics, really, as a way to kind of bring that forward. So congratulations. Great story. I love the pivot, love how you guys entrepreneurially put it all together, had the problem as your own problem, and brought it out to the rest of the world. And certainly the DevOps and cloud-scale wave is just getting bigger and bigger and taking over the enterprise. So great stuff. Real quick while you're here, give a quick plug for the company. What you guys are up to, stats, vitals, hiring, what's new? Give the commercial. >> Yeah, so like I mentioned, over 1,500 paying customers, growing incredibly in the past 24 months, hiring, almost doubling the company in the next few months. Offices in Israel, the East and West US, the UK, and Mumbai. Looking for talented engineers to join the journey and build the next generation of dataless data platforms. >> Ariel Assaraf, CEO of Coralogix. Great to have you on theCUBE, and thank you for participating in the AI track for our next big thing in the Startup Showcase. Thanks for coming on. >> Thank you very much, John, really enjoyed it. >> Okay, I'm John Furrier with theCUBE. Thank you for watching the AWS Startup Showcase presented by theCUBE. (calm music)

Published Date : Jun 24 2021


Toni Manzano, Aizon | AWS Startup Showcase | The Next Big Thing in AI, Security, & Life Sciences


 

(up-tempo music) >> Welcome to today's session of theCUBE's presentation of the AWS Startup Showcase: The Next Big Thing in AI, Security, and Life Sciences. Today, we'll be speaking with Aizon, as part of our life sciences track, and I'm pleased to welcome the co-founder as well as the chief science officer of Aizon, Toni Manzano, who will be discussing how artificial intelligence is driving key processes in pharma manufacturing. Welcome to the show. Thanks so much for being with us today. >> Thank you, Natalie, to you, and for your introduction. >> Yeah. Well, as you know, Industry 4.0 is revolutionizing manufacturing across many industries. Let's talk about how it's impacting biotech and pharma, as well as Aizon's contributions to this revolution. >> Well, actually, Pharma 4.0 is totally introducing a new concept of how to manage processes. So, nowadays the industry is considering that everything is practically static, nothing changes, and this is because they don't have the ability to manage the complexity and the variability around biotech and pharma manufacturing processes. Nowadays, with Pharma 4.0 technologies, cloud computing, IoT, AI, we can get all those data. We can understand the data, and we can interact in real time with processes. This is how things are going on nowadays. >> Fascinating. Well, as you know, COVID-19 really threw a wrench in a lot of activity in the world, our economies, and also people's way of life. How did it impact manufacturing in terms of scale-up and scale-out? And what are your observations from this year? >> You know, the main problem when you want to do a scale-up process is not only the equipment, it is also the knowledge that you have around your process. When you're doing a vaccine on a smaller scale in your lab, the parameters you're controlling in your lab have to be escalated when you go from five liters to 2,500 liters. How to manage this difference of scale? Well, AI is helping nowadays to detect and to identify the most relevant factors involved in the process, the critical relationships between the variables, and the final control of the full process following continued process verification. This is how we can help nowadays, using AI and cloud technologies in order to accelerate and to scale up vaccines like the COVID-19 vaccine. >> And how do you anticipate pharma manufacturing to change in a post-COVID world? >> This is a very good question. Nowadays, we have some assumptions that we are still trying to overcome with human effort. With the new situation, with the pandemic that we are living in, in the next evolution humans will take care of the good practices and the new knowledge that we have to generate. So AI will manage the repetitive tasks; all the repetitive activity that we are doing will be done by AI, and humans will never again do repetitive tasks in this way. They will manage complex problems and supervise AI output. >> So you're driving more efficiencies in the manufacturing process with AI. You recently presented at the United Nations Industrial Development Organization about the challenges brought by COVID-19 and how AI is helping with the equitable distribution of vaccines and therapies. What are some of the ways that companies like Aizon can now help with that kind of response? >> Very good point.
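Toni doesn't spell out Aizon's algorithms here, but the scale-up task he describes, identifying the most relevant factors and the critical relationships between variables, is commonly approached with feature-importance ranking over batch records. A minimal sketch on invented data (every parameter name and value below is hypothetical):

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical batch records: process parameters vs. a quality outcome (e.g., titer).
batches = pd.DataFrame({
    "temperature_c": [36.8, 37.1, 36.5, 37.4, 36.9, 37.0],
    "ph":            [7.02, 6.98, 7.10, 6.95, 7.01, 7.05],
    "agitation_rpm": [120, 135, 110, 140, 125, 130],
    "feed_rate_lph": [2.1, 2.4, 1.9, 2.6, 2.2, 2.3],
    "titer_g_per_l": [4.8, 5.1, 4.5, 5.3, 4.9, 5.0],  # outcome
})

X = batches.drop(columns="titer_g_per_l")
y = batches["titer_g_per_l"]

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Rank parameters by how strongly they drive the outcome; the top-ranked ones are
# candidates for the critical process parameters monitored under continued
# process verification when the process moves from lab scale to production scale.
importance = pd.Series(model.feature_importances_, index=X.columns).sort_values(ascending=False)
print(importance)
```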
Could you imagine you're a big company, a top pharma company, that you have the intellectual property of a COVID-19 vaccine based on mRNA principles, and you would like to expand this vaccination effort, not only to run the vaccination, but also to manufacture the vaccine. What if you try to manufacture these vaccines in South Africa, or in Asia, in India? So the secret is to transport not only the raw material, not only the equipment, but also the knowledge: how to operate, how to control the full process, from the initial phase till the packaging and the vial filling. So, this is how we are contributing. AI is packaging all this knowledge in just AI models. This is the secret. >> Interesting. Well, what are the benefits for pharma manufacturers when considering the implementation of AI and cloud technologies? And how can they progress in their digital transformation by utilizing them? >> One of the benefits is that you are able to manage the variability, the real complexity in the world. So, you cannot create processes to manufacture drugs just considering that the raw material that you're using never changes. You cannot consider that all the equipment works in the same way. You cannot consider that your recipe will work in the same way in Brazil as in Singapore. So the complexity and the variability must be understood as part of the process. This is one of the benefits. The second benefit is that when you use cloud technologies, you don't have to worry much about computing licenses, software updates, antivirus, scaling up hardware. Everything is done in the cloud. So well, these are two main benefits. There are more, but these are maybe the two main ones. >> Yeah. Well, that's really interesting how you highlight that. There's a big shift in how you handle this in different parts of the world. So, what role does compliance and regulation play here? And of course, we see differences in the way that's handled around the world as well. >> Well, I think that this is the first time in the pharma, let me say, experience that we have a very strong commitment from the regulatory bodies, you know, to push forward using these kinds of technologies. Actually, for example, the FDA, they are using cloud to manage their own systems. So why not use it in pharma? >> Yeah. Well, how does AWS and Aizon help manufacturers address these kinds of considerations? >> Well, we have a very great partner. AWS, for us, is simplifying our life a lot. So, we are a, let me say, different startup company, Aizon, because we have a lot of PhDs in the company. So we are not the classical geeky company with guys programming all day. We have a lot of science inside the company. So this is our value. So everything that is provided by Amazon, why do we have to recreate it again? We can rely on SageMaker, we can rely on Cognito, we can rely on Lambda, we can rely on S3 to have encrypted data with automatic backup. So, AWS is simplifying a lot of our life, and we can dedicate all our knowledge and all our efforts to the things that we know: pharma compliance. >> And how do you anticipate that pharma manufacturing will change further in the 2021 year? >> Well, we are participating not only with business cases. We also participate with the community, because we are leading an international project in order to anticipate these kinds of new breakthroughs.
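Toni's phrase "packaging all this knowledge in just AI models" suggests shipping a trained model together with the process metadata it depends on, so the knowledge travels with the recipe. A rough sketch of what such a bundle might contain; the manifest fields and helper function are invented for illustration, not Aizon's format:

```python
import json
import joblib
from pathlib import Path

def package_model(model, process_metadata: dict, out_dir: str) -> Path:
    """Bundle a trained model with the process knowledge needed to reuse it at a new site."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    joblib.dump(model, out / "model.joblib")  # the learned process knowledge
    (out / "manifest.json").write_text(json.dumps({
        "use_case": "mRNA drug-substance scale-up",  # illustrative fields only
        "critical_parameters": process_metadata["critical_parameters"],
        "operating_ranges": process_metadata["operating_ranges"],
        "training_scale_liters": process_metadata["scale_liters"],
    }, indent=2))
    return out

# Example: ship a model trained at the 5 L lab scale to a 2,500 L site.
# bundle = package_model(model, {"critical_parameters": ["ph", "feed_rate_lph"],
#                                "operating_ranges": {"ph": [6.9, 7.1]},
#                                "scale_liters": 5}, "vaccine_model_pkg")
```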
So, we are working with, let me say, initiatives: in the (indistinct) association, we are collaborating in two different projects to apply AI in computer certification, in order to create more robust processes for the mRNA vaccine. We are collaborating with the (indistinct) university, creating the standards for AI application in GXP. We are collaborating with different initiatives with the pharma community in order to create the foundation to move forward during this year. >> And how do you see the competitive landscape? What do you think Aizon provides compared to its competitors? >> Well, good question. Probably you can find a lot of AI services, platforms, programs, software that can run in the industrial environment. But I think that it will be very difficult to find a full GXP-compliant platform working on cloud with AI, where the AI is already qualified. I think that no one is doing that nowadays. And one of the demonstrations of that is that we are also writing scientific papers describing how to do that. So you will see that Aizon is the only company that is doing that nowadays. >> Yeah. And how do you anticipate that pharma manufacturing will change, or, excuse me, how do you see that it is providing a defining contribution to the future of cloud scale? >> Well, there are no limits in the cloud. So as long as you accept that everything is varied and complex, you will need computing power. So the only way to manage this complexity is running a lot of powerful computation. So cloud is the only system, let me say, that allows that. Well, the thing is that, you know, pharma will also have to be compliant with the cloud providers. And for that, we created a new layer around the platform that we call qualification as a service. We are creating this layer in order to continuously qualify any kind of cloud platform that wants to work in a GXP environment. This is how we are doing that. >> And in what areas are you looking to improve? How are you constantly trying to develop the product and bring it to the next level? >> We always have, you know, the patient in mind. So Aizon is a patient-centric company. Everything that we do is to improve processes in order, at the end, to deliver the right medicine at the right time to the right patient. So this is how we are focusing all our efforts, in order to bring this opportunity to everyone around the world. For this reason, for example, we want to work with this project where we are delivering value to create vaccines for COVID-19, for example, everywhere, just packaging the knowledge using AI. This is how we envision it and how we are acting. >> Yeah. Well, you mentioned the importance of science and compliance. What do you think are the key themes that are the foundation of your company? >> The first thing is that we enjoy the tasks that we are doing. This is the first thing. The other thing is that we are learning every day with our customers and from real topics. So we are serving the patients. And everything that we do is enjoying science, enjoying how to achieve new breakthroughs in order to improve life in the factory, because we know that at the end it will be delivered to the final patient. So enjoying making science and creating breakthroughs; being innovative. >> Right, and do you think, in the sense that we were lucky, in light of COVID, that we've already had these kinds of technologies moving in this direction for some time, that we were somehow able to mitigate the tragedy and the disaster of this situation because of these technologies? >> Sure.
So we are lucky because of this technology, because we are breaking the distance, the physical distance, and we are putting together people, which was so difficult to do before, in all the different aspects. So, nowadays we are able to be closer to the patients, to the people, to the customer, thanks to these technologies. Yes. >> So now that also we're moving out of, I mean, hopefully out of this kind of COVID reality, what's next for Aizon? Do you see more collaboration? You know, what's next for the company? >> The next thing for the company is to deliver AI models that are able to be encapsulated in the drug manufacturing for vaccines, for example. And that will be delivered with the full process: not only materials, equipment, personnel, and recipes; the AI models will also go together as part of the recipe. >> Right. Well, we'd love to hear more about your partnership with AWS. How did you get involved with them? And why them, and not another partner? >> Well, let me explain to you a secret. Seven years ago, we started with another top cloud provider, but we saw very soon that this other cloud provider was not well aligned with the GXP requirements. For this reason, we met with AWS. We went together to some seminars and conferences with top pharma communities and pharma organizations. We went there to make speeches and talks. We felt that we fit very well together, because AWS has a GXP white paper describing very well how to rely on AWS components, one by one. So for us, this is a very good credential when we go to our customers. Do you know that when customers are acquiring and establishing the Aizon platform in their systems, they are auditing us? They are auditing Aizon. Well, we have to also audit AWS, because this is the normal chain in the pharma supply chain. That means that we need this documentation. We need all this transparency between AWS and our partners. This is the main reason. >> Well, this has been a really fascinating conversation, to hear how AI and cloud are revolutionizing pharma manufacturing at such a critical time for society all over the world. Really appreciate your insights, Toni Manzano, the chief science officer and co-founder of Aizon. I'm your host, Natalie Erlich, for theCUBE's presentation of the AWS Startup Showcase. Thanks very much for watching. (soft upbeat music)

Published Date : Jun 24 2021


Gil Geron, Orca Security | AWS Startup Showcase: The Next Big Thing in AI, Security, & Life Sciences


 

(upbeat electronic music) >> Hello, everyone. Welcome to theCUBE's presentation of the AWS Startup Showcase: The Next Big Thing in AI, Security, and Life Sciences. In this segment, we feature Orca Security as a notable trendsetter within, of course, the security track. I'm your host, Dave Vellante. And today we're joined by Gil Geron, who's the co-founder and Chief Product Officer at Orca Security. And we're going to discuss how to eliminate cloud security blind spots. Orca has a really novel approach to cybersecurity problems, without using agents. So welcome, Gil, to today's session. Thanks for coming on. >> Thank you for having me. >> You're very welcome. So Gil, you're a disruptor in security, in cloud security specifically, and you've created an agentless way of securing cloud assets. You call this side scanning. We're going to get into that and probe a little bit into the how and the why agentless is the future of cloud security. But I want to start at the beginning. What were the main gaps that you saw in cloud security that spawned Orca Security? >> I think that the main gaps that we saw when we started Orca were pretty similar in nature to gaps that we saw in legacy infrastructures, in more traditional data centers. But when you look at the cloud, when you look at the nature of the cloud, the ephemeral nature, the technical possibilities and the disruptive way of working with a data center, we saw that the usage of traditional approaches like agents in these environments is lacking. It's actually not only not working as well as it did in the legacy world, it's also providing less value. And in addition, we saw that the friction between the security team and the IT, the engineering, the DevOps in the cloud is much worse than it was, and we wanted to find a way for them to work together, to bridge that gap, and to actually allow them to leverage the cloud technology as it was intended, to gain superior security over what was possible in the on-prem world. >> Excellent, let's talk a little bit more about agentless. I mean, maybe we could talk a little bit about why agentless is so compelling. I mean, it's kind of obvious: it's less intrusive, you've got fewer processes to manage. But how did you create your agentless approach to cloud security? >> Yes, so I think the basis of it all is around our mission and what we try to provide. We want to provide seamless security, because we believe it will allow the business to grow faster. It will allow the business to adopt technology faster, and to be more dynamic and achieve goals faster. And so we've looked at what are the problems, what are the issues that slow you down. And one of them, of course, is the fact that you need to install agents, that they cause performance impact, that they are technically segregated from one another, meaning you need to install multiple agents and they need to somehow not interfere with one another. And we saw this friction cause organizations to slow down their move to the cloud, or slow down the adoption of technology. In the cloud, it's not only having servers, right? You have containers, you have managed services, you have so many different options and opportunities. And so you need a different approach on how to secure that.
And so when we understood that this is the challenge, we decided to attack it using three pillars: one, trying to provide complete security and complete coverage with no friction; two, trying to provide comprehensive security, which is taking a holistic approach, a platform approach, and combining the data in order to provide you visibility into all of your security assets; and last but not least, of course, context awareness, meaning being able to understand and find the 1% that matters in the environment, so you can actually improve your security posture and improve your security overall. And to do so, you have to have a technique that does not involve agents. And so what we've done is, we've found a way that utilizes the cloud architecture in order to scan the cloud itself. Basically, when you integrate Orca, you are able within minutes to understand, to read, and to view all of the risks. We are leveraging a technique that we call side scanning that uses the API. So it uses the infrastructure of the cloud itself to read the block storage device of every compute instance, every instance in the environment, and then we can deduce the actual risk of every asset. >> So that's a clever name, side scanning. Tell us a little bit more about that. Maybe you could double-click on how it works. You've mentioned it's looking into block storage, and leveraging the API is very clever, actually quite innovative. But help us understand in more detail how it works and why it's better than traditional tools that we might find in this space. >> Yes, so the way that it works is that by reading the block storage device, we are able to actually deduce what is running on your compute, meaning what kind of OS, packages, and applications are running. And then by combining the context, meaning understanding what kind of services you have connected to the internet, what is the attack surface for these services, what will be the business impact, will there be any access to PII or any access to the crown jewels of the organization, you can not only understand the risks, you can also understand the impact, and then understand what should be our focus in terms of security of the environment. What's different is the fact that we are doing it using the infrastructure itself: we are not installing any agents, we are not running anything inside your environment. You do not need to change anything in your architecture or design of how you use the cloud in order to utilize Orca. Orca works in a pure SaaS way. And so it means that there is no impact, not on cost and not on performance of your environment, while using Orca. And so it reduces any friction that might happen with other parties of the organization when you enjoy the security, or improve your security, in the cloud. >> Yeah, and no process management intrusion. Now, I presume, Gil, that you eat your own cooking, meaning you're using your own product. First of all, is that true? And if so, how has your use of Orca as a chief product officer helped you scale Orca as a company? >> So it's a great question. I think that something that we understood early on is that there is quite a significant difference between the way you architect your security in the cloud and the way that things reach production, meaning there's a gap between how you imagine, like in everything in life, how you imagine things will be, and how they are in real life, in production.
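Returning to the side-scanning mechanics Gil outlined a moment ago: he doesn't disclose Orca's implementation, but on AWS the out-of-band reading of block storage can be approximated with the snapshot and EBS direct APIs. A simplified sketch (no pagination, error handling, or snapshot cleanup, and the content check is a toy stand-in for reconstructing filesystems and package databases):

```python
import boto3

ec2 = boto3.client("ec2")
ebs = boto3.client("ebs")

def snapshot_volume(volume_id: str) -> str:
    """Take an out-of-band snapshot; the target workload itself is never touched."""
    snap = ec2.create_snapshot(VolumeId=volume_id, Description="side-scan (illustrative)")
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])
    return snap["SnapshotId"]

def scan_snapshot(snapshot_id: str) -> list[tuple[int, str]]:
    """Read the snapshot's blocks via the EBS direct APIs and look for telltale content."""
    findings = []
    for block in ebs.list_snapshot_blocks(SnapshotId=snapshot_id)["Blocks"]:
        data = ebs.get_snapshot_block(
            SnapshotId=snapshot_id,
            BlockIndex=block["BlockIndex"],
            BlockToken=block["BlockToken"],
        )["BlockData"].read()
        # A real side scanner reconstructs the filesystem, OS, and package inventory;
        # this toy check just greps raw blocks for an obvious secret marker.
        if b"PRIVATE KEY" in data:
            findings.append((block["BlockIndex"], "possible private key material"))
    return findings
```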
And so, even though we have amazing customers that are extremely proficient in security and have thought of a lot of ways to secure the environment, and even though we, of course, try to secure our environment as much as possible, we are using Orca because we understand that no one is perfect. We are not perfect. My engineers might make mistakes, like every organization. And so we are using Orca because we want to have complete coverage. We want to understand if we are making any mistake. And sometimes the gap between the architecture and the hole in the security, or the gap that you have in your security, could take years to appear. And you need a tool that will constantly monitor your environment. And so that's why we are using Orca all around from day one, not to find bugs or to do QA; we're doing it because we need security for our cloud environment that will provide these values. And we've also passed compliance audits like SOC 2 and ISO using Orca, and it expedited these processes and allowed us to do them extremely fast, because of having all of these guardrails and metrics. >> Yeah, so, okay. So you recognized that you potentially had, and did have, that same problem as your customers have. But how has it helped you scale as a company? >> So it helped us scale as a company by increasing the trust, the level of trust customers have in Orca. It allowed us to adopt technology faster, meaning we need much less diligence or exploration of how to use technology, because we have these guardrails. So we can use the richness of the technology that we have in the cloud without the need to stop, to install agents, to try to re-architect the way that we are using the technology. We simply use it. We simply use the technology that the cloud offers, as it is. And so it allows you rapid scalability. >> Allows you to move at the speed of cloud. Now, I'm going to ask you, as a co-founder, you've got to wear many hats: first of all, the co-founder and the leadership component there, and also the chief product officer. You've got to go out, you've got to get early customers, but even more importantly, you have to keep those customers, retention. So maybe you can describe how customers have been using Orca. What was their aha moment that you've seen customers react to when you showcase the new product? And then how have you been able to keep them as loyal partners? >> So I think that we are very fortunate; we are blessed with our customers. Many of our customers are vocal about what they like about Orca. And I think that something that comes up a lot of times is that this is a solution they have been waiting for. I can't express how many times I go on a call and a customer says, "I must say, I must share: this is a solution I've been looking for." And I think that in that respect, Orca is creating a new standard of what is expected from a security solution, because we are transforming security in the company from an inhibitor to an enabler. You can use the technology. You can use new tools. You can use the cloud as it was intended. And so (coughs) one of these cases is a customer that has a lot of data, and they're all super scared about using S3 buckets. We've all heard of these incidents of S3 buckets being breached and people connecting to an S3 bucket and downloading the data.
So they had a policy saying, "S3 buckets should not be used. We do not allow any use of S3 buckets." And obviously you do need to use S3 buckets; it's a powerful technology. And so the engineering team in that customer environment simply installed a VM, installed an FTP server, with a very easy-to-guess password on that FTP server. And obviously, two years later, someone also put all of the customer databases on that FTP server, open to the internet, open to everyone. And so I think it was, for him and for us as well, a hard moment. He planned that no data would be leaked, but actually what happened was way worse. The data was open to the world, in a technology that has existed for a very long time and is probably being scanned by attackers all the time. But after that, he allowed them to use S3 buckets, because he knew that now he can monitor, now he can understand that they are using the technology as intended, that they are using it securely. It's not open to everyone; it's open in the right way. And there was no PII on that S3 bucket. And so I think the way he described it is that now, when he's coming to a meeting about things that need to be improved, people are waiting for this meeting, because he actually knows more than what they know about the environment. And I see it really so many times: a simple mistake, or something that looks benign, and when you look at the environment in a holistic way, when you are looking at the context, you understand that there is a huge gap that could be breached. And another cool example was a case where a customer allowed access from a third-party service that everyone trusts to the crown jewels of the environment. And he did it in a very traditional way: he allowed a certain IP to be open to that environment. So overall, it sounds like the correct way to go. You allow only a specific IP to access the environment. But what he failed to notice is that everyone in the world can register for free for this third-party service and access the environment from this IP. And so, even though it looks like you have access from a trusted third-party service, when it's a SaaS service it can actually mean that everyone can use it in order to access the environment. And using Orca, you saw immediately the access, you saw immediately the risk. And I see it time after time: people are simply using Orca to monitor, to guardrail, to make sure that the environment stays safe over time, and to communicate better in the organization, to explain the risk in a very easy way. And I would say the statistics show that within a few weeks, more than 85% of the different alerts and risks are being fixed, and I think it goes to show how effective it is, and how effective it is in improving your posture, because people are taking action. >> Those are two great examples. And of course, it's often said that the shared responsibility model is often misunderstood, and those two examples underscore it: thinking, "Oh, I hear all this, see all this press about S3," but it's up to the customer to secure their components, et cetera, configure it properly, is what I'm saying. So what an unintended consequence. But Orca plays a role in helping the customer with their portion of that shared responsibility. Obviously AWS is taking care of its side.
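The S3 story above corresponds to a check anyone can script against their own account. Here is a minimal example using real boto3 calls that flags buckets lacking a full public-access block; actual exposure analysis of the kind Gil describes goes much deeper, into ACLs, bucket policies, and what data is reachable:

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

def buckets_missing_public_block() -> list[str]:
    """Flag buckets that don't have all four public-access-block settings enabled."""
    flagged = []
    for bucket in s3.list_buckets()["Buckets"]:
        name = bucket["Name"]
        try:
            cfg = s3.get_public_access_block(Bucket=name)["PublicAccessBlockConfiguration"]
            if not all(cfg.values()):  # any of the four settings disabled
                flagged.append(name)
        except ClientError as err:
            if err.response["Error"]["Code"] == "NoSuchPublicAccessBlockConfiguration":
                flagged.append(name)  # no public-access block configured at all
            else:
                raise
    return flagged

if __name__ == "__main__":
    for name in buckets_missing_public_block():
        print(f"review exposure of bucket: {name}")
```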
Now, as part of this program, we ask a little bit of a challenging question to everybody. Because look, as a startup, you want to do well, you want to grow a company, you want to have your employees, you know, grow, and help your customers, and that's great, and grow revenues, et cetera. But we feel like there's more. And so we're going to ask you, because the theme here is all about cloud scale: what is your defining contribution to the future of cloud at scale, Gil? >> So I think that cloud has enabled a revolution in the data center, okay? The way that you are building services, the way that you are allowing technology to be more adaptive, dynamic, ephemeral, accurate. And you see that it is being adopted across all vendors, all types of industries, across the world. I think that Orca is the first company that allows you to use this technology to secure your infrastructure in a way that was not possible in the on-prem world, meaning that when you're using the cloud technology, and you're using technologies like Orca, you're actually gaining superior security to what was possible in the pre-cloud world. And I think that, in that respect, Orca is going hand in hand with the evolution, and actually revolutionizes the way that you expect to consume security, the way that you expect to get value from security solutions, across the world. >> Thank you for that, Gil. And so we're at the end of our time, but we'll give you a chance for a final wrap-up. Bring us home with your summary, please. >> So I think that Orca is building the cloud security solution that actually works, with its innovative agentless approach to cybersecurity, to gain complete coverage, a comprehensive solution, and to understand the complete context of the 1% that matters in your security challenges across your data centers in the cloud. We are bridging the gap between the security teams and the business needs, to grow and to do so at the pace of the cloud. I think the approach of being able to install a security solution within minutes and get a complete understanding of your risk goes hand in hand with the way you expect to adopt cloud technology. >> That's great, Gil. Thanks so much for coming on. You guys are doing awesome work. Really appreciate you participating in the program. >> Thank you very much. >> And thank you for watching this AWS Startup Showcase. We're covering the next big thing in AI, Security, and Life Sciences on theCUBE. Keep it right there for more great content. (upbeat music)

Published Date : Jun 24 2021


Rohan D'Souza, Olive | AWS Startup Showcase | The Next Big Thing in AI, Security, & Life Sciences.


 

(upbeat music) (music fades) >> Welcome to today's session of theCUBE's presentation of the AWS Startup Showcase. I'm your host, Natalie Erlich. Today we're going to feature Olive, in the life sciences track. And of course, this is part of the future of AI, security, and life sciences. Here we're joined by our very special guest, Rohan D'Souza, the Chief Product Officer of Olive. Thank you very much for being with us. Of course, we're going to talk today about building the internet of healthcare. I do appreciate you joining the show. >> Thanks, Natalie. My pleasure to be here, I'm excited. >> Yeah, likewise. Well, tell us about AI and how it's revolutionizing health systems across America. >> Yeah, I mean, we're clearly living at this time of a lot of hype with AI, and there's a tremendous amount of excitement. Unfortunately for us, or, you know, depending on if you're an optimist or a pessimist, we had to wait for a global pandemic for people to realize that technology is here to really come to the aid of everybody in healthcare, not just on the consumer side, but on the industry side, and on the enterprise side of delivering better care. And it's truly an exciting time, but there's a lot of buzz, and we play an important role in trying to define that a little bit better, because you can't go too far today without hearing the term AI being used or misused in healthcare. >> Definitely. And also I'd love to hear about how Olive is fitting into this, and its contributions to AI in health systems. >> Yeah, so at its core, the industry thinks of us very much as an automation player. We've historically been in the trenches of healthcare, mostly on the provider side of the house, leveraging technology to automate a lot of the high-velocity, low-variability items. Our founding and our DNA is in this idea that we think it's unfair that healthcare relies on humans as routers. And we have looked to solve the problem of technology not talking to each other by using humans. And so we set out to really go into the trenches of healthcare and bring about core automation technology. And you might be sitting there wondering, well, why are we talking about automation under the umbrella of AI? That's because we are challenging the very status quo of siloed automation, and we're building what we say is the internet of healthcare. And more importantly, what we've done is we've brought a very human, empathetic approach to automation, and we're leveraging technology by saying: when one Olive learns, all Olives learn. So we take advantage of the network effect of a single Olive worker in the trenches of healthcare sharing that knowledge and wisdom, both with her human counterparts, but also with her AI worker counterparts that are showing up to work every single day in some of the most complex health systems in this country. >> Right. Well, when you think about AI and, you know, computer technology, you don't exactly think of, you know, a humanizing kind of potential. So how are you seeking to make AI really humanistic, and empathetic, potentially? >> Well, most importantly, the way we're starting with that is that we are treating Olive just like we would any human counterpart. We don't want to think of this as just purely a technology play. Most importantly, healthcare is deeply rooted in this idea of investing in outcomes, and not necessarily investing in core technology, right?
So we have learned that from the early days of us doing some really robust, integrated, AI-based solutions, but we've humanized it, right? Take, for example: we treat Olive just like any other human worker. She shows up to work, she's onboarded, she has an obligation to her customers and to her human worker counterparts. And we care very deeply about the cost of the false positive that exists in healthcare, right? And we do this in various different ways. Most importantly, we do it in an extremely transparent and interpretable way. By transparent, I mean Olive provides deep insights back to her human counterparts in the form of reporting and status reports, and we even have a term internally that we call a sick day. So when Olive calls in sick, we don't just tell our customers Olive's not working today, we tell our customers that Olive is taking a sick day, like a human worker that might need to stay home and recover. In our case, we just happened to have to rewire a certain portal integration because a portal just went through a massive change, and Olive has to take a sick day in order to make that fix, right? And this is, you know, just helping our customers understand, or feel like they can achieve success with, AI-based deployments, without this sort of robot hanging over them where we're waiting for Skynet to come into place, and truly humanizing the aspects of AI in healthcare. >> Right. Well, that's really interesting. How would you describe Olive's personality? I mean, could you attribute a personality? >> Yeah, she's unbiased, data-driven, extremely transparent in her approach, she's empathetic. There are certain days where she's direct, and there are certain ways where she could be quirky in the way she shares stuff. Most importantly, she's incredibly knowledgeable, and we really want to bring that knowledge that she has gained over the years of working in the trenches of healthcare to her customers. >> That sounds really fascinating, and I love hearing about the human side of Olive. Can you tell us about how this AI, though, is actually improving efficiencies in healthcare systems right now? >> Yeah, not too many people know that about a third of every single US healthcare dollar is spent on the administrative burden of delivering care. It's really, really unfortunate. In the capitalistic world of our system of healthcare in the United States, there is a lot of tail wagging the dog that ends up happening. Most importantly, I don't know the last time you've been through a process where you have to go and get an MRI or a CT scan, and your provider tells you that we first have to wait for the insurance company in order to give us permission to perform this particular task. And when you think about that, one, there's, you know, the tail wagging the dog scenario, but two, there's the administrative burden to actually seek the approval for that test that your provider is telling you that you need to perform, right? And what we've done is, as humans, or as sort of systems, we have just put humans in the supply chain of connecting the left side to the right side. So what we're doing is we're taking advantage of massive distributed cloud computing platforms, I mean, we're fully built on the AWS stack, and we take advantage of things that we can very quickly stand up and spin up.
And we're leveraging core capabilities in our computer vision, our natural language processing, to do a lot of the tasks that, unfortunately, we have relegated humans to do. And our goal is: can we allow humans to function at the top of their license? Irrespective of what the license is, right? It could be a provider, it could be somebody working in the trenches of revenue cycle management, or it could be somebody in a call center talking to a very anxious patient that just learned that he or she might need to take a test in order to rule out something catastrophic, like a very adverse diagnosis. >> Yeah, really fascinating. I mean, do you think that this is just like the tip of the iceberg? I mean, how much more potential does AI have for healthcare? >> Yeah, I think we're very much in the early, early days of AI being applied in a production, practical sense. You know, AI has been talked about for many, many years in the trenches of healthcare. It has found its place very much in challenging status quos in research; it has struggled to find its way in the trenches of just the practicality of the application of AI. And that's partly because, you know, going back to the point that I raised earlier, the cost of the false positive in healthcare is really high. You know, it can't just be, you know, I bought a pair of shoes online, and it recommended that I buy a pair of socks, and I happened to get the socks and I returned them because I realized that they're really ugly and hideous and I don't want them. In healthcare, you can't do that, right? In healthcare, you can't tell a patient or somebody else, oops, I really screwed up, I should not have told you that. So what that's meant for us, in the trenches of delivery of AI-based applications, is we've been through a cycle of continuous pilots and proofs of concept. Now, though, with AI starting to take center stage, where a lot of what has been hardened in the research world can be applied towards the practicality, to avoid the burnout and the sheer cost that the system is under, we're starting to see this real upward tick of people implementing AI-based solutions, whether it's for decision-making, whether it's for administrative tasks, drug discovery. It's just an amazing, amazing time to be at the intersection of practical application of AI and really, really good healthcare delivery for all of us. >> Yeah, I mean, that's really, really fascinating, especially your point on practicality. Now, how do you foresee AI, you know, being able to be more commercial in its appeal? >> I think you have to have a couple of key wins under your belt, that's number one. Number two, the standard sort of outcomes-based publications that are required. And two, I think we need real champions on the inside of systems to support the narrative that we as vendors are pushing heavily on the AI-driven world, or the AI-approachable world, and we're starting to see that right now. You know, it took a really, really long time for providers, first here in the United States, but now internationally, on this adoption and move away from paper-based records to electronic medical records. You know, you still hear a lot of pain from people saying, oh my God, I hate using an EMR, but try to take the EMR away from them for a day or two, and you'll very quickly realize that life without an EMR is extremely hard right now. AI is starting to get to that point where, for us, we always say that Olive needs to pass the Turing test, right?
So when you clearly get this sort of feeling that I can trust my AI counterpart, my AI worker, to go and perform these tasks, because I realize that, you know, as long as it's unbiased, as long as it's data-driven, as long as it's interpretable, and something that I can understand, I'm willing to try this out on a routine basis. But we really, really need those champions on the internal side to promote the use of this safe application. >> Yeah. Well, just another thought here is, you know, looking at your website, you really focus on some of the broken systems in healthcare, and how Olive is uniquely prepared to shine the light on that, where others aren't. Can you just give us an insight into that? >> Yeah. You know, the shine-the-light is a play on the fact that there's a tremendous amount of excitement in technology and AI in healthcare applied to the clinical side of the house. And it's the obvious place that most people would want to invest in, right? It's like, can I bring an AI-based technology to the clinical side of the house? Like decision support tools, drug discovery, clinical NLP, et cetera, et cetera. But going back to what I said, 30% of what happens today in healthcare is on the administrative side. And so that's what we call the really, sort of, dark side of healthcare, where it's not the most exciting place to do true innovation, because you're controlled very much by some big players in the house. And that's why we provide this insight, saying we can shine a light on a place that has typically been very dark in healthcare. It's around these mundane aspects of traditional operational and financial performance that don't get a lot of love from the tech community. >> Well, thank you, Rohan, for this fascinating conversation on how AI is revolutionizing health systems across the country, and also the unique role that Olive is now playing in driving those efficiencies that we really need. Really looking forward to our next conversation with you. And that was Rohan D'Souza, the Chief Product Officer of Olive, and I'm Natalie Erlich, your host for the AWS Startup Showcase, on theCUBE. Thank you very much for joining us, and we look forward to you joining us for the next session. (gentle music)

Published Date : Jun 24 2021


Zach Booth, Explorium | AWS Startup Showcase | The Next Big Thing in AI, Security, & Life Sciences.


 

(gentle upbeat music) >> Everyone, welcome to the AWS Startup Showcase presented by theCUBE. I'm John Furrier, host of theCUBE. We are here talking about the next big thing in cloud, featuring Explorium. For the AI track, we've got AI, cybersecurity, and life sciences. Obviously AI is hot, machine learning powering that. Today we're joined by Zach Booth, director of global partnerships and channels at Explorium. Zach, thank you for joining me today remotely. Soon we'll be in person, but thanks for coming on. We're going to talk about rethinking external data. Thanks for coming on theCUBE. >> Absolutely, thanks so much for having us, John. >> So you guys are a hot startup. Congratulations, we just wrote about you on SiliconANGLE and your new $75 million of fresh funding. So you're part of the Amazon partner network and growing like crazy. You guys have a unique value proposition looking at external data, having a platform for advanced analytics and machine learning. Can you take a minute to explain what you guys do? What is this platform? What's the value proposition, and why do you exist? >> Bottom line, we're bringing context to decision-making. The premise of Explorium, and this is consistent with the framework of advanced analytics, is we're helping customers to reach better, more relevant external data to feed into their predictive and analytical models. It's quite a challenge to actually integrate and effectively leverage data that's coming from beyond your organization's walls. It's manual, it's tedious, it's extremely time-consuming, and that's a problem. It's really a problem that Explorium was built to solve. And our philosophy is it shouldn't take so long. It shouldn't be such an arduous process, but it is. So we built a company, a technology, that's capable, for any given analytical process, of connecting a customer to relevant sources that are beyond their organization's walls. And this really impacts decision-making by bringing variety and context into their analytical processes.
The challenge today isn't so much "What's the right model for my problem?" But it's "What's the right data?" And that's the premise of what we do. Your model's only as strong as the data that it trains on. And going back to that concept of just bringing context to decision-making. Within that framework that we talked about, the key is bringing comprehensive, accurate and highly varied data into my model. But if my model is only being informed with internal data which is wonderful data, but only internal, then it's missing context. And we're helping companies to reach that external variety through a pretty elegant platform that can connect the right data for my analytical process. And this really has implications across several different industries and a multitude of use cases. We're working with companies across consumer packaged goods, insurance, financial services, retail, e-commerce, even software as a service. And the use cases can range between fraud and risk to marketing and lifetime value. Now, why is this such a challenge today with maybe some antiquated or analog means? With a spreadsheet or with a rule-based approach where we're pretty limited, it was an effective means of decision-making to generate and create actions, but it's highly limited in its ability to change, to be dynamic, to be flexible. And with modeling and using data, it's really a huge arsenal that we have at our fingertips. The trick is extracting value from within it. There's obviously latent value from within our org but every day there's more and more data that's being created outside of our org. And that is a challenge to go out and get to effectively filter and navigate and connect to. So we've basically built that tech to help us navigate and query for any given analytical question. Find me the right data rather than starting with what's the problem I'm looking for, now let me think about the right data. Which is kind of akin to going into a library and searching for a specific book. You know which book you're looking for. Instead of saying, there's a world, a universe of data outside there. I want to access it. I want to tap into what's right. Can I use a tool that can effectively query all that data, find what's relevant for me, connect it and match it with my own and distill signals or features from that data to provide more variety into my modeling efforts yielding a robust decision as an output. >> I love that paradigm of just having that searchable kind of paradigm. I got to ask you one of the big things that I've heard people talk about. I want to get your thoughts on this, is that how do I know if I even have the right data? Is the data addressable? Can I find it? Is it even, can I even be queried? How do you solve that problem for customers when they say, "I really want the best analytics but do I even have the data or is it the right data?" How do you guys look at that? >> So the way our technology was built is that it's quite relevant for a few different profile types of customers. Some of these customers, really the genesis of the company started with those cloud-based, model-driven since day one organizations, and they're working with machine learning and they have models in production. They're quite mature in fact. And the problem that they've been facing is, again, our models are only as strong as the data that they're training on. The only data that they're training on is internal data. And we're seeing diminishing returns from those decisions. 
So now suddenly we're looking for outside data, and we're finding that to effectively use outside data, we have to spend a lot of time: 60% of our time is spent thinking of data, going out and getting it, cleaning it, and validating it, and only then can we actually train a model and assess whether there's an ROI. That takes months. And if it doesn't push the needle from an ROI standpoint, then it's an enormous opportunity cost, which is very, very painful, and which goes back to their decision-making: is it even worth it if it doesn't push the needle? That's why there had to be a better way. What we built is relevant for that audience, as well as for companies in the midst of their digital transformation. They're data rich but data science poor: they have lots of data, latent value to extract from within their own data, and at the same time tons of valuable data outside their org. Instead of waiting 18 to 36 months to transform yourself, to get your infrastructure in place, your data collection in place, and only then have models in production based on your own data, you can now do this in tandem. And that's what we're seeing with a lot of our enterprise customers, using their analysts and their data engineers; some of them, in their innovation groups or centers of excellence, have a data science group as well. And they're using the platform to inform a lot of their different models across lines of business. >> I love that expression, "data-rich." A lot of people are becoming full of data, too. They have a data problem: they have a lot of it. I think that connects to my next question. As people look at the cloud, for instance, all these old methods were internal, internal to the company. But now that you have this idea of cloud, more integration is happening. More people are connecting with APIs. There's access to potentially more signals, more data. How does a company go to that next level, to connect in, acquire the data, and make it faster? Because I can imagine that the signals that come from merging external data, and that's the topic of this theme, re-imagining external data, are an extremely valuable signaling capability. It sounds like you guys make it go faster. So how does it work? Is it the cloud? Take us through that value proposition. >> Well, it's amazing how fast organizations have been moving onto the cloud over the past year during COVID, and the fact that alternative or external data, depending on how you refer to it, has really blown up. It's really exciting. This is coming in the form of data providers and data marketplaces, and more and more organizations are moving from rule-based decision-making to predictive decision-making, and that's exciting. Now, what's interesting about this company, Explorium: we're working with a lot of different types of customers, but our long game has a real high upside. There are more and more companies that are starting to use data and are transformed, or are in the midst of their transformation. So they need outside data, and the challenge I described exists for all of them. So how does it really work? Today, if I don't have outside data, I have to think. It's based on hypothesis, and it all starts with that hypothesis, which is already prone to error from the get-go. You and I might be domain experts for a given use case. Let's say we're focusing on fraud.
We might think about a dozen different types of data sources, but going out and getting them, like I said, takes a lot of time, and harmonizing them, cleaning them, and being able to use them takes even more time. And that's just for each one. So if we have to do that across dozens of data sources, it's going to take far too much time, the juice isn't worth the squeeze, and so I'm going to forgo using that data. A metaphor I like to use when I describe what Explorium does to my mom is buying your first home. It's a very, very important financial decision. When you're buying that home, you're thinking about all the different inputs to your decision. It's not just about the blueprint of the house, how many rooms, and the criteria you're looking for. You're also thinking about external variables: the school zone, the construction, the property value, alternative or similar neighborhoods. That's probably your most important financial decision, or one of the largest at least. A machine learning model in production is an extremely important and expensive investment for an organization. Now, as a consumer buying a home, we have all this data at our fingertips to find out all of those external inputs. Organizations don't, which struck me as kind of crazy when I first got into this world. So they're making decisions with their first-party data only. First-party data is wonderful data. It's the best: it's representative, it's high quality, it's high value for their specific decision-making and use cases. But it lacks context. And there's so much context, in the form of location-based data and business information, that can inform decision-making and isn't being used. It translates to sub-optimal decision-making, let's say. >> Yeah, and I think one of the insights around looking at signal data in context is that merging it with first-party data creates a huge value window. It gives you observational data, maybe potentially insights into customer behavior. So totally agree; I think that's a huge observation. You guys are definitely on the right side of history here. I want to get into how it plays out for the customer. You mentioned the different industries; obviously data's in every vertical, and vertical specialization with data has to be very metadata driven. I mean, metadata in oil and gas is different than in fintech. Some overlap, but for the most part you've got to have that context, acute context, for each one. How are you guys working? Take us through an example of someone getting it right: the use case of how someone onboards Explorium, how they put it to use, and what some of the benefits are. >> So let's break it down into a three-step process, and let's use that example of fraud from earlier. An organization would have historical data on how many customers actually turned out to be fraudulent at the end of the day. So this use case, and it's a core business problem, comes with an intention to reduce that fraud. They would provide, going with your description earlier, something similar to an Excel file. This can be pulled from any database out there; we're working with loads of them. They would provide what's called training data. This training data is their historical data, and it has as an output the outcome, the conclusion: was this business fraudulent or not? Yes or no. Binary.
The platform would understand that data itself and train a model with external context in the form of enrichments. These data enrichments are important and relevant, but their purpose, at the end of the day, is to generate signals. To your point, signals are the bottom line: what everyone's trying to achieve, identify, discover, and even engineer, using the data they have and the data they have yet to integrate with. So the platform would connect to your data, infer and understand the meaning of that data, and, based on this matching of internal plus external context, automate the process of distilling signals. In machine learning these are referred to as features, and these features are really the bread and butter of your modeling efforts. If you can leverage features that come from data outside your org, and they're quantifiably valuable, which the platform measures, then you're putting yourself in a position to generate an edge in your modeling efforts. Meaning, you might reduce your fraud rate, so your customers get a much better, more compelling offer or service or price point. It impacts your business in a lot of ways. What Explorium brings to the table in terms of value is a single access point to a huge universe of external data. It expedites your time to value: rather than data analysts, data engineers, and data scientists spending a significant amount of time on data preparation, they can now spend most of their time on feature or signal engineering. That's the more fun and interesting part, not the boring part. And they can scale their modeling efforts. So: time to value, access to a huge universe of external context, and scale. >> So I see two things here; just make sure I get this right, 'cause it sounds awesome. One, the core engineering assets, whether it's platform engineering or data engineering, are optimized for getting more signal, which is more impactful for context acquisition, looking at context that might have a business outcome, versus wrangling and doing mundane heavy lifting. >> Yeah, so with it... sorry, go ahead. >> And the second one is you create a democratization for analysts or business people who are used to dealing with spreadsheets, who just want to play with data and get a feel for it, or experiment, do querying, try to match planning with policy. >> Yeah, so the way I like to communicate this is that Explorium is a one-two punch. It's got a technology layer that provides entity resolution, so matching with external data, which is otherwise a manual endeavor; Explorium has automated that piece. The second is a huge universe of outside data. This circumvents procurement: you don't have to spend all of these one-off efforts and all that time finding data, organizing it, cleaning it, etc. You can use Explorium as your single access point and gateway to external data, and match it with your own. This accelerates your time to value and, ultimately, the number of valuable signals you can discover and leverage through the platform and feed into your own pipelines, or whatever system or analytical need you have. >> Zach, great stuff. I love talking with you, and I love the hot startup action here, because again, you're on the net new wave. Like anything new, I was just talking to a colleague here, (indistinct) when you have something new, it's like driving a car for the first time.
You need someone to give you some driving lessons, or figure out how to operationalize it, or take advantage of the one-two punch, as you pointed out. How do you guys get someone up and running? 'Cause let's just say I'm like, okay, I'm bought into this. No brainer, you got my attention. I still don't understand: do you provide a marketplace of data? Do I need to get my own data? Do I bring my own data to the party? Do you guys provide relationships with other data providers? How do I get going? How do I drive this car? How do you answer that? >> So first, explorium.ai offers a free trial, and we're a product-focused company. A practitioner, maybe a data analyst, a data engineer, or a data scientist, would use the platform to enrich their analytics, so BI decision-making, or any models they're working on, either in production or being trained. Now, oftentimes models that are being trained don't actually make it to production, because they don't meet a minimum threshold, meaning they're not going to have a positive business outcome if they're deployed. With Explorium, you can bring variety into that and increase the chances that the model being trained will actually be deployed, because it's being fed with the right data: the data you need, not just the data you have. How a business would start working with us would typically be with a use case that has high business value. Maybe that's a fraud or risk use case in a B2B or even B2SMB context, or a marketing use case: LTV modeling, lookalike modeling, lead acquisition and generation for CPGs, and field sales optimization. The platform would explore and understand your data, enrich it automatically, generate and discover new signals from external data plus your own, and feed this into either a model that you have in-house, or end to end in the platform itself. We provide customer success to help you build out your first model, perhaps, and hold your hand through that process. But typically, after a few months, most of our customers are building and running multiple models in production on their own. And that's really exciting, because we're helping organizations move on from rule-based decision-making, being their bridge to data science. >> Awesome. I noticed that in your title you handle global partnerships and channels, which I'm assuming means you guys have a network and ecosystem you're working with. What are some of the partnerships and channel relationships that you bring to bear in the marketplace? >> So data and analytics, this space, is very much an ecosystem. Our customers are working across different clouds, with all sorts of vendors and technologies. Basically, they have a pretty big stack. We're part of that stack, and we want to play symbiotically within our customers' stacks, so that we can contribute value whether we sit here, there, or in another place. Our partners range from consulting and system integration firms, those that are building out the blueprint for a digital transformation or actually implementing that transformation, and we contribute value in both of these cases as a technology innovation layer through our product. A customer would then consume Explorium afterwards, after that transformation is complete, as part of their stack. We're also working with a lot of the different cloud vendors.
Our customers are all cloud-based, and data enrichment is becoming more and more relevant alongside some wonderful machine-learning tools, be they AutoML offerings or the data marketplaces that are popping up, which is very exciting. What we bring to the table as an edge is accelerating the connection between the data I think I want as a company and how to actually extract value from that data. Being part of this ecosystem means that we can be working with, and should be working with, a lot of different partners to contribute incremental value to our end customers. >> Final question I want to ask you: if I'm in a conference room with my team and someone says, "Hey, we should be rethinking our external data," what would I say? How would I pound my fist on the table or raise my hand and say, "Hey, I have an idea, we should be thinking this way"? What would be my argument to the team to re-imagine how we deal with external data? >> It might be a scenario where, rather than pounding your hands on the table, you're banging your head on the table, because it's such a challenging endeavor today. Companies have to think about: what's the right data for my specific use cases? I need to validate that data. Is it relevant? Is it real? Is it representative? Does it have good coverage, good depth, and good quality? Then I need to procure that data, and that's about getting a license for it. I need to integrate that data with my own, which means I need in-house expertise to do so. And then, of course, I need to monitor and maintain that data on an ongoing basis. All of this is a pretty big thing to undertake. Having a partner to facilitate that external data integration and the ongoing refresh and monitoring, and being able to trust that it's all harmonized and high quality, and that I can find the valuable sources without having to manually pick and choose and discover them myself, is a huge value add, particularly the larger the organization. Because there's so much data out there, and there's a lot of noise out there too. If I can, through a single partner or access point, tap into that data and quantify what's relevant for my specific problem, then I'm putting myself in a really good position and optimizing the allocation of my very expensive and valuable data analyst and engineering resources. >> Yeah, and I think one of the things you mentioned earlier that I thought was a huge point, a good call-out, is that it goes beyond first-party data. Even on the first-party side, some of the best, most successful innovators that we've been covering at cloud scale are extending their first-party data to external providers: they're in the value chains of solutions that share their first-party data with other suppliers. That's an extension of first-party data. And you're taking it to a whole other level: there's an external set of data beyond that, which is even more important. I think this is a fascinating growth area, and I think you guys are onto it. Great stuff. >> Thank you so much, John. >> Well, I really appreciate you coming on, Zach. Final word: give a quick plug for the company. What are you up to? What's going on? >> What's going on with Explorium? We are growing very fast. We're a very exciting company. I've been here since the very early days, and I can tell you that we have a stellar working environment and a very strong, down-to-earth, high-work-ethic culture.
Our offices in San Mateo, New York, and Tel Aviv are growing rapidly. As you mentioned earlier, we raised our Series C, which brings Explorium's total raised to, I think, $127 million over the past two years and some change. Whether you want to partner with Explorium, work with us as a customer, or join us as an employee, we welcome that. And I encourage everybody to go to explorium.ai. Check us out; read some of the interesting content there around data science, the processes, and the business outcomes that a lot of our customers are seeing, and start a free trial so you can check out the platform and everything it has to offer, from the machine learning engine to the signal studio, as well as what type of information might be relevant for your specific use case. >> All right, Zach, thanks for coming on. Zach Booth, director of global partnerships and channels at Explorium: the next big thing in cloud, featuring Explorium, part of our AI track. I'm John Furrier, host of theCUBE. Thanks for watching.
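As an editorial aside, the workflow Zach describes, starting from first-party training data with a binary label, matching in external context, and keeping only the enrichments that carry measurable signal, can be sketched in a few lines of Python. This is a hypothetical illustration, not Explorium's actual API: the file names, the business_id join key, and the external feature table are all invented, and real entity resolution is far fuzzier than the exact-key merge shown here.

```python
# Hypothetical sketch of external-data enrichment for a fraud model.
# File names, the join key, and the external feature table are invented;
# they stand in for what an enrichment platform would match for you.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import mutual_info_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# First-party training data: one row per business, binary "fraud" label.
internal = pd.read_csv("training_data.csv")

# External context, e.g. location-based and firmographic signals.
external = pd.read_csv("external_features.csv")

# Entity resolution reduced to an exact-key join for illustration; the
# fuzzy name/address matching is the part a platform would automate.
df = internal.merge(external, on="business_id", how="left")

X = df.drop(columns=["business_id", "fraud"]).select_dtypes("number").fillna(0.0)
y = df["fraud"]

# Quantify which candidate enrichments actually carry signal for this
# label, instead of hypothesizing up front which sources are worth buying.
relevance = pd.Series(mutual_info_classif(X, y, random_state=0), index=X.columns)
keep = relevance.sort_values(ascending=False).head(20).index

X_train, X_test, y_train, y_test = train_test_split(
    X[keep], y, test_size=0.2, stratify=y, random_state=0)

model = GradientBoostingClassifier().fit(X_train, y_train)
print("AUC with enriched features:",
      roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```

The order of operations is the point: match external context to your own records first, then let a relevance measure, rather than an up-front hypothesis, decide which signals survive into training.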

Published Date : Jun 24 2021


3 Quick Wins That Drive Big Gains in Enterprise Workloads


 

>> Hey, welcome to Analytics Unleashed. I'm Robert Christensen, your host. Thank you for joining us. Today we have three quick wins that drive big gains in enterprise workloads, and we have Olaf with Ericsson, John with ORock, and Dragan with DXC. Welcome, and thank you for joining me, gentlemen. >> Yeah, good to be here, thank you. >> Thank you, good to have you. >> Hey, Olaf, let's start off with you. What big problems are you trying to solve today for those quick wins? What's top of mind? >> When we started looking into microservices for our financial platform, we immediately saw the challenges that we had, and we wanted a strong partner. We had a good relationship with HPE before, so we turned to HPE, because we know they have the technical support that we need, the possibilities that we need in our platform to fulfill our requirements, and also the reliability that we would need. >> Tell me, I think this is really important: you guys are going into the digital wallet space, is that correct? >> Yeah, that's correct. We're a financial platform, so we're spanning across the world and delivering our financial services to our end customers. >> Well, that's not classically what you hear about Ericsson diving into. What really started you guys down that path, and specifically these big wins around digitization? >> What we could see early on was that we have the mobile networks, so we have a strong user base within those networks. And where we started, in the emerging markets, you normally have a lot of unbanked people, and those people were the ones we wanted to target. So instead of using your cash, for example, to buy your fruits or pay your electricity bill, etc., you could use your mobile wallet. That's how it all started, and now we're also turning to the emerged markets, the Western part of the world, etc. >> That's fantastic. Hey, I want to talk to John here. John's with ORock, and he's one of the early adopters of these container platforms for the federal government here in the United States. Tell us a little bit about that program and what's going on with it, John. >> Yeah, sure, absolutely, appreciate it. So with ORock, what we've done is develop one of the first FedRAMP-authorized container platforms, which runs in our Moderate, and soon to be High, cloud. Building on the Ezmeral platform gave us the capability of offering customers, both commercial and federal, the capability and the flexibility of running their workloads in an as-a-service model, where they can customize. Typically, customers have to either build it internally, or, if they go to the cloud, take what resources are available and then tweak their designs to what they need. With this architecture, built on open source and on our own infrastructure, we offer very low cost and zero-egress capability, but also the workload processing they need to run data analytics, machine learning, and other types of high-performance processing that they typically need as we move forward in this computer age. >> So John, you touched on a topic that I think is really critical: you mentioned open source. Why is open source a key aspect of this transformation that we're seeing coming in the next decade? >> Yeah, sure. We shifted the company early on to open source only, to offer flexibility; we didn't want to be set on one particular platform to operate within. So we built the cloud infrastructure on open source, as an open architecture that we can scale and grow within. Because of that, we had one of the very first FedRAMP authorizations built on open source, not on a specific platform. And what we've seen from that is increased performance capability, as well as the flexibility to add additional components that you typically don't get on other platforms. So it was a good move we made, and one that the customer will definitely benefit from. >> That's huge, actually, because performance leads to better cost, and better cost leads to better performance. I'm just super happy with all the advanced work you're doing there; it's fantastic. And Dragan, you're in a space that I think is really interesting: you're dealing with what everybody likes to talk about, autonomous vehicles. You're working with automobile manufacturers, dealing with data at a scale that is unprecedented. Can you open that door for us and talk about the big wins you're trying to get over the line with these enterprises? >> Yeah, absolutely, and thank you, Robert. We approached leveraging Ezmeral from the data fabric angle. We have fully integrated the Ezmeral Data Fabric into our Robotic Drive solution. The Robotic Drive solution is a game changer, as you mentioned, in accelerating the development of autonomous driving vehicles. It's an end-to-end, hyperscale machine learning and AI platform, based, as I mentioned, on the Ezmeral Data Fabric, which is used by some of the largest manufacturers in the world for the development of their autonomous driving algorithms. We're pretty proud that we're one of the leaders in providing hyperscale machine learning platforms for car manufacturers. Some of them I cannot talk about, but BMW is one of the manufacturers we provide these types of solutions to, and they have publicly spoken about their D3 platform, the data-driven development platform. Just to give you an idea of the scale, as Robert mentioned: daily, we collect over 1.5 petabytes of raw data. >> Did you say daily? >> Daily. The storage capacity is over 250 petabytes and growing. There are over 100,000 cores and over 200 GPUs in the compute area. Over 50 petabytes of data are delivered every two weeks into hardware-in-the-loop for testing. And daily we have thousands of engineers and data scientists accessing the relevant data and developing machine learning models. Part of it is simulation: simulation cuts the cost, as well as the time, of developing the autonomous driving algorithms, and the simulations take probably 75 percent of the research being done on this platform. >> That's amazing, Dragan. The more I get involved with that, and I've been part of these conversations with a number of the folks involved with it, the computer scientist in me, my geekiness, my little propeller head, comes out; it just blows my mind. So I'm going to pivot back over to Olaf. Olaf, you're talking about something that is a global network of financial services, correct? >> Correct. >> And there's the flow of transactional, typically non-relational transactional data, the actual transactions going through. You have issues of potential fraud, issues of safety, and multi-geographic, regional problems with data and data privacy. How are you guys addressing that today? >> To answer that question: today we have managed to solve that using the container platform together with the data fabric. But as you say, we need to span across different regions, and we need to have the data as secure as possible, because we have a lot of legal aspects to look into; if our data disappears, then your money also disappears. So it's a really important area for us, the security and the reliability of the platforms, and that's why we went this way, to make sure that we have this strong partner who could help us with it. Just looking at where we are deployed: we're in more than 23 countries today, and we're processing more than 900 million US dollars per day in our systems currently. So there's a lot of money passing through, and you need to take security seriously; it's a very important point. >> It really is, it really is. And so, John, you're obviously dealing with a lot of folks that have three-letter acronyms, the government agencies, and they range in various degrees of security. When you say FedRAMP, could you articulate why the Ezmeral platform was something you selected for that FedRAMP-compliant container platform? Because I think that kind of speaks to the industrial strength of what we're talking about. >> Yeah, it all comes down to being able to offer a product that's secure and that the customers can trust. FedRAMP has very stringent security requirements, with monthly POA&Ms, which are performance reviews and updates that need to be done, if not on a daily basis then on a monthly basis. So there's a lot that goes on behind the scenes that customers aren't able to see. By selecting the HPE Ezmeral platform for containers, one of the key strengths we looked at was the Ezmeral data fabric, and it's all about the data: securing the data, moving the data, transferring the data. From a customer's perspective, they want to be able to operate in an environment they can trust, no different than being able to turn on their lights or making sure there's water in their utilities. Containers with the Ezmeral platform, built on ORock's infrastructure, give that capability. FedRAMP enables the security tied to the platform: it's government guided, and it includes hundreds of controls that customers typically don't have the time or the capability to address. So our commercial customers benefit, and our federal customers are able to follow and check the box on meeting those requirements. The container platform gives us a capability where we're able to move files through the Ezmeral fabric, and then run the workloads in the containers themselves, with isolation. The security element of FedRAMP-ing Ezmeral gave us the capability to make that environment FedRAMP authorized, so customers benefit from the security, have confidence in running their workloads using their data, and are able to focus on their core job at hand and not worry about their infrastructure. >> The fundamental requirement, isn't it, is that isolation between compute and storage, and going up a layer in a way that provides them a set of services where they can, I wouldn't say set it and forget it, but really have the confidence that what they're getting is the best performance for the dollars they're spending. John, my hat's off to the work that you all do there. >> Thank you, we appreciate it. >> Yeah. And Dragan, I want to pivot a little bit here, because you are primarily the operator of what I consider one of the largest data fabrics on the planet, for that matter. And I just want to talk a little bit about the openness of our architecture, all the multiple protocols we support that allow people who may have selected a different set of application deployment models and virtualization models to plug into the data fabric. Can you talk a little bit about that? >> Yeah. In my mind, to operate such a data fabric at scale, there were three key elements we were looking for, and we found them in the Ezmeral Data Fabric. The first one was speed, cost, and scalability. The second one was the globally distributed data lake, the ability to distribute data globally. And third was certainly the strength of our partnership, with HPE in this case. If you look at the Ezmeral Data Fabric, it's fast, it's cost effective, and it's certainly highly scalable, because, as you just mentioned, we stretch the capabilities of the data fabric to hundreds of petabytes and over a million data points, if you will. What was important for us is that the Ezmeral Data Fabric eliminates the need for multiple vendor solutions, which would otherwise be required, because it provides an integrated file system, a database, a data lake, and the data management on top of them. Usually you would need to incorporate multiple tools from different vendors. And the file system itself is so important when you're working at scale like this; honestly, in our research, there are maybe three file systems in the world that can support this size of data fabric. The distributed data lake was also important to us, and the reason is that these large car manufacturers are testing, and have testing vehicles, all around the world; they're not just doing it locally around their data centers. So collecting the data, in this 1.5-petabytes-a-day example for BMW, is really challenging unless you have the ability to leverage the data in a distributed data lake fashion, where data can reside in different data centers globally, or even in on-premise and cloud environments. That became very important later, because a lot of these car manufacturers have OEMs that would like to get either portions of the data, or access to the data, in different environments, not necessarily in their data center. And truly, to build something at this scale, you need a strong partner. We certainly had that in HPE, and we got comprehensive support for the software, but more importantly, a partner that clearly understood the criticality of the data fabric and the need for very fast response to our clients. Jointly, I think we met all the challenges, and in so doing, I think we made the Ezmeral Data Fabric a much better and stronger product over the last few years. >> That's fantastic, thank you, Dragan, appreciate it. Hey, so as we wrap up here, any last words, Olaf, you want to share with us? >> We're looking forward, from our perspective, to helping out with the COVID-19 situation that we have: enabling people to still be in the market without actually touching each other, maybe avoiding the physical market and being at home, etc., while doing those transactions. >> That's great, thank you. John, last comment? >> Yeah, thanks. Look for a joint offering announcement coming up between HPE and ORock, where we're going to be offering sandbox as a service for data analytics and machine learning, where people can actually test drive the environment as a service, and if they like it, they can move into a production environment. So stay tuned for that. >> That's great, John, thank you for that. And hey, Dragan, last words? >> Yeah, last words: we're pretty happy with what we have done already for car manufacturers. We're taking this solution, in terms of the distributed data lake capabilities as well as the hyperscale machine learning and AI platform, to other industries, and we hope to do it jointly with you. >> Well, we hope that you do it with us as well. So thank you very much, everybody. Gentlemen, thank you so much for joining us, I appreciate it. >> Thank you very much. >> Thank you very much. >> Hey, this is Robert Christensen with Analytics Unleashed. I want to thank all of our guests here today, and we'll catch you next time. Thank you for joining us. Bye. [Music]

Published Date : Mar 17 2021


Breaking Analysis: Big 4 Cloud Revenue Poised to Surpass $100B in 2021


 

>> From theCUBE studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR, this is Breaking Analysis with Dave Vellante. >> There are four A players in the IaaS slash PaaS hyperscale cloud services space: AWS, Azure, Alibaba, and Alphabet. Pretty clever, huh? In our view, these four have the resources, the momentum, and the stamina to outperform all others virtually indefinitely. Combined, we believe these companies will generate more than $115 billion in 2021 IaaS and PaaS revenue. That is a substantial chunk of market opportunity, and it is growing as a whole in the mid-30% range in 2021. Welcome to this week's Wikibon Cube Insights, powered by ETR. In this Breaking Analysis, we are initiating coverage of Alibaba for our IaaS and PaaS market segments, and we'll update you on the latest hyperscale cloud market data and survey data from ETR. Big week in hyperscale cloud land: Amazon and Alphabet reported earnings, and AWS CEO Andy Jassy was promoted to lead Amazon overall. I interviewed John Furrier on theCUBE this week. John has a close relationship with Jassy and a unique perspective on these developments. We simulcast the interview on Clubhouse, then hosted a two-hour Clubhouse room that brought together all kinds of great perspectives on the topic, and then we took the conversation to Twitter. Now, in that discussion, we were just riffing on our updated cloud estimates and our numbers, and here's the tweet that inspired the addition of Alibaba. This gentleman is a tech journalist out of New Delhi, and he pointed out that we were kind of overlooking Alibaba. I responded that no, we're not discounting them; we just need to do more homework on the company's cloud business. He also said we're ignoring IBM, but really, they're not in this conversation as a hyperscale IaaS competitor to the big four in our view, and we'll just leave it at that for now on IBM. But back to Alibaba and the big four: we actually did some homework, so thank you for that suggestion. This chart shows our updated IaaS figures and includes the full year 2020, which was pretty close to our Q4 projections. The big change is we've added Alibaba to the mix. These four companies last year accounted for $86 billion in revenue, and they grew at a 41% combined rate relative to 2019. Notably, Azure revenue for the first time is more than half of AWS's revenue; AWS's revenue topped $45 billion last year, which is just astounding. Alibaba, you'll note, is larger than Google Cloud, the Google Cloud Platform, I should say GCP, with Alibaba at just over eight billion. Now, the reason Baba is such a formidable competitor is that the vast majority of its revenue comes from inside China, and the company does have plans to continue its international expansion, so we see Alibaba as a real force here. Their cloud business showed positive EBITDA for the first time in the history of the company last quarter, so that has people excited. Google, as we've often reported, is far behind AWS and Azure; despite its higher growth rates, Google's overall cloud business lost $5.6 billion in 2020, which has some people concerned. We, on the other hand, are thrilled, because, as we've reported, in our view Google needs to get its head out of its ads; cloud is its future. And we're very excited about the company pouring investments into its cloud business.
Look, with $120 billion essentially on the balance sheet, we can't think of a better use of its cash. Now, I want to stress that these figures are our best effort to create an apples-to-apples comparison across all four clouds. Many people have asked how much of these figures represents, for example, Microsoft Office 365 or Google G Suite, which by the way is now called Workspace. And the answer is: our intention is $0. These are our estimates of worldwide IaaS and PaaS revenue. Some have said we're too low; some have said we're too high. Hey, if you have better numbers, please share them; happy to have a look. Now, you may be asking, what are the drivers of these figures and the growth we're showing here? Well, all four of these companies are of course benefiting from an accelerated shift to digital as a result of COVID, but each one has other tailwinds. For example, AWS is capitalizing on its large head start; it has created tremendous brand value. As well, while we estimate that more than 75% of AWS revenue comes from compute and storage, AWS's feature and functional differentiation, combined with its large ecosystem, is very much a driving force of its growth. In the case of Azure, in addition to its captive software application estate, the company on its earnings call cited strong growth in its consumption-based business across all of its industries and customer segments. As we've said many times, Microsoft makes it really easy for customers to tap into Azure, with a true consumption pricing model: no minimums, cancel any time. Those kinds of terms make it extremely attractive to experiment and get hooked; we certainly saw this with AWS over the years. As for Google, its growth is being powered by its outstanding technology, in particular its prowess in AI and analytics. As well, we suspect that much of the losses in Google Cloud are coming from large go-to-market investments for Google Cloud Platform, and they're paying growth dividends. Now, as Tim Crawford said on Twitter, $6 billion, you know, that's not too shabby. Also, Google cited wins at Wayfair and Etsy, which Google is putting forth, in our view, to signal the many retailers that might be reluctant to do business with Amazon, of course a big retail competitor. These are two high-profile names; we'd like to see more in future quarters, and likely will. Now let's give you another view of this data and paint a picture of how the pie is being carved up in the market. Actually, we'll use bars, because my millennial sounding boards hate pie charts, and I like to pay attention to these emerging voices. At any rate, amongst these four, AWS has more than half of the market. AWS and Azure are well ahead of the rest, and we think they'll continue to hold serve for quite some time. While we're impressed with Alibaba, they're currently constrained to doing business mostly in China, and we think it'll take many years for Baba and GCP to close the gap on the two leaders, if they ever get there. Now let's take a look at what the customers are saying in the ETR survey data. The chart we're showing here is the XY chart we show all the time: net score, or spending momentum, on the vertical axis, and market share, or pervasiveness in the survey data set, on the horizontal axis. In the upper right, you can see the net scores and the number of mentions for each company, the detail behind this data.
What we've done here is cut the January survey data of 1,262 respondents, and you can see that filtered in there on the left; we've filtered the data by cloud, meaning the respondents are answering about the companies' cloud computing offerings only. We're filtering out the non-cloud spend; that's a nice little capability of the ETR platform. Azure is really quite amazing to us: it's got a net score of 72.6%, and that's across 572 responses out of the 1,262. AWS is the next most pervasive in the data set, with 492 shared accounts and a net score of 57.1%. Now, you may be wondering, why is Azure bigger in the data set than AWS, when we just told you that the opposite is the case in the market on the previous slide? And the answer is: this is a survey, and there's a lot of Microsoft out there; they're everywhere. I have no doubt that the respondents' notion of cloud doesn't directly map into IaaS and PaaS views of the world, but the trends are clear and consistent: Amazon and Azure dominate in this market space. For context, we've included functions, in the form of AWS Lambda, Azure Functions, and Google Cloud Functions, because, as you can see, there's a lot of spending momentum in these capabilities, in these services. You'll also note that we've added Alibaba to this chart, and it's got a respectable 63.6% net score, but there are only 11 shared responses in the data, so don't take those numbers to the bank. But look, 11 data points, we'll take it; it's better than zero data points. We've also added VMware Cloud on AWS to this chart, and you can see that that service has momentum. You can see the ones we've highlighted above the 40% red dotted line; that's where the real action in the market is. All of those offerings have very strong or strong spending velocity in the ETR data set. Now, for context, we've put Oracle and IBM in the chart, and you can see they've both got a decent presence in the data set: 132 mentions and 81 responses, respectively. Oracle has a positive net score of 16.7%, and IBM is at a negative 6.2%. Remember, this is for their cloud offerings as the respondents in the data set see them. So what does this mean? It says that among the 132 survey respondents who say they use Oracle cloud, 16.7% more customers are spending more on Oracle's cloud than are spending less. In the case of IBM, it says more customers are spending less than are spending more. Both companies are in the red zone and show far less momentum than the leaders. Look, I've said many times that the good news is that Oracle and IBM at least have clouds. But they're not direct competitors of the big four in our view; they're just not. They have large software businesses, and they can migrate their customers to their respective clouds and market hybrid cloud services. Their definition of cloud is most certainly different from that of AWS, which is fine, but both companies use what I call a kitchen-sink method of reporting their cloud business. Oracle includes cloud and license support, often with revenue recognition at the time of contract with a renewable term, and it also includes on-prem fees for things like database and middleware; if you want to call that cloud, fine. IBM is just as bad, maybe worse, and includes so much legacy stuff in its cloud number that it hides the ball.
It's just not even worth trying to unpack for this episode. I have previously, and frankly it's just not a good use of time. Now, as I've said before, both companies are in a game where they can make good money provisioning infrastructure to support their respective software businesses. I just don't consider them hyperscale-class clouds, which are defined by the big four, and really only those four. And I'm sure I'll get hate mail about that statement; I'm happy to defend that position, so please reach out. Okay, but one other important thing we want to discuss is something that came up this week in our Twitter conversation. Here's a tweet from Matt Baker, who heads strategic planning for Dell. He was responding to someone who commented on our cloud data, basically asking: with all that cloud revenue, who took the hit? Which pockets did it come out of? And Matt was saying, look, it's coming out of customer pockets, but can we please end this zero-sum-game narrative? In other words, a dollar for cloud doesn't translate into a lost dollar from on-prem for the legacy companies. So let's take a look at that. First, I would agree with Matt Baker: it's not a one-for-one swap of spend, but there's definitely been an impact. And here's some data from ETR that can maybe give us some insight. What this chart shows is a cut of 915 hyperscale cloud accounts, accounts within those big four, and within those accounts we show the spending velocity, or net score, cut by sectors representative of the on-prem players: servers, storage, and networking. So we cut the data on those three segments, and we're looking here at VMware, Cisco, Dell, HPE, and IBM, for 2020 and into 2021. It's an interesting picture: it shows the net scores for the January, April, July, and October 2020 surveys and the January 2021 survey. Now, all the on-prem players were of course impacted by COVID. IBM seems to be the counter-trend line; not that they weren't impacted, but they have this notable mainframe cycle thing going on, and they're in a down cycle now, so it's kind of the opposite of the other guys in terms of survey momentum. And you can see pretty much all the others showing upticks headed into 2021; Cisco is kind of flattish, but stable, and held up a bit. So to Matt Baker's point, despite the 35% or so growth expected for the big four in 2021, the on-prem leaders are showing some signs of positive spending momentum. Let's dig into this a little further, because we're not saying cloud hasn't hurt on-prem spending; of course it has. Here's that same picture over a 10-year view. You're seeing this long, slow decline occur, and it's no surprise if you think about the prevailing model for servers, storage, and networking, on-prem in particular. Servers have been perpetually underutilized, even with virtualization. With the exception of workloads like backup jobs, there aren't many that can max out server utilization, so we kept buying more servers to give us performance headroom and ran at 20, 30% utilization on a good day. Yes, I know some folks can get up over 50%, but generally speaking, servers are well underutilized. And in storage, my gosh, it's kind of the same story, maybe even worse, because for years it was powered by a mechanical system: more spindles required to gain performance, lots of copying going on, lots of pre-flash waste.
And in networking, it was a story of: got to buy more ports, you've got to buy more ports. In these segments, customers were essentially forced into this endless cycle of planning, procuring, and managing: first planning, then securing the CapEx, then procuring, then over-provisioning, then managing, ongoing. So then along comes AWS and says, try this on for size, and you can see from that chart the impact of cloud on those bellwether on-prem infrastructure players. Now, just to give you a little more insight on this topic, here's a picture of the wheel charts from the ETR data set, for AWS, Microsoft, and Google, and we brought in VMware to compare them. A wheel chart shows the percentage of customers saying they'll either add a platform new, that's the lime green; increase spending by more than 5%, that's the forest green; spend flat relative to last year, that's the gray; spend less, by more than 5% down, that's the pinkish; or leave the platform, that's the bright red. You subtract the red from the green and you get a percentage that represents net score. AWS, with a net score of 60%, is off-the-charts good. Microsoft, remember, this includes the entire Microsoft business portfolio, not just Azure, so it's still really strong. Google, frankly, we'd like to see higher net scores. And VMware is, you know, the gold standard for on-prem, so we include them so you can see, for reference, they're strong, but notice they've got much, much bigger flat spending, which is what you would expect from these more mature players. Now let's compare these scores to the other on-prem kings. This is not surprising to see, but the greens go down and the flats, that gray area, go up compared to the cloud guys, and the red, which is virtually non-existent for AWS, goes into the high teens, with the exception of Cisco, which, despite its exposure to virtually all industries, including those hard hit by COVID, shows pretty low red scores. So that's good. And I've got to share one other: look at this wheel chart for Pure Storage. We're really not sure what's happening here, but it's impressive. We're seeing a huge rebound, and you can see we've superimposed a candlestick comparing previous quarters' surveys; look at the huge uptick in the January survey for Pure, that blue line highlighted in the red dotted ellipse. It jumps to a 63% net score from below 20% last quarter. You know, we'll see; I've never seen that kind of uptick before for an established company. Maybe it's pent-up demand or some other anomaly in the data; we'll find out when Pure reports in 2021, because remember, these are forward-looking surveys. But the point is, you still see action going on in hybrid and on-prem, despite the freight train that is cloud coming at the legacy players. Not that Pure is legacy, but, you know, it's no longer a lanky teenager. And I think the bottom line, coming back to Matt Baker's point, is that there are opportunities the on-prem players can pursue in hybrid and multi-cloud, and we've talked about this a lot: building an abstraction layer on top of the hyperscale clouds and letting them build out their data center presence worldwide and spend on CapEx; they're going to outspend everybody. And these on-prem, hybrid, and multi-cloud folks are going to have to add value on top of that.
Now, if they move fast, no doubt they'll be acquiring startups to make that happen. They're going to have to put forth the value proposition and execute on it in a way that adds clear value above and beyond what the hyperscalers are going to do. The challenge is picking the right spots, moving fast enough, and balancing Wall Street promises with innovation; there's that same old dilemma. Let's face it: Amazon for years could lose tons of money and not get killed in the Street. Google has so much cash they can't spend it fast enough, and Microsoft, after years of going sideways, has finally figured it out, and then some. Alibaba is new to our analysis, but it's looking like the Amazon of China, plus Ant, despite its regulatory challenges with the Chinese government. So all four of these players are in the driver's seat in our view, and they're leading not only in cloud, but in AI. And of course, the data keeps flowing into their clouds, so they really are in a strong position. The bottom line is we're still early in the cloud platform era, and it's morphing from a collection of remote cloud services into this ubiquitous, sensing, thinking, anticipatory system that's increasingly automated and working toward full automation. It's intelligent, and it's hyper-decentralizing toward the edge. One thing's for sure: the next 10 years are not going to be the same as the past 10. Okay, that's it for now. Remember, I publish each week on Wikibon.com and siliconANGLE.com; these episodes are all available as podcasts, just search for Breaking Analysis podcast. You can always connect on Twitter, I'm @dvellante, or email me at david.Vellante@siliconANGLE.com. I love the comments on LinkedIn, and of course on Clubhouse, the new social app, so please follow me so you can get notified when we start a room and riff on these topics. And don't forget to check out etr.plus for all the survey action. This is Dave Vellante for theCUBE Insights powered by ETR. Be well, and we'll see you next time. (upbeat music)
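The net score arithmetic described above is simple enough to pin down in a few lines. Here is a minimal sketch; the bucket percentages in the example are made up for illustration, and ETR's actual methodology may differ in details such as rounding or how respondents are bucketed.

```python
def net_score(adding, spending_more, flat, spending_less, leaving):
    """Wheel-chart net score: subtract the reds (spending less, leaving)
    from the greens (adding new, spending more); flat doesn't count."""
    total = adding + spending_more + flat + spending_less + leaving
    assert abs(total - 100.0) < 1e-6, "buckets should sum to 100%"
    return (adding + spending_more) - (spending_less + leaving)

# Made-up example: 15% adding new, 50% spending more, 30% flat,
# 4% spending less, 1% leaving -> a 60 net score, like the AWS figure.
print(net_score(15, 50, 30, 4, 1))  # 60.0
```

On the XY charts referenced in the episode, this value is the vertical axis, and anything above the 40 line is treated as elevated spending velocity.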

Published Date : Feb 5 2021

Big Ideas with Alan Cohen | AWS re:Invent 2020


 

>> From around the globe, it's theCUBE, with digital coverage of AWS re:Invent 2020, special coverage sponsored by AWS worldwide public sector. >> Okay, welcome back everyone to theCUBE's virtual coverage of AWS re:Invent 2020. This is theCUBE Virtual. I'm your host, John Furrier, with theCUBE. TheCUBE is normally there in person; this year it's all virtual. We're doing the remote interviews and bringing in commentary and discussion around the themes of re:Invent, and today is worldwide public sector day. The theme from Teresa Carlson, who heads up the entire team, is to think big and look at the data. And I wanted to bring in a special CUBE alumni and special guest, Alan Cohen, who's a partner at Data Collective Venture Capital, or DCVC, which we've known for many, many years. Its founders, Matt Ocko and Zach Bogue, started the firm about 10 years ago, really on the big data wave, and it has grown into a really big firm. Big data, Data Collective, big ideas, that's the whole purpose of your firm. Alan, you're now a partner, retired, I mean a venture capitalist, over at Data Collective. Great to see you. Thanks for coming on. >> Great to see you as well, John. Thanks for having me this morning. >> I love to joke about being retired, because the VC game is not retirement for you. You guys have made some investments. Data Collective has a unique philosophy, because you invest in essentially moonshots: big ideas, hard problems. And if I look at what's going on with Amazon, specifically in the public sector, genome sequencing is now available in what they call the Open Data Registry. You've got healthcare expanding, huge demand in education, real societal benefits, cybersecurity, and space becoming contested, with more contention and congestion up there. There are a lot of really hard science problems that the cloud and AI are enabling, and you're investing in entrepreneurs who are trying to solve those problems. What's your view of the big ideas? What are people missing? >> Well, I don't know if they're missing it, but what I'd say, John, is that we're starting to see a shift. If you look at the last, I don't know, forever, 40, 50 years in the IT and tech industry, we took a lot of atoms, we built networks and data warehouses and server farms, and we kind of created software with it. So we took atoms and turned them into bits. Now we're seeing things move in the other direction, where we're targeting bits, software, artificial intelligence, massive amounts of compute power, which you can get from companies like AWS, and now we're creating better atoms. That means better medicines and vaccines. We're an investor in a company called AbCellera, which is behind the therapeutic treatment that J&J has taken to market. People are actually making space a commercial business, and it's not a science fiction novel: we're investors in Planet Labs and Rocket Lab and Capella Space, so people can see, right down to you sitting on the terrace in your backyard, from a satellite that was launched by a private company without any government money. You talked about gene sequencing, folding of proteins. So I think the big ideas are that we can look at some of the world's most intractable issues and problems, and we can go after them and turn them into commercial opportunities.
And we wouldn't have been able to do that before, without the advent of big data and, obviously, the processing capabilities and now the artificial intelligence that are available from things like AWS. So it's kind of payback to the physical world from the virtual world. >> Capella Space was featured in the keynote by Teresa Carlson, great tie-in there, and these are exactly the kinds of hard problems I mean. I want to get your take, because for entrepreneurs it reminds me of the old days. Go back to the dot-com era, when that bubble was going on, and through the different cycles and the different waves, the consumer side always got the best valuations and the most attention. And now B2B is hot, the enterprise is super hot, mainly because of Amazon. >> Sure, and the DoorDash IPO this morning, obviously. >> The DoorDash IPO. I didn't get a phone call for friends and family, and I'm one of their top customers. They started in Palo Alto; we've known them since the early days. These companies are getting massive. Zoom, too. The post-pandemic world is coming, and it's going to be a hybrid world. I think there's clear recognition that economic value comes from being digitally enabled, using cloud and AI for efficiencies and for creating new things. But it's going to get back to the real world, and there are still hard problems out there. I mean, look at all the valuations. >> Well, there are always hard problems, but what's different now, from the perspective of venture and investors, is that you can go after really hard problems with venture-scale levels of investment. Traditionally you'd think of these things as belonging to a division of a company like J&J or General Electric or some very massive global corporation. But because of the capabilities that are available in the computing world, as well as great scientific research, and we probably fund more PhDs as founders than any other type of background, founders can go after these things and create. We have a company called Pivot Bio, and I think I've spoken to you about them in the past, John. They have created a series of microbes that perform a process called nitrogen fixation, attaching nitrogen to the roots of corn, sorghum and wheat, so you don't have to use chemical fertilizer. Those microbes were all created through an enormous amount of machine learning. And where did that machine learning come from? So what does that mean? It means addressing climate change. It means more profitable farmers. It means better water and air management, all major issues in our society that we wouldn't have been able to go after, certainly not with a venture-scale level of investment to get it started, if we didn't have the computing capabilities we have today. So I think what's holding a lot of people back is a paucity of imagination. You actually have to take these intractable problems and say, how can I solve them, and then tear them apart to their actual molecules, a little inside joke, right, and then move that through. And this means you have to be able to invest in and work on things for a long time. These companies don't happen in two or three years, or five years. They sometimes take seven, ten, 15 years. It's life's work for people.
But we're seeing that everywhere. I mean, Rocket Lab, a company of ours out of New Zealand and now out of DC, which actually launched the last couple of those satellites, they print their rocket engines with a 3D printer, a metal printer. So think about that. How did all that come to bear? It started as a venture-scale style of investment. Peter Beck, the founder of that company, had a dream to launch a rocket once a year, then once a month, then once a week, and eventually once a day. So he's effectively creating a huge upswing in the ability of people to commercialize space. And then what does space do? It gives you better observability of the planet, not just from a security point of view, but from a weather and a commerce point of view. So all kinds of things that looked like they were very difficult to go after now start to become enabled. >> I love your investment in Capella Space, because I think it speaks volumes. One of the things the founder was talking about is that getting the data down is the hard part. He's up there now, he can see everything, but now he's got to get the data down, because, say, for the wildfires in California, or weather events happening around the globe, now that you have the observation capability in space, you've got to get the data down to the ground. That's the huge scale challenge. >> Well, let me give you something else. You know, we're in a fairly difficult time in this country because of the COVID virus, and yet, maybe as quickly as next week, we're going to start to deliver vaccines and therapeutics, even though not as many as we'd like, into this virus situation, literally within a year. How did all that happen, in one of the worst public health crises of our lifetimes, and maybe of the past century? The ability to use computing power to assist in the laboratory, in the development of pharmaceuticals and therapeutics, is a huge change. This was an intractable problem, because with the traditional methods of creating vaccines, which take anywhere from three to seven years, we would have a much worse public health crisis. I'm not saying this one is over, we're in a really difficult situation, but the worst public health crisis of our lifetime is starting to be addressed because of the ability of people to apply technology and accelerate the creation of vaccines. >> Great points, absolutely amazing. Let's just pause on that, let's double down and unpack it, think about it for a second. The Amazon highlight in Andy Jassy's keynote was Carrier, which makes air conditioning, and also does refrigeration and transport. One IoT application leveraging their cloud is what they call the cold chain: managing the value chain of transport, making sure food, and in this case vaccines, stay cold. They saw huge value there, including reducing carbon emissions, because the waste involved in food alone was a problem, and for the vaccine they have to maintain that cold chain. Can you hear me? >> Maybe this year the cold chain is more valuable than the blockchain. Yeah. >> Cold chain. Sounds like a band, Coldplay.
I had to get that in, Linda loves Coldplay. But if you think about where we are, to your point, imagine if this had hit 15 or 20 years ago. YouTube was just hitting the scene back then, so with that kind of culture we didn't have Zoom; education would have been over Skype, and there was no bandwidth. You lived through the bandwidth wars in your career: no bandwidth, no video conferencing, no real IoT, no real supply chain management, and therapeutics would have taken years. What's your reaction to that? Compare and contrast it with what's on full display on the world stage right now around digital enablement and digital transformation. >> Well, look, ultimately I'm an optimist because of what this technology allows you to do. I'm a realist in that we're going to lose a lot of people because of this virus, but we're also going to be able to reduce a lot of pain for people, and potentially death, because of the ability to accelerate how we react. The thing that I look for and hope for, so when Teresa says think big, the biggest lesson I think we've learned in the last year, is how to build resilience. All kinds of parts of our economy, our healthcare systems, our personal lives, our education, our children, even our leisure time, have been tested from a resilience point of view, and technology has stepped in to become an enabler of that resilience. People don't love Zoom school, but without Zoom school there is no school, right? Which is why Zoom has become an indispensable utility in our lives. Whether you're on it too much or you've got Zoom fatigue, does that really matter, compared with the alternative of calling into a conference line to listen to your teacher? The same goes for the ability to repurpose our supply chain. We're going to see a lot of change in the global supply chain, whether it's re-domestication of manufacturing or a tightening of it, because we're never going to go without PPE and other vital elements again. We've seen entire industries repurposed from B2B to B2C in their ability to package, deliver and service customers. Those are forms of resilience. >> And taking that to the next level, think about what's actually happening on full display. In my one-on-one with Andy Jassy prior to the event, and he laid this out on stage too, he talks about every vertical being disrupted. And Dr. Matt Wood, the machine learning lead there, and Swami say, hey, with cloud compute, with chips now, and with AI and machine learning, every global industry vertical is going to be disrupted. And I get that; we've been saying that on theCUBE for a long time, that it's just going to happen. So we've been on this wave of horizontal scalability and vertical specialization, with data and modern applications, with machine learning making customization really high-fidelity decisions, or as you say, down to the molecular, atomic level. But this is clearly what I found interesting.
And I want to get your thoughts, because you have, one, been there and done that through many waves of innovation, and now you're an investor, a leading investor. >> Investor, and you made up a word. I like it. Okay. >> Jassy talks about leadership to invent and reinvent: you can't fight gravity, you've got to get talent hungry for invention, solve real-world problems, move with speed, don't complexify. That's his message. I said to him in my interview, you need a wartime consigliere. He's a big movie buff, so I quoted The Godfather. You don't want to be the Tom Hagen, right? He's not a wartime consigliere. There are times in companies' histories where there's peace and there's wartime. Wartime is the startup trying to find its way. Then you get product-market fit and you're growing and scaling, operating, hiring people to operate. Then you get into a pivot or a competitive situation, and you've got to get out there, get dirty, and reinvent or re-imagine. And then you're back to peace. Having the right personnel is critical. So one of the themes this year is: if you're in the way, get out of the way. Some people want to hold onto the past: that's the way we did it before, I built this system, therefore it has to work this way, otherwise the new way is terrible, we've got to keep the mainframe. So you have kind of an accelerated leadership mantra happening. What's your take on this? >> So if you're going to use mob-related metaphors, I'll share one with you from the final season of The Sopranos, where Tony Soprano is being hit over the head with a bunch of nostalgia from one of his associates, and he goes, "'remember when' is the lowest form of conversation," which is iconic. I think what you're talking about, and what Andy is talking about, is the thing that makes great leadership. What I look for is that when you invest in somebody, or you put somebody in a leadership position to build something, 50% of their experience is really important, and 50% of it is not applicable in the new situation. And the hard leadership initiative is understanding which 50 matters and which 50 doesn't. So I think the issue is, yes, it's lead, follow, or get out of the way, but it's also asking: am I following a pattern for a technology, a market, a customer base, or a set of people I'm managing that doesn't really exist anymore, because the world has moved on? And I think we're going to be in kind of a permanent wartime on some level, because the economy is going to shift. We're going to have other shocks to the economy, and we don't get back to a traditional normal any time soon. That is the posture that leadership in technology really has to adopt. As the first great CEO of Intel reminded us, only the paranoid survive. Some things work and some things don't, and the hard part is how you parse it. So I always like to say that you always have to have a crisis, and if there is no crisis, you create the crisis. >> Sam said, don't let a good crisis go to waste. As a manager, you take advantage of the crisis. >> Yeah.
I mean, look, it wouldn't have been bad to be in the Peloton business this year, right? When people stayed home. And that will fade; people will get back on their bikes and go outside. I'm a cyclist. But a lot more people are going to look at that as an alternative way to exercise, when it's dark or the weather is inclement. So what I think is that you see these things go in waves: they crest, they come back, but they never come back all the way to where they were. As a manager, and as a builder in the technology industry, maybe we won't spend as much time on Zoom a year from now, but we're still going to spend a lot of time on Zoom, and it's still going to be very important. What I would say, for example, looking at the COVID crisis and at my own investments, is that one thing is clear: we're going to get our arms around this virus. But if you look at the history of airborne illnesses, they're accelerating, they're coming every couple of years. So being in a position to react more rapidly, to create vaccines, to run trials more quickly, to use that information to make decisions, and to keep the window short when people are not covered by therapeutics or vaccines, that is going to be really important. That form of resilience and that kind of speed is going to happen again and again in healthcare. And there's going to be increasing pressure on another segment, the food supply. The biggest problem in our food supply today is actually the lack of labor, and farmers have had to repurpose, because they can't just sell to their traditional channels. So you're going to see an increased amount of optimization, automation and mechanization. >> Lauren was on the keynote today talking about how their marketplaces acted as a collective, people working together, given the big ideas. Well, as we end the segment here, let's connect big ideas and democratization. You know the old expression, Silicon Valley, go big or go home. Well, I think now we're at a time where you can actually go big, stay big, and get big at your own pace, because the mantra has been: think big in years, plan in months, execute weekly and daily. There's potentially a management technique there, to leverage cloud and AI to really pursue the big idea. If I'm a manager, whether I'm in public sector or commercial or any vertical industry, I can still have that big idea, that North Star, and then work backwards and figure it out. That sounds like the Amazon way. What's your take on how people should think about executing down that path? Say someone is trying to re-imagine education. Some people I've talked to here in California are looking at it and saying, hey, I don't need silos of students, faculty, alumni and community, I can unify them together. That's an idea; the execution is the hard part, because vendors have been supplying siloed systems to them, and people want to interact online. The Peloton is a great example in health and fitness.
So everyone is out there waiting for this playbook. >> Yeah, unfortunately, if I had the playbook I'd mail it to you. But I think there are a couple of things that are really important to do. Maybe the best place to make a bet is where there's structural change in an industry or a segment. And sorry, there are people here, I'm home today, everybody's running out the door. So we talked about this structural change: the structural change in healthcare, and maybe some of the structural change that's coming to agriculture. There's also a change in people's expectations, in how they're willing to work and what they're willing to do. And, as you pointed out with the traditional silos, since we have so much information at our fingertips, what people are willing to do on their own, as opposed to having products and services delivered to them, has really changed. I think the other thing is that leadership is ultimately the most important aspect. We have built a lot of companies in this industry based on functional backgrounds: I'm a product manager, I'm a salesperson, I'm a CEO, I'm a finance person. What we're starting to see, particularly among early-stage investors, is more whole thinking, where they think less functionally about what people's jobs are and more about what the company is trying to get done and what the market is like. So ultimately most of this comes down to leadership, and that's what people have to do. They have to see themselves as a leader in their company, in the business they're trying to build, not just in their function but in the market they're trying to win. Which means you go out and talk to a lot more people, you take a lot fewer things for granted, you read fewer textbooks on how to build companies, and you spend more time talking to your customers and your engineers. And you start to look at what's enabling: between machine learning, computer vision, and the amount of processing power that's available from things like AWS, including the services you can just click-box in places like the Amazon store, you actually have to be much more expansive in how you think about what you can get done without having to build a lot of things yourself, because it's right there at your fingertips. Hopefully that gets a little bit to what you were asking. >> Well, Alan, it's always great to have you on, great insight, and always a pleasure to talk candidly. Normally we're a little more boisterous, but given how terrible the situation is with COVID and working at home, I'm usually there in person, you've been great. Take a minute to give a plug for the Data Collective venture capital firm, DCVC. You guys have a really unique investment thesis: you're in applied AI, computational biology, computational care, enterprise enablement, geospatial, which is about space, and Capella, which was featured, Carbon Health, smart agriculture, transportation. These are off the beaten path of the traditional herd mentality of venture capital. You guys are going after big problems. Give us an update on the firm. I know the firm has gotten bigger lately.
You guys have grown. >> We have. The firm has gotten bigger, I guess, since Matt and Zach started it about a decade ago. We have about $2.3 billion under management. We also have a bio fund, kind of a sister fund, that's part of that. Obviously we are traditionally an early-stage investor, but we now go much longer with these additional investment funds and the confidence of our LPs. We are looking for, as you said, John, really large, intractable industry problems and transitions. We tend to back very technical founders and work with them very early in the creation of their business. And we have a huge network of some of the leading people in our industry who work with us. It's a little bit of our secret weapon; we call it our equity partner network, and many of them have been on theCUBE. These are people who work with us in the creation of these companies. We've never been more excited, because there's never been more opportunity. You're starting to hear more and more about our companies, and in probably a couple of years we'll be a household name. We're awash in deal flow, and the good news is I think more people want to invest in and build the things that we do, so we're less lonely; people want to do what we're doing. And I think some of the large exits starting to come our way will attract more great entrepreneurs into the space. >> You really saw the big data trend early, and you saw it really make an impact, and I'll say that's front and center at Amazon Web Services re:Invent this year. You guys were early, you're a super important firm, I'm really glad you exist, and you'll soon be a household name if you aren't already. Thanks for coming on, Alan. >> Thanks, John. Thank you. I appreciate it. >> Take care. I'm John Furrier with theCUBE. You're watching re:Invent coverage. This is the live portion of theCUBE's coverage, three weeks, wall to wall. Check out thecube.net, and also go to theCUBE page on the Amazon event page; there's a little click-through at the bottom, and on the main stage there are tons of video on demand and live programming too. Thanks for watching.

Published Date : Dec 9 2020

Vertica Big Data Conference Keynote


 

>> Joy: Welcome to the Virtual Big Data Conference. Vertica is so excited to host this event. I'm Joy King, and I'll be your host for today's Big Data Conference keynote session. It's my honor and my genuine pleasure to lead Vertica's product and go-to-market strategy. And I'm so lucky to have a passionate and committed team who turned our Vertica BDC event into a virtual event in a very short amount of time. I want to thank the thousands of people, and yes, that's our true number, who have registered to attend this virtual event. We were determined to balance your health, safety and your peace of mind with the excitement of the Vertica BDC. This is a very unique event, because, as I hope you all know, we focus on engineering and architecture, best practice sharing and customer stories that will educate and inspire everyone. I also want to thank our top sponsors for the virtual BDC, Arrow and Pure Storage. Our partnerships are so important to us and to everyone in the audience, because together we get things done faster and better. Now for today's keynote, you'll hear from three very important and energizing speakers. First, Colin Mahony, our SVP and General Manager for Vertica, will talk about the market trends that Vertica is betting on to win for our customers, and he'll share the exciting news about our Vertica 10 announcement and how it will benefit our customers. Then you'll hear from Amy Fowler, VP of strategy and solutions for FlashBlade at Pure Storage. Our partnership with Pure Storage is truly unique in the industry, because together, modern infrastructure from Pure powers modern analytics from Vertica. And then you'll hear from John Yovanovich, Director of IT at AT&T, who will tell you about the Pure-Vertica symphony that plays live every day at AT&T. Here we go. Colin, over to you. >> Colin: Well, thanks a lot, Joy. And I want to echo Joy's thanks to our sponsors and to so many of you who have helped make this happen. This is not an easy time for anyone. We were certainly looking forward to getting together in person in Boston during the Vertica Big Data Conference and Winning with Data, but I think all of you and our team have done a great job scrambling and putting together a terrific virtual event, so I really appreciate your time. I also want to remind people that we will make both the slides and the full recording available after this, so for any of those who weren't able to join live, it will still be available. Well, things have been pretty exciting here, and in the analytics space in general. Certainly for Vertica, there's a lot happening. There are a lot of problems to solve, a lot of opportunities to make things better, and a lot of data that can really make every business stronger, more efficient, and frankly, more differentiated. For Vertica, though, we know that focusing on the challenges that we can directly address with our platform and our people, where we can actually make the biggest difference, is where we ought to be putting our energy and our resources. I think one of the things that has made Vertica so strong over the years is our ability to focus on those areas where we can make a great difference. So for us, as we look at the market and where we play, there are really three market trends, some recent and some not so recent but certainly picking up momentum, that have become critical for every industry that wants to win big with data. We've heard this loud and clear from our customers and from the analysts that cover the market.
If I were to summarize these three areas, this really is the core focus for us right now. We know that there's massive data growth, and if we can unify the data silos so that people can really take advantage of that data, we can make a huge difference. We know that public clouds offer tremendous advantages, but we also know that balance and flexibility are critical. And we all need the benefits that machine learning, all the way up to end-to-end data science, can bring to every single use case, but only if it can really be operationalized at scale, accurately and in real time. And the power of Vertica is, of course, how we're able to bring so many of these things together. Let me talk a little bit more about some of these trends. So one of the first industry trends that we've all been following, probably now for over the last decade, is Hadoop, and specifically HDFS. So many companies have invested time, money, and, more importantly, people in leveraging the opportunity that HDFS brought to the market. HDFS is really part of a much broader storage disruption that we'll talk a little bit more about. But HDFS itself was really designed for petabytes of data, leveraging low-cost commodity hardware and the ability to capture a wide variety of data formats from a wide variety of data sources and applications. And I think what people really wanted was to store that data before having to define exactly what structures it should go into. So over the last decade or so, the focus for most organizations has been figuring out how to capture, store and, frankly, manage that data. And as a platform to do that, I think Hadoop was pretty good. It certainly changed the way that a lot of enterprises think about their data and where it's locked up. In parallel with Hadoop, particularly over the last five years, cloud object storage has also given every organization another option for collecting, storing and managing even more data. That has led to a huge growth in data storage, obviously, up on public clouds like Amazon and their S3, Google Cloud Storage and Azure Blob Storage, just to name a few. And then when you consider regional and local object storage offered by cloud vendors all over the world, the explosion of data leveraging this type of object storage is very real. And I think, as I mentioned, it's just part of this broader storage disruption that's been going on. But with all this growth in the data, and all these new places to put this data, every organization we talk to is facing even more challenges now around the data silos. Sure, the data silos are certainly getting bigger, and hopefully they're getting cheaper per bit. But as I said, the focus has really been on collecting, storing and managing the data. Between the new data lakes and the many different cloud object stores, combined with all sorts of data types and the complexity of managing all this, the business value actually extracted has been very limited. This takes me to big bet number one for Team Vertica, which is to unify the data. Our goal, in some of the announcements we have made today plus the roadmap announcements I'll share with you throughout this presentation, is to ensure that all the time, money and effort that has gone into storing that data turns into business value. So how are we going to do that?
With a unified analytics platform that analyzes the data wherever it is, HDFS, cloud object storage, external tables in any format, ORC, Parquet, JSON, and of course our own native ROS Vertica format. Analyze the data in the right place, in the right format, using a single unified tool. This is something that Vertica has always been committed to, and you'll see in some of our announcements today that we're just doubling down on that commitment. Let's talk a little bit more about the public cloud. This is certainly the second trend; it's maybe the second wave of data disruption, along with object storage. And there are a lot of advantages when it comes to public cloud. There's no question that the public clouds give rapid access to compute and storage, with the added benefit of eliminating the data center maintenance that so many companies want to get out of themselves. But maybe the biggest advantage that I see is the architectural innovation. The public clouds have introduced so many methodologies around how to provision quickly, separating compute and storage and really dialing in the exact needs on demand as you change workloads. When public clouds began, it made a lot of sense for the cloud providers and their customers to charge and pay for compute and storage in the ratio that each use case demanded. And I think you're seeing that trend proliferate all over the place, not just up in the public cloud. That architecture itself is really becoming the next-generation architecture for on-premise data centers as well. But there are a lot of concerns, and I think we're all aware of them. Many times, for different workloads, there are higher costs, especially for workloads like analytics, which tend to run all the time. Just like some of the silo challenges that companies are facing with HDFS, data lakes and cloud storage, the public clouds have similar types of silo challenges as well. Initially, there was a belief that they were cheaper than data centers, and when you added in all the costs, it looked that way. And again, for certain elastic workloads, that is the case; I don't think it's true across the board overall, even to the point where a lot of the cloud vendors aren't just charging lower costs anymore. We hear from a lot of customers that they don't really want to tether themselves to any one cloud because of some of those uncertainties. Of course, security and privacy are a concern. We hear a lot of concerns with regards to cloud, and even some SaaS vendors, around data catalogs shared across all the customers with not enough separation. But security concerns are out there, you can read about them; I'm not going to jump on that bandwagon. And then, of course, one of the things we hear the most from our customers is that each cloud stack is starting to feel a lot more locked-in than the traditional data warehouse appliance. And as everybody knows, the industry has been running away from appliances as fast as it can, so they're not eager to get locked into another, quote, unquote, virtual appliance, if you will, up in the cloud. They really want to make sure they have flexibility in which clouds they're going to today, tomorrow and in the future. And frankly, we hear from a lot of our customers that they're very interested in eventually mixing and matching compute from one cloud with, say, storage from another cloud, which I think is something that we'll hear a lot more about.
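[Editor's note: to make the "analyze the data in the right place" idea at the top of this passage concrete, here is a hedged sketch using the open source vertica-python client to define an external table over Parquet files in object storage and query them in place. The connection details, bucket path, and columns are hypothetical; treat it as an illustration of the pattern, not a verified recipe for any particular cluster.]

```python
import vertica_python

# Hypothetical connection details for a Vertica cluster.
conn_info = {
    "host": "vertica.example.com",
    "port": 5433,
    "user": "dbadmin",
    "password": "...",
    "database": "analytics",
}

with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    # External table over Parquet files sitting in S3: the data stays
    # in object storage and is read in place at query time.
    cur.execute("""
        CREATE EXTERNAL TABLE web_clicks (
            user_id INT,
            url     VARCHAR(2048),
            ts      TIMESTAMP
        ) AS COPY FROM 's3://my-bucket/clickstream/*.parquet' PARQUET
    """)
    # Query it with ordinary SQL, exactly as if it were a native table.
    cur.execute("""
        SELECT COUNT(*) FROM web_clicks
        WHERE ts > NOW() - INTERVAL '1 day'
    """)
    print(cur.fetchone()[0])
```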
And so for us, that's why we've got our big bet number two. We love the cloud. We love the public cloud. We love the private clouds on-premise, and other hosting providers. But our passion and commitment is for Vertica to be able to run in any of the clouds that our customers choose, and to make it portable across those clouds. We have supported on-premises and all public clouds for years. And today, we have announced even more support for Vertica in Eon Mode, the deployment option that leverages the separation of compute from storage, with even more deployment choices, which I'm also going to touch on more as we go. So, super excited about our big bet number two. And finally, as I mentioned, for all the hype that there is around machine learning, I actually think that, most importantly, this third trend that Team Vertica is determined to address is the need to bring business-critical analytics, machine learning and data science projects into production. For so many years, there just wasn't enough data available to justify the investment in machine learning. Also, processing power was expensive, and storage was prohibitively expensive, so to train and score and evaluate all the different models to unlock the full power of predictive analytics was tough. Today you have those massive data volumes, and you have the relatively cheap processing power and storage to make that dream a reality. And if you think about this, with all the data that's available to every company, the real need is to operationalize the speed and the scale of machine learning so that these organizations can actually take advantage of it where they need to. We've seen this for years with Vertica, going back to some of the most advanced gaming companies in the early days, who were incorporating this with live data directly into their gaming experiences. Well, every organization wants to do that now, and accuracy, predictability and real-time action are all key to separating the leaders from the rest of the pack in every industry when it comes to machine learning. But if you look at a lot of these projects, the reality is that there's a ton of buzz, a ton of hype spanning every acronym that you can imagine, but most companies are struggling, due to separate teams, different tools, silos, and the limitations that many platforms are facing: forcing down-sampling to a small subset of the data to try to create a model that then doesn't apply, or compromising accuracy and making it virtually impossible to replicate models and understand decisions. And if there's one thing that we've learned when it comes to data, it's the value of prescriptive data at the atomic level, being able to serve the N of 1, as we refer to it, meaning individually tailored data. No matter what it is, healthcare, entertainment experiences like gaming, or others, being able to get at the granular data and make these decisions applies to machine learning scoring just as much as it applies to giving somebody a next-best-offer. The opportunity has never been greater, and the need is to integrate this end-to-end workflow and support the right tools without compromising on that accuracy. Think of it as no down-sampling, using all the data; that really is key to machine learning success. Which is why it should be no surprise that the third big bet from Vertica is one that we've actually been working on for years. And we're so proud to be where we are today, helping the data disruptors across the world operationalize machine learning.
This big bet has the potential to truly unlock the potential of machine learning. And today we're announcing some very important new capabilities, specifically focused on unifying the work being done by the data science community, with their preferred tools and platforms, with the volume of data and performance at scale available in Vertica. Our strategy has been very consistent over the last several years. As I said in the beginning, we haven't deviated from our strategy. Of course, there are always things that we add. Most of the time it's customer driven, based on what our customers are asking us to do. But I think we've also done a great job of not trying to be all things to all people. Especially as these hype cycles flare up around us, we absolutely love participating in these different areas without getting completely distracted. I mean, there's a variety of query tools and data warehouses and analytics platforms in the market; we all know that. There are tools and platforms offered by the public cloud vendors, and by other vendors that support one or two specific clouds. There are appliance vendors, who I was referring to earlier, who can deliver packaged data warehouse offerings for private data centers. And there's a ton of popular machine learning tools, languages and other kits. But Vertica is the only advanced analytics platform that can do all of this, that can bring it together. We can analyze the data wherever it is: in HDFS, S3 object storage, or Vertica itself. Natively, we support multiple clouds and on-premise deployments. And maybe most importantly, we offer that choice of deployment modes to allow our customers to choose the architecture that works for them right now, while still giving them the option to change, move and evolve over time. And Vertica is the only analytics database with end-to-end machine learning that can truly operationalize ML at scale. I know it's a mouthful, but it is not easy to do all these things. It is one of the things that highly differentiates Vertica from the rest of the pack. It is also why our customers, all of you, continue to bet on us and see the value that we are delivering and will continue to deliver. Here are a couple of examples of some of our customers who are powered by Vertica. It's the scale of data. It's the millisecond response times. Performance and scale have always been a huge part of what we are about, though not the only thing; there's also the functionality, all the capabilities that we add to the platform, the ease of use, and the flexibility, obviously, with deployment. But look at some of the numbers under these customers on this slide. I've shared a lot of different stories about these customers, and by the way, it still amazes me every time I talk to one and get the updates; you can see the power and the difference that Vertica is making. Equally important, if you look at a lot of these customers, they are the epitome of being able to deploy Vertica in a lot of different environments. Many of the customers on this slide are not using Vertica just on-premise or just in the cloud. They're using it in a hybrid way. They're using it in multiple different clouds. And again, we've been with them on that journey throughout, which is what has made this product, and frankly our roadmap and our vision, exactly what it is. It's been quite a journey, and that journey continues now with the Vertica 10 release. The Vertica 10 release is obviously a massive release for us.
But if you look back, you can see that we are building on that native columnar architecture that started a long time ago, obviously, with the C-Store paper. We built it to leverage commodity hardware, because it was an architecture that was never tightly integrated with any specific underlying infrastructure. I still remember hearing the initial pitch from Mike Stonebraker about the vision of Vertica as a software-only solution and the importance of separating the company from hardware innovation. At the time, Mike basically said to me, "there's so much R&D and innovation that's going to happen in hardware, we shouldn't bake hardware into our solution. We should do it in software, and we'll be able to take advantage of that hardware." And that is exactly what has happened. One of the most recent hardware innovations that we embraced is certainly that separation of compute and storage. As I said previously, the public cloud providers offered this next-generation architecture, really to ensure that they can provide customers exactly what they need, more compute or more storage, and charge for each, respectively. The separation of compute from storage is a major milestone in data center architectures. If you think about it, it's really not only a public cloud innovation, though. It fundamentally redefines the next-generation data architecture for on-premise, and for pretty much every way people are thinking about computing today. And that goes for software too. Object storage is an example of a cost-effective means of storing data. And even more importantly, separating compute from storage for analytic workloads has a lot of advantages, including the opportunity to manage much more dynamic, flexible workloads and, more importantly, to truly isolate those workloads from others. And by the way, once you have something that can truly isolate workloads, then you can have the conversations around autonomic computing: setting up some nodes, some compute resources, on the data that won't affect any of the other workloads, to do some things on their own, maybe some self-analytics by the system, et cetera. A lot of things that, as many of you know, we've already been exploring in terms of our own system data in the product. It was May 2018, believe it or not, it seems like a long time ago, when we first announced Eon Mode. And I want to make something very clear about Eon Mode: it's a mode, a deployment option, for Vertica customers. I think this is another huge benefit that we don't talk about enough. Unlike a lot of vendors in the market who will ding you and charge you for every single add-on, you name it, you get this with the Vertica product. If you continue to pay support and maintenance, this comes with the upgrade, as part of the new release. So any customer who owns or buys Vertica has the ability to set up either Enterprise Mode or Eon Mode, which is a question I know comes up sometimes. Our first announcement of Eon was obviously for AWS customers, including The Trade Desk and AT&T, most of whom will be speaking here later at the Virtual Big Data Conference. They saw a huge opportunity. Eon Mode not only allowed Vertica to scale elastically with the specific compute and storage that was needed, but it really dramatically simplified database operations, including things like workload balancing, node recovery, compute provisioning, et cetera.
So one of the most popular functions is that ability to isolate the workloads and really allocate those resources without negatively affecting others. And even though traditional data warehouses, including Vertica in Enterprise Mode, have been able to do lots of different kinds of workload isolation, it's never been as strong as in Eon Mode. Well, it certainly didn't take long for our customers to see that value across the board with Eon Mode, and not just up in the cloud. In partnership with one of our most valued partners and a platinum sponsor here, whom Joy mentioned at the beginning, we announced Vertica in Eon Mode for Pure Storage FlashBlade in September 2019. And again, just to be clear, this is not a new product; it's one Vertica, with yet more deployment options. With Pure Storage, Vertica in Eon Mode is not limited in any way by variable cloud network latency. The performance is actually amazing when you take the benefits of separating compute from storage and you run it with a Pure environment on-premise. Vertica in Eon Mode has a super smart cache layer that we call the depot. It's a big part of our secret sauce around Eon Mode. And combined with the power and performance of Pure's FlashBlade, Vertica became the industry's first advanced analytics platform that actually separates compute and storage for on-premises data centers, something that a lot of our customers are already benefiting from, and we're super excited about it. But as I said, this is a journey. We're not going to stop. Our customers need the flexibility of multiple public clouds. So today, with Vertica 10, we're super proud and excited to announce support for Vertica in Eon Mode on Google Cloud. This gives our customers the ability to use their Vertica licenses on Amazon AWS, on-premise with Pure Storage, and on Google Cloud. Now, we were talking about HDFS, and a lot of our customers who have invested quite a bit in HDFS as a place to store data have been pushing us to support Eon Mode with HDFS. So as part of Vertica 10, we are also announcing support for Vertica in Eon Mode using HDFS as the communal storage. Vertica's own ROS format data can be stored in HDFS, and the full functionality of Vertica, its complete analytics, geospatial, pattern matching, time series, machine learning, everything that we have in there, can be applied to this data. And on the same HDFS nodes, Vertica can also analyze data in ORC or Parquet format using external tables. We can also execute joins between the ROS data and the data the external tables hold, which powers a much more comprehensive view. So again, it's that flexibility to be able to support our customers wherever they need us to support them, on whatever platform they have. Vertica 10 gives us a lot more ways to deploy Eon Mode in various environments for our customers. It allows them to take advantage of Vertica in Eon Mode, and the power that it brings with that separation and that workload isolation, on whichever platform they are most comfortable with. Now, there's a lot that has come in Vertica 10, and I'm definitely not going to be able to cover everything, but we also introduced complex types, as an example. Complex data types fit very well into Eon and this separation as well. They significantly simplify the data pipeline, reducing the cost of moving data around, and provide much better support for the unstructured data that a lot of our customers mix with structured data, of course, while leveraging a lot of the columnar execution that Vertica provides.
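[Editor's note: to make the join-across-formats point above concrete, here is a small hedged sketch: an external table over Parquet files in HDFS joined directly against a Vertica-managed (ROS) table. The table names, columns, and HDFS path are hypothetical, and it assumes the same vertica-python connection pattern as the earlier sketch.]

```python
import vertica_python

with vertica_python.connect(host="vertica.example.com", port=5433,
                            user="dbadmin", password="...",
                            database="analytics") as conn:
    cur = conn.cursor()
    # External table over Parquet files that live in HDFS, read in place.
    cur.execute("""
        CREATE EXTERNAL TABLE hdfs_events (
            device_id INT,
            event     VARCHAR(64),
            ts        TIMESTAMP
        ) AS COPY FROM 'hdfs:///data/events/*.parquet' PARQUET
    """)
    # Join it directly against a native Vertica-managed (ROS) table,
    # giving one unified view over both storage formats.
    cur.execute("""
        SELECT d.region, COUNT(*) AS events
        FROM hdfs_events e
        JOIN devices d ON d.device_id = e.device_id
        GROUP BY d.region
        ORDER BY events DESC
    """)
    for region, events in cur.fetchall():
        print(region, events)
```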
So you get complex data types in Vertica now, a lot more data, stronger performance. It goes great with the broader Eon Mode announcement that we made. Let's talk a little bit more about machine learning. We've actually been doing work in and around machine learning, with various regressions and a whole bunch of other algorithms, for several years. We saw the huge advantage that MPP offered, not just as a SQL engine, as a database, but for ML as well. And it didn't take long to realize that there's a lot more to operationalizing machine learning than just those algorithms. It's data preparation, it's model training, it's the scoring, the shaping, the evaluation. That is so much of what machine learning, and frankly, data science, is about. You know, everybody always wants to jump to the sexy algorithms, but we handle those other tasks very, very well, which makes Vertica a terrific platform to do this. A lot of work in data science and machine learning is done in other tools; I had mentioned that there are just so many tools out there, and we want people to be able to take advantage of all of them. We never believed we were going to be the best algorithm company or come up with the best models for people to use. So with Vertica 10, we support PMML. We can now import and export PMML models. It's a huge step for us in operationalizing machine learning projects for our customers, allowing the models to be built outside of Vertica, yet imported in and then applied to the full scale of data with all the performance that you would expect from Vertica. We are also integrating more tightly with Python. As many of you know, we've been doing a lot of open source projects with the community, driven by many of our customers, like Uber. And now, with Python, we've integrated with TensorFlow, allowing data scientists to build models in their preferred language and take advantage of TensorFlow, but again, to store and deploy those models at scale with Vertica. I think both of these announcements are proof of our big bet number three, and really our commitment to supporting innovation throughout the community, by operationalizing ML with the accuracy, performance and scale of Vertica for our customers. Again, there are a lot of steps when it comes to the workflow of machine learning. These are some of them that you can see on the slide, and it's definitely not linear, either; we see this as a circle. Companies that do it well just continue to learn, they continue to rescore, they continue to redeploy, and they want to operationalize all of that within a single platform that can take advantage of all those capabilities. And that is the platform, with a very robust ecosystem, that Vertica has always been committed to as an organization and will continue to be. This graphic, many of you have seen it evolve over the years; frankly, if we put everything and everyone on here, it wouldn't fit on a slide. But it will absolutely continue to evolve and grow as we support our customers where they need the support most. So, again, being able to deploy everywhere, being able to take advantage of Vertica not just as a business analyst or a business user, but as a data scientist or as an operations or BI person. We want Vertica to be leveraged and used by the broader organization. So I think it's fair to say, and I encourage everybody to learn more about Vertica 10, because I'm just highlighting some of the bigger aspects of it. But we talked about those three market trends.
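[Editor's note: Vertica's documentation describes the PMML support mentioned above through SQL functions; the sketch below shows the general shape of importing an externally trained PMML model and scoring with it from Python. The function names IMPORT_MODELS and PREDICT_PMML follow the Vertica 10 ML API as I understand it, but the model file, table, and columns are hypothetical, so treat this as a hedged illustration rather than a verified recipe.]

```python
import vertica_python

with vertica_python.connect(host="vertica.example.com", port=5433,
                            user="dbadmin", password="...",
                            database="analytics") as conn:
    cur = conn.cursor()
    # Import a model trained elsewhere (e.g., scikit-learn exported to PMML).
    cur.execute("""
        SELECT IMPORT_MODELS('/home/dbadmin/churn_model.pmml'
                             USING PARAMETERS category = 'PMML')
    """)
    # Score inside the database, against all the data: no down-sampling,
    # no data movement out to a separate scoring environment.
    cur.execute("""
        SELECT customer_id,
               PREDICT_PMML(tenure, monthly_spend, support_calls
                            USING PARAMETERS model_name = 'churn_model')
                   AS churn_risk
        FROM customers
    """)
    print(cur.fetchmany(5))
```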
So I think it's fair to say, and I encourage everybody to learn more about Vertica 10 because I'm just highlighting some of the bigger aspects of it, that we talked about three market trends: the need to unify the silos, the need for hybrid and multi-cloud deployment options, and the need to operationalize business-critical machine learning projects. Vertica 10 has absolutely delivered on those. But again, we are not going to stop. It is our job not to, and this is how Team Vertica thrives. I always joke that the next release is the best release, and of course, even after Vertica 10, that is still true, although Vertica 10 is pretty awesome. From the first line of code, we've always been focused on performance and scale, and like any really strong data platform, the optimizer and the execution engine are the two core pieces of that. Beyond Vertica 10, one of the big things we're already working on is a next-generation execution engine, and we're already seeing incredible early performance from it. This is just one example of how important it is for an organization like Vertica to constantly go back and re-innovate. Every single release, we do the sit-ups and crunches on our performance and scale: how do we improve? There are so many parts of the core server, and so many parts of our broader ecosystem. We are constantly looking at how we can go back to all the code lines that we have and make them better in the current environment. And that's not an easy thing to do when, at the same time, you're expanding to take advantage of all the different deployments, which is a great segue to this slide. Because if you think about today, we're obviously already available with Eon Mode on Amazon AWS, on Pure, and actually on MinIO as well. As I talked about, in Vertica 10 we're adding Google and HDFS. And coming next, obviously, Microsoft Azure and Alibaba Cloud. Being able to expand into more of these environments is really important for the Vertica team and how we go forward. And it's not just about running in these clouds; for us, we want it to be a SaaS-like experience in all these clouds. We want you to be able to deploy Vertica in 15 minutes or less on these clouds. You can also consume Vertica in a lot of different ways on these clouds, as an example, on Amazon, Vertica by the Hour. So for us, it's not just about running; it's about taking advantage of the ecosystems that all these cloud providers offer, and really optimizing the Vertica experience as part of them. Optimization around automation, around self-service capabilities, extending our management console. We now have products like the Vertica Advisor Tool, which our Customer Success Team created to use our own smarts in Vertica: to take data that customers give to us and help them automatically tune their environments. You can imagine we're taking that to the next level in a lot of different endeavors around how Vertica as a product can be smarter, because we all know that simplicity is key. There just aren't enough people in the world who are good at managing data and taking it to the next level. And of course, there are other things we all hear about, whether it's Kubernetes or containerization, and you can imagine those work very well with Eon Mode and the separation of compute and storage. But innovation happens everywhere. We innovate around our community and documentation. Many of you have taken advantage of the Vertica Academy, and the numbers there are through the roof in terms of the number of people coming in and certifying on it.
So there's a lot within the core products, and a lot of activity and action beyond the core products that we're taking advantage of. And let's not forget why we're here, right? It's easy to talk about a data platform; it's easy to jump into all the functionality, the analytics, the flexibility, how we can offer it. But at the end of the day, somebody, a person, has got to take advantage of this data; she's got to be able to take this data and use this information to make a critical business decision. And that doesn't happen unless we explore lots of different and, frankly, new ways to get that predictive analytics UI and interface, beyond just the standard BI tools, in front of her at the right time. There's a lot of activity, I'll tease you with that, going on in this organization right now about how we can do that and deliver it for our customers. We're in a great position, because we can see exactly how this data is consumed and used, and we can start from this core platform that we have and go out from there. Look, I know the plan wasn't to do this as a virtual BDC, but I really appreciate you tuning in, and I really appreciate your support. I think if there's any silver lining to us maybe not being able to do this in person, it's the fact that the reach has actually gone significantly higher than what we would have been able to do in person in Boston. We're certainly looking forward to doing a Big Data Conference in person in the future. But if I could leave you with anything, know this: since that first release of Vertica, and our very first customers, we have been very consistent. We respect all the innovation around us, whether it's open source or not. We understand the market trends. We embrace new ideas and technologies, and for us, true north, the most important thing, is: what does our customer need to do? What problem are they trying to solve? And how do we use the advantages that we have without disrupting our customers, knowing that you depend on us to deliver that unified analytics strategy and that performance at scale, not only today, but tomorrow and for years to come? We've added a lot of great features to Vertica, and frankly, I think we've said no to a lot of things that we just knew we wouldn't be the best company to deliver. When we say we're going to do things, we do them. Vertica 10 is a perfect example of so many of those things that we have heard loud and clear from you, our customers, and we have delivered. I am incredibly proud of this team across the board. I think the culture of Vertica, a customer-first culture, jumping in to help our customers win no matter what, is also something that sets us massively apart. I hear horror stories about support experiences with other organizations, and people always seem to be amazed at Team Vertica's willingness to jump in, their aptitude for certain technical capabilities, and their understanding of the business. I think sometimes we take that for granted, but that is the team we have as Team Vertica. We are incredibly excited about Vertica 10, and I think you're going to love the Virtual Big Data Conference this year. I encourage you to tune in. Maybe one other benefit: I know some people were worried about not being able to see different sessions because they were going to overlap with each other. Well, now, even if you can't attend a session live, you'll be able to watch it on demand. Please enjoy the Vertica Big Data Conference here in 2020.
Please, you and your families and your co-workers, be safe during these times. I know we will get through it, and analytics is probably going to help with a lot of that; we already know it is helping in many different ways. So believe in the data, believe in data's ability to change the world for the better, and thank you for your time. And with that, I am delighted to now introduce Micro Focus CEO Stephen Murdoch to the Vertica Big Data Virtual Conference. Thank you, Stephen. >> Stephen: Hi, everyone, my name is Stephen Murdoch. I have the pleasure and privilege of being the Chief Executive Officer here at Micro Focus. Please let me add my welcome to the Big Data Conference, and also my thanks for your support as we've had to pivot to this being a virtual rather than a physical conference. It's amazing how quickly we all reset to a new normal; I certainly didn't expect to be addressing you from my study. Vertica is an incredibly important part of the Micro Focus family. It is key to our goal of enabling and helping customers become much more data-driven across all of their IT operations. Vertica 10 is a huge step forward, we believe. It allows for multi-cloud innovation and genuinely hybrid deployments, it lets enterprises begin to leverage machine learning properly, and it also opens the opportunity to unify currently siloed lakes of information. We operate in a very noisy, very competitive market, and there are people in that market who can do some of those things. The reason we are so excited about Vertica is that we genuinely believe we are the best at doing all of those things. And that's why we have announced publicly, and are executing internally, incremental investment into Vertica. That investment is targeted at accelerating the roadmaps that already exist and getting that innovation into your hands faster. The idea is that speed is key. It's not a question of if companies have to become data-driven organizations; it's a question of when. So that speed now is really important, and that's why we believe the Big Data Conference is a great opportunity for you to accelerate your own plans. You will have the opportunity to talk to some of our best architects and some of the best development brains that we have. But more importantly, you'll also get to hear from some of our phenomenal customers. You'll hear from Uber, from The Trade Desk, from Philips, and from AT&T, as well as many, many others. And just hearing how those customers are using the power of Vertica to accelerate their own plans is, I think, the highlight. I encourage you to use this opportunity to the fullest. Let me close by again saying thank you. We genuinely hope you get as much from this virtual conference as you could have from a physical one, and we look forward to your engagement and to hearing your feedback. With that, thank you very much. >> Joy: Thank you so much, Stephen, for joining us for the Vertica Big Data Conference. Your support and enthusiasm for Vertica are so clear, and they make a big difference. Now, I'm delighted to introduce Amy Fowler, the VP of Strategy and Solutions for FlashBlade at Pure Storage, which is one of our BDC platinum sponsors and one of our most valued partners. It was a proud moment for me when we announced Vertica in Eon Mode for Pure Storage FlashBlade and we became the first analytics data warehouse that separates compute from storage for on-premise data centers. Thank you so much, Amy, for joining us. Let's get started.
>> Amy: Well, thank you, Joy, so much for having us, and thank you all for joining us today, virtually, as we may all be. As we just heard from Colin Mahony, there are some really interesting trends happening right now in the big data analytics market: the end of the Hadoop hype cycle, the new cloud reality, and the opportunity to help the many data science and machine learning projects move from labs to production. So let's talk about these trends in the context of infrastructure, and in particular, look at why a modern storage platform is relevant as organizations take on the challenges and opportunities associated with these trends. The Hadoop hype cycle left a lot of data in HDFS data lakes, or reservoirs, or swamps, depending upon the level of the data hygiene, but without the ability to get the value that was promised from Hadoop as a platform rather than as a distributed file store. And when we combine that data with the massive volume of data in cloud object storage, we find ourselves with a lot of data and a lot of silos, but without a way to unify that data and find value in it. Now, when you look at the infrastructure data lakes are traditionally built on, it is often direct-attached storage, or DAS. The approach that Hadoop took when it entered the market was primarily bound by the limits of the networking and storage technologies of the day: one-gig Ethernet and slow spinning disk. But today, those barriers do not exist. All-flash storage has fundamentally transformed how data is accessed, managed, and leveraged. The need for local data storage for significant volumes of data has been largely mitigated by the performance increases afforded by all-flash. At the same time, organizations can achieve superior economies of scale with the disaggregation of compute and storage, because compute and storage don't always scale in lockstep. Would you want to add an engine to the train every time you add another boxcar? Probably not. From a Pure Storage perspective, FlashBlade is uniquely architected to allow customers to achieve better resource utilization for compute and storage, while at the same time reducing the complexity that has arisen from the siloed nature of the original big data solutions. The second, and equally important, recent trend we see is something I'll call the cloud reality. The public clouds made a lot of promises, and some of those promises were delivered. But cloud economics, especially usage-based and elastic scaling without the control that many companies need to manage the financial impact, is causing a lot of issues. In addition, the risk of vendor lock-in, from data egress charges to integrated software stacks that can't be moved or deployed on-premise, is causing a lot of organizations to back off an all-in cloud strategy and move toward hybrid deployments. Which is kind of funny, in a way, because it wasn't that long ago that there was a lot of talk about no more data centers. For example, one large retailer, I won't name them, but I'll admit they are my favorite, told us several years ago that they were completely done with on-prem storage infrastructure because they were going 100% to the cloud. But they just deployed FlashBlade for their data pipelines, because they need predictable performance at scale, and the all-cloud TCO just didn't add up. Now, that being said, while there are certainly challenges with the public cloud, it has also brought some things to the table that we see most organizations wanting.
First of all, in a lot of cases, applications have been built to leverage object storage platforms like S3, so they need that object protocol, but they may also need it to be fast. And fast object may have been an oxymoron only a few years ago; this is an area of the market where Pure and FlashBlade have really taken a leadership position. Second, regardless of where the data is physically stored, organizations want the best elements of a cloud experience, and for us, that means two main things. Number one is simplicity and ease of use: if you need a bunch of storage experts to run the system, that should be considered a bug. The other big one is the consumption model: the ability to pay for what you need when you need it, and to seamlessly grow your environment over time, totally nondisruptively. This is actually pretty huge, and something that a lot of vendors try to solve for with finance programs. But no finance program can address the pain of a forklift upgrade when you need to move to next-gen hardware. To scale nondisruptively over long periods of time, five to 10 years plus, crucial architectural decisions need to be made at the outset. Plus, you need the ability to pay as you use it, and we offer something for FlashBlade called Pure as a Service, which delivers exactly that. The third cloud characteristic that many organizations want is the option for hybrid, even if that is just a DR site in the cloud; in our case, that means supporting replication to S3 at AWS. And the final trend, which to me represents the biggest opportunity for all of us, is the need to help the many data science and machine learning projects move from labs to production. This means bringing all the machine learning functions and model training to the data, rather than moving samples or segments of data to separate platforms. As we all know, machine learning needs a ton of data for accuracy, and there is just too much data to retrieve from the cloud for every training job. At the same time, predictive analytics without accuracy is not going to deliver the business advantage that everyone is seeking. You can visualize data analytics, as it is traditionally deployed, as being on a continuum, with the thing we've been doing the longest, data warehousing, on one end, and AI on the other end. But the way this manifests in most environments is a series of silos that get built up, so data is duplicated across all kinds of bespoke analytics and AI environments and infrastructure. This creates an expensive and complex environment. Historically, there was no other way to do it, because some level of performance is always table stakes, and each of these parts of the data pipeline has a different workload profile. A single platform that could deliver the multidimensional performance this diverse set of applications requires didn't exist three years ago, and that's why the application vendors pointed you toward bespoke things like the DAS environments we talked about earlier. The fact that better options exist today is why we're seeing them move toward supporting this disaggregation of compute and storage. And when it comes to a platform that is a better option, one with a modern architecture that can address the diverse performance requirements of this continuum and allow organizations to bring the model to the data instead of creating separate silos, that's exactly what FlashBlade is built for: small files, large files, high throughput, low latency, and scale to petabytes in a single namespace.
And importantly, that single namespace is what we're focused on delivering for our customers. At Pure, we talk about it in the context of the modern data experience, because at the end of the day, that's what it's really all about: the experience for the teams in your organization. And together, Pure Storage and Vertica have delivered that experience to a wide range of customers: from a SaaS analytics company, which uses Vertica on FlashBlade to authenticate the quality of digital media in real time, to a multinational car company, which uses Vertica on FlashBlade to make thousands of decisions per second for autonomous cars, to a healthcare organization, which uses Vertica on FlashBlade to enable healthcare providers to make real-time decisions that impact lives. And I'm sure you're all looking forward to hearing from John Yovanovich from AT&T, who is coming up soon, about how he's been doing this with Vertica and FlashBlade as well. We have been really excited to build this partnership with Vertica, and we're proud to provide the only on-premise storage platform validated with Vertica Eon Mode, and to deliver this modern data experience to our customers together. Thank you all so much for joining us today. >> Joy: Amy, thank you so much for your time and your insights. Modern infrastructure is key to modern analytics, especially as organizations leverage next-generation data center architectures and object storage for their on-premise data centers. Now, I'm delighted to introduce our last speaker in our Vertica Big Data Conference keynote, John Yovanovich, Director of IT for AT&T. Vertica is so proud to serve AT&T, and especially proud of the harmonious impact we are having in partnership with Pure Storage. John, welcome to the Virtual Vertica BDC. >> John: Thank you, Joy. It's a pleasure to be here, and I'm excited to go through this presentation today, and in a unique fashion, because as I was thinking through how I wanted to present the partnership we have formed between Pure Storage, Vertica, and AT&T, I wanted to emphasize how well we all work together and how these three components have really driven home my desire for a harmonious, to use your word, relationship. So I'm going to move forward here. The theme of today's presentation is the Pure-Vertica Symphony, live at AT&T. And if anybody is a Westworld fan, you can appreciate the sheet music on the right-hand side. What I'm going to highlight here, in a musical fashion, is how we at AT&T leverage these technologies to save money, to deliver a more efficient platform, and to make our customers happier overall. So as we look back, as early as just a few years ago here at AT&T, I realized that we had many musicians to help the company. You might want to call them data scientists or data analysts, but for the theme, we'll stay with musicians. None of them were singing or playing from the same hymn book or sheet music. What we had was many organizations chasing a similar dream, but not exactly the same dream. And the best way to describe that, and I think this might resonate with a lot of your organizations, is: how many organizations are chasing a customer-360 view in your company? Well, I can tell you that I have at least four in my company, and I'm sure there are many that I don't know of. That is our problem, because what we see is a repetitive sourcing of data. We see a repetitive copying of data.
And there's just so much money being spent. This is where I asked Pure Storage and Vertica to help me solve that problem with their technologies. What I also noticed was that there was no coordination between these departments. In fact, if you look here, nobody really wants to play with finance. Sales, marketing, and care, sure, they all copied each other's data, but they didn't actually communicate with each other as they were copying the data. So the data became replicated and out of sync. This is a challenge throughout, not just my company, but all companies across the world: the more we replicate the data, the more problems we have in chasing, or conquering, the goal of a single version of truth. In fact, I kid that at AT&T we have actually adopted a multiple-versions-of-truth theory, which is not where we want to be, but it is where we are. We are conquering that with the synergies between Pure Storage and Vertica. This is what it leaves us with, and this is where we are challenged: each one of our siloed business units had their own dedicated storage, and some of them had more money than others, so they bought more storage. Some of them anticipated storing more data than they really did; others are running out of space but can't buy any more because their budgets aren't being replenished. So if you look at it from this side view here, we have a limited amount of compute, fixed compute, dedicated to each one of these silos, because of that desire to own your own. And the other part is that you are either limited or wasting space, depending on where you are in the organization. So the synergies aren't just about the data, but about the compute and the storage as well, and I wanted to tackle that challenge too. I was tackling the data, I was tackling the storage, and I was tackling the compute, all at the same time. So my ask across the company was: can we all please just play together? To do that, I knew I wasn't going to get everybody in the same room and get them to agree that we needed one account table, because they would argue about whose account table is the best account table. But I knew that if I brought the account tables together, they would soon see that they had so much redundancy that I could start retiring data sources. I also knew that if I brought all the compute together, they would all be happy, but I didn't want them stepping on each other. In fact, that was one of the things all the business units really enjoy: the silo of having their own compute and, more or less, being able to control their own destiny. Well, Vertica's subclustering allows just that. This is exactly what I was hoping for, and I'm glad they came through. And finally, how did I solve the problem of the single account table? Well, you can, when you no longer have dedicated storage and you can separate compute and storage, as Vertica in Eon Mode does, and we store the data on FlashBlades, which you see on the left- and right-hand sides of our container, which I can describe in a moment. So what we have here is a container full of compute, with all the Vertica nodes sitting in the middle, and two loader subclusters, as we'll call them, sitting on the sides, which are dedicated to just putting data onto the FlashBlades, which sit on both ends of the container.
Now, today I have two storage racks, dedicated, or common, dedicated might not be the right word, one on the left and one on the right, and I treat them as separate storage racks. They could be one, but I created them separately for disaster recovery purposes, so things keep working in case one rack were to go down. That being said, there's no reason why I won't add a couple more here in the future, so I can have, say, a five-to-10-petabyte storage setup, and I'll have my DR elsewhere, because the DR shouldn't be in this same container. So I got them all together, I leveraged subclustering, and I leveraged the separation of compute and storage. I was able to convince many of my clients that they didn't need their own account table, that they were better off having one. I reduced latency, and I reduced our data quality issues, AKA ticketing. I was able to expand as the work demanded: I was able to leverage elasticity within this cluster. As you can see, there are racks and racks of compute. We set up what we'll call the fixed capacity that each of the business units needed, and then I'm able to ramp up and release the compute that's necessary for each one of my clients based on their workloads throughout the day. And while some of the compute has, more or less, dedicated itself to particular business units, all the rest is free for anybody to use. So in essence, what I have is a concert hall with a lot of seats available. If I want to run a 10-chair symphony or an 80-chair symphony, I'm able to do that. And all the while, I can do the same with my loader nodes: I can expand my loader nodes to have their own symphony, all to themselves, and not compete with any of the workloads of the other clusters. What does that change for our organization? Well, it really changes the way our database administrators do their jobs. This has been a big transformation for them; they have actually become data conductors. Maybe you might even call them composers, which is interesting, because what I've asked them to do is morph into less technology and more workload analysis. And in doing so, we're able to write auto-detect scripts that watch the queues and watch the workloads, so that we can ramp up and trim down the cluster and subclusters as necessary. It has been an exciting transformation for our DBAs, who I now need to classify as something maybe like DCAs. I don't know, I'll have to work with HR on that, but I think it's an exciting future for their careers. And if we bring it all together, our clusters start looking like this, where everything is moving in harmony, we have lots of seats open for extra musicians, and we are able to emulate a cloud experience on-prem. And so, I want you to sit back and enjoy the Pure-Vertica Symphony, live at AT&T. (soft music)
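A deliberately simplified, hypothetical sketch of the kind of auto-detect script John describes: poll Vertica's resource queues, and when requests start backing up, ask a provisioning layer to add seats to a subcluster, then trim them when the hall empties out. The thresholds, subcluster name, and the scale_subcluster() hook are illustrative assumptions, not AT&T's actual tooling.

```python
# Hypothetical watcher: ramp a subcluster up or down based on queue depth.
import time
import vertica_python

SCALE_UP_DEPTH = 10   # queued requests before adding nodes (assumed)
POLL_SECONDS = 60

def queued_requests(cur):
    # RESOURCE_QUEUES lists requests waiting for resource grants.
    cur.execute("SELECT COUNT(*) FROM v_monitor.resource_queues;")
    return cur.fetchone()[0]

def scale_subcluster(name, delta_nodes):
    # Placeholder hook: in practice this would call provisioning tooling
    # (admintools, an infrastructure API, etc.) to add or remove nodes.
    print(f"scale {name} by {delta_nodes:+d} node(s)")

def main():
    conn = vertica_python.connect(
        host="vertica.example.com", port=5433,
        user="dbadmin", password="secret", database="analytics",
    )
    cur = conn.cursor()
    idle_polls = 0
    while True:
        depth = queued_requests(cur)
        if depth > SCALE_UP_DEPTH:
            scale_subcluster("reporting", +2)
            idle_polls = 0
        elif depth == 0:
            idle_polls += 1
            if idle_polls >= 5:  # sustained quiet: release some seats
                scale_subcluster("reporting", -1)
                idle_polls = 0
        time.sleep(POLL_SECONDS)

if __name__ == "__main__":
    main()
```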
>> Joy: Thank you so much, John, for an informative and very creative look at the benefits that AT&T is getting from its Pure-Vertica symphony. I do really like the idea of engaging HR to change the title to Data Conductor; that's fantastic. I've always believed that music brings people together, and now it's clear that analytics at AT&T is part of that musical advantage. So, now it's time for a short break, and we'll be back for our breakout sessions beginning at 12 pm Eastern Daylight Time. We have some really exciting sessions planned later today, and then more again, as you can see, on Wednesday. Now, because all of you are already logged in and listening to this keynote, you already know the steps to continue to participate in the sessions listed here and on the previous slide. In addition, everyone received an email yesterday and today, and you'll get another one tomorrow, outlining the simple steps to register, log in, and choose your sessions. If you have any questions, check out the emails or go to www.vertica.com/bdc2020 for the logistics information. There are a lot of choices, and that's always a good thing. Don't worry if you want to attend more than one session, or if you can't listen to the live sessions due to your time zone: all the sessions, including the Q&A sections, will be available on demand, and everyone will have access to the recordings, as well as even more pre-recorded sessions that we'll post to the BDC website. Now, I do want to leave you with two other important sites. First, our Vertica Academy. Vertica Academy is available to everyone, and there's a variety of very technical, self-paced, on-demand training, virtual instructor-led workshops, and the Vertica Essentials Certification. And it's all free, because we believe that Vertica expertise helps everyone accelerate their Vertica projects and the advantage those projects deliver. And if you have questions or want to engage with our Vertica engineering team now, we're waiting for you on the Vertica forum; we'll answer any questions or discuss any ideas that you might have. Thank you again for joining the Vertica Big Data Conference keynote session. Enjoy the rest of the BDC, because there's a lot more to come.

Published Date : Mar 30 2020


Breaking Analysis: RPA: Over-Hyped or the Next Big Thing?


 

>> From the SiliconANGLE media office in Boston, Massachusetts, it's theCUBE. Now, here's your host, Dave Vellante. >> Hello, everyone, and welcome to this week's episode of Wikibon's CUBE insights, powered by ETR. In this Breaking Analysis, we take a deeper dive into the world of robotic process automation, otherwise known as RPA. It's one of the hottest sectors in software today; in fact, Gartner says it's the fastest-growing software sector that they follow. In this session, I want to break down three questions: one, is the RPA market overvalued; two, how large is the total available market for RPA; and three, who are the winners and losers in this space. Now, before we address the first question, here's what you need to know about RPA. The market today is small, but it's growing fast. Software-only revenue for the space was about 1 billion dollars in 2019, and it's growing at between 80 and a hundred percent annually. RPA has been very popular in larger organizations, especially in back-office functions, really in regulated industries like financial services and healthcare. RPA has been successful at automating the mundane, repeatable, deterministic tasks, and most automations today are unattended. The industry is very well funded, with the top two firms raising nearly 1 billion dollars in the past couple of years; they have a combined market value of nearly 14 billion. Now, some people in our community have said that RPA is hyped and looks like a classic pump-and-dump situation. We're going to look into that and really try to explore the valuation and customer data and come to some conclusions. We also see big software companies like Microsoft and SAP entering the scene, and we want to comment on that a little later in this segment. Now, RPA players have really cleverly succeeded in selling to the business lines, and often have bypassed IT; sometimes that creates tension in organizations. As I said, customers are typically very large organizations who can shell out the hundred-thousand-dollar-plus entry point to get into the RPA game. The TAM is expanding beyond the back office into a broader automation agenda. Hyper-automation is the buzzword of the day, and there are varying definitions. Gartner looks at hyper-automation as the incorporation of RPA along with intelligent business process management, iBPM, and iPaaS, or intelligent platform-as-a-service. Gartner's definition takes a holistic view of the enterprise, incorporating legacy on-prem apps as well as emerging systems. Now, this is good, but I question whether the hyper term applies here, as we see hyper-automation as the extension of RPA to include process mining, to discover new automation opportunities, and the use of machine intelligence, ML and AI, applied to process data, where that combination drives intelligent analytics that further drive digital business process transformation across the enterprise. So the point is that we envision a more agile framework and definition for hyper-automation. We see legacy BPM systems informing the transformation, but not necessarily adjudicating the path forward. We liken this to the early days of big data, where legacy data warehouses and ETL processes provided useful context, but organizations had to develop a new tech stack that broke the stranglehold of technical debt. We're seeing this emerge in the form of new workloads powered by emerging analytic databases like Redshift and Snowflake, with ML tools applied and cloud driving agile insights in that so-called big data space. So we think a similar renaissance is happening here with automation, really driven by the money, by the mandate for digital business transformation, along with machine intelligence and tooling applied to drive automation across the enterprise, in a form of augmentation, with attended bots at scale becoming much more important over time.
OK, now let's shift gears a little bit. The question is: is the RPA market overhyped and overvalued? To answer this, let's go through a bit of a thought exercise we've put together, and look at some data. What this chart shows is some critical data points that will begin to help answer the question we've posed. In the top part of the chart, we show the company, the VC funding, the projected valuations, and the revenue estimates for 2019 and 2020. As you can see, UiPath and Automation Anywhere are the hot companies right now. They're private, so much of this data is estimated, but we know how much money they've raised, and we know the valuations that have been reported. So the RPA software market is around a billion dollars today, and we have it almost doubling in 2020. Now, the bottom part of this chart shows the projected market revenue growth and the implied valuations for the market as a whole. You can see that today we show a market trading at about 15 to 17 times revenue, which seems like a very high multiple, but over time we show that multiple shrinking and settling in, mid-decade, at just over 5x, which for software is pretty conservative, especially for high-growth software. Now, what we've done on this next chart is bring down that market-growth and implied-valuation data and highlight 2025, at seventy-five billion dollars. The market growth will have slowed by then to twenty percent in this model, with a revenue multiple of 5.4x for the overall market. Eventually, as growth slows, RPA software will start to throw off profits, at least it had better. So what we show here is a sensitivity analysis assuming operating margins of 20%, 25%, 30%, and 35% for the market as a whole, using that as a proxy, and we apply a 20x EBIT multiple, which for a software market growing this fast is, we think, pretty reasonable; consider that tech overall typically carries an EBIT multiple of ten to fifteen X. Really, EV over EBIT, enterprise value over EBIT, is a more accurate measure, and this is back-of-the-napkin rather than a full balance-sheet and forecast model, but we're just trying to get at the question: is this market overvalued? As you can see in the far column, given these assumptions, we're in the range of that seventy-five-billion-dollar market valuation, within that delta. Now, in reality, you're going to have some companies growing faster than the market overall, and we'll see a lot of consolidation in this space, but at the macro level, it would seem that the company which can lead, and win the spoils, is really going to benefit. OK, so these figures actually suggest, in my view, that the market could be undervalued. That sounds crazy, right? But look at companies like ServiceNow and Workday, and look at Snowflake's recent valuation at twelve billion dollars. So are the valuations for UiPath and Automation Anywhere justified? Well, in part, it depends on the size of the market, the TAM, the total available market, and on their ability to break out of back-office niches and deliver these types of revenue figures and growth. Maybe my forecasts are a little too aggressive in the early days, but in my experience, the traditional forecasts that we see in the marketplace tend to underestimate transformative technologies; you tend to get these S-curves, where adoption takes off, steepens sharply, and then tapers off. So we'll see.
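The arithmetic behind that sensitivity analysis is simple enough to reconstruct. Here is a small sketch using the figures quoted above, a roughly 75-billion-dollar implied valuation at about 5.4x 2025 revenue and a 20x EBIT multiple, purely as illustration, not a forecast.

```python
# Back-of-the-napkin reconstruction: implied market valuation =
# revenue x operating margin x EBIT multiple. Figures are illustrative.
implied_revenue_2025 = 75.0 / 5.4   # $B, implied by ~$75B at ~5.4x revenue
ebit_multiple = 20

for margin in (0.20, 0.25, 0.30, 0.35):
    valuation = implied_revenue_2025 * margin * ebit_multiple
    print(f"margin {margin:.0%}: implied valuation ${valuation:5.1f}B")

# Prints roughly $55.6B, $69.4B, $83.3B, $97.2B: a band that brackets
# the ~$75B figure, which is the point of the sensitivity exercise.
```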
Let's take a closer look at the TAM. But first, I want to introduce a customer viewpoint. Here's Eric Lex, who's an RPA pro at GE, talking about his company's RPA journey. Play the clip. >> Eric: I would say, in terms of our journey, 2017 was kind of our year to prove the technology. We wanted to see if this stuff could really work long-term and operate at scale. Given that I'm still here, obviously we proved that was correct. Then 2018 was kind of the year of scaling and operationalizing a sustainable model to support our business units across the board from an RPA standpoint: really building out a proper structure, building out the governance that goes along with building robots, and building a resource team to continue to support the bots, because we were at scale at that point, and maintaining those bots is critically important. That's the direction we moved in 2019: we've kind of perfected the concept of the back-office robot, the development of those, and running those at scale, and now we're moving toward a whole new market share when it comes to attended automation and citizen development. >> Dave: So this is a story we've heard from many customers, and we've tried to reflect it in the graphic we're showing here: start small, get some wins, prove out the tech, really in the back office, and then drive customer-facing activities. We see this as the starting point for more SME-driven digital transformations, where business-line pros are rethinking processes and developing new automations, either in low-code scenarios or with centers of excellence. Now, this vision of hyper-automation, we think, comes from the ability to do process mining and identify automation opportunities, then bring RPA to the table, using machine learning and AI to understand text, voice, and visual context, and ultimately use that process data to transform the business. This is an outcome-driven model, where organizations optimize on business KPIs and incentives are aligned accordingly. So we see this vision as potentially unlocking a very large TAM, perhaps exceeding 30 billion dollars. Now let's bring in some spending data and take a look at what the ETR data set tells us about the RPA market. The first thing that jumps out at you is that RPA is one of the fastest-growing segments in the data set; you can see that green box and that blue dot at around 20%. That's the change in spending velocity in the 2020 survey versus last year. Now, the one caveat is that I'm isolating on Global 2000 companies in this data set, as you can see in that red bar up on the left; remember, RPA today is really hot in large companies, but not nearly as fast-growing when you analyze the overall respondent base, which includes smaller organizations. Nonetheless, this chart shows net scores and market shares for RPA across all respondents. Remember, net score is a measure of spending velocity, essentially calculated by subtracting the percent of customers spending less from the percent spending more, and market share is a measure of pervasiveness in the survey.
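Since net score does a lot of the work in what follows, here is that calculation spelled out as a tiny sketch; the respondent counts are made up for illustration.

```python
# Net score: percent of customers spending more minus percent spending less.
def net_score(more: int, flat: int, less: int) -> float:
    total = more + flat + less
    return 100.0 * (more - less) / total

# E.g., 100 respondents: 70 spending more, 25 flat, 5 spending less.
print(net_score(70, 25, 5))  # -> 65.0, i.e., a net score of 65%
```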
And what you see here is that RPA net scores are holding steady at a nice rate, and market shares are creeping up relative to other segments in the data set. Now, remember, this is across all companies, but we want to use the ETR data to understand who is winning in this space. The next chart shows net score, or spending velocity, on the vertical axis and market share, or pervasiveness, on the horizontal axis for each individual player. As we run through this sequence, from the January '18 survey through today, across the nine surveys, look at UiPath and Automation Anywhere, but look at UiPath in particular: they really appear to be breaking away from the pack. Now, here's another look at the data. It shows net scores, or spending velocity, for UiPath, Automation Anywhere, Blue Prism, Pega Systems, and WorkFusion. These are all very strong net scores. The two leaders here are UiPath and Automation Anywhere, but the rest are actually quite good; they're in the green. But look what happens when you isolate on the 349 Global 2000 respondents in the survey: UiPath jumps into 80-percent net-score territory, again, spending velocity; Automation Anywhere dips a little bit; Pega Systems, interestingly, jumps up nicely; but look at Blue Prism, they fall back in the larger Global 2000 accounts, which is a bit of a concern. Now, the other key point on this chart is that 85% of UiPath customers and 70% of Automation Anywhere customers plan to spend more this year than they spent last year. That is pretty impressive. And as you can see here in this chart, the Global 2000 have been pretty consistent spenders on RPA for the past three survey snapshots, with UiPath again showing net scores, or spending intensity, solidly in the 80-percent-plus range, and even though it's a smaller N, you can see Pega with a nice uptick in the last two surveys within these larger accounts. Now, finally, let's look at what ETR calls market share, which is a measure of pervasiveness in the survey. This chart shows data from all 1,000-plus respondents, and as you can see, UiPath appears to be breaking out from the pack; Automation Anywhere and Pega are showing an uptick in the January survey, and Blue Prism is trending down a little bit, which is something to watch. But you can see, in the upper right, all four companies are in the green with regard to net score, or spending velocity.
So let's summarize it and wrap up. Is this market overhyped? Well, it probably is overhyped, but is it overvalued? I don't think so. The customer feedback that we have in the community and the proof points are really starting to stack up. So with continued revenue growth and, eventually, profits, you can make the case that whoever comes out on top will really do well and see huge returns in this market. How large is this market? I think it can be very large; a TAM of 30 billion plus is not out of the question, in my view. Now, that realization will be a function of RPA's ability to break into more use cases with deeper business integration. RPA has an opportunity, in our view, to cross the chasm and deliver lower-code solutions to subject matter experts in business lines who are in a stronger position to drive change. A lot of people pooh-pooh this notion, but I think it's a real possibility. This idea of hyper-automation is buzzwordy, but it has meaning: companies that bring RPA together with process mining and machine intelligence that drives process analytics have great potential, if organizational stovepipes can be broken down; in other words, put process data and analytics at the core to drive decision-making and change. Now, who wins? Let me say this: the company that breaks out and hits escape velocity is going to make a lot of money here. Unlike what I said in last week's Breaking Analysis on cloud computing, this is more of a winner-take-all market. It's not a trillion-dollar TAM like cloud; it's tens of billions, maybe north of 30 billion, but it's somewhat of a zero-sum game, in my opinion. The number one player is going to make a lot of dough, number two will do OK, and in my view, everyone else is going to struggle for profits. Now, the big wildcard is the degree to which the big software players like Microsoft and SAP poison the RPA well. Here's what I think: these big software players are taking an incremental view of the market and are bundling in RPA as a check-off item. They will not be the ones to drive radical process transformation; rather, they will siphon off some demand. But organizations that really want to benefit from so-called hyper-automation will be leaning heavily on software from specialists who have the vision, the resources, the culture, and the focus to drive digital process transformation. All right, that's a wrap. As always, I really appreciate the comments that I get on my LinkedIn posts and on Twitter, I'm @dvellante, so thanks for that, and thanks for watching, everyone. This is Dave Vellante for theCUBE insights, powered by ETR, and we'll see you next time.

Published Date : Feb 15 2020


Infrastructure For Big Data Workloads


 

>> From the SiliconANGLE media office in Boston, Massachusetts, it's theCUBE! Now, here's your host, Dave Vellante. >> Hi, everybody, welcome to this special CUBE Conversation. You know, big data workloads have evolved, and the infrastructure that runs big data workloads is also evolving. Big data, AI, and other emerging workloads need infrastructure that can keep up. Welcome to this special CUBE Conversation with Patrick Osborne, who's the vice president and GM of big data and secondary storage at Hewlett Packard Enterprise, @patrick_osborne. Great to see you again, thanks for coming on. >> Great, love to be back here. >> As I said up front, big data's changing. It's evolving, and the infrastructure has to also evolve. What are you seeing, Patrick, and what's HPE seeing in terms of the market forces right now driving big data and analytics? >> Well, one of the things that we see in the data center is a continuous move from bare metal to virtualized, everyone's on that train, to containerization of existing apps, your apps of record, your business- and mission-critical apps. But really, what a lot of folks are doing right now is adding additional services to those applications and those data sets: new ways to interact, new apps. A lot of those are being developed with techniques that revolve around big data and analytics. We're definitely seeing the pressure to modernize what you have on-prem today, but you can't sit there and be static; you've got to provide new services around what you're doing for your customers. A lot of those are coming in the form of this Mode 2 type of application development. >> One of the things that we're seeing, everybody talks about digital transformation. It's the hot buzzword of the day. To us, digital means data first. Presumably, you're seeing that. Are organizations organizing around their data, and what does that mean for infrastructure? >> Yeah, absolutely. We see a lot of folks employing not only technology to do that; they're employing organizational techniques, peak teams, you know, bringing together a lot of different functions. Also, organizing around the data has become very different now that you've got data out on the edge, right? It's coming into the core. A lot of folks are moving some of their edge to the cloud, or even their core to the cloud. You've got to make a lot of decisions and be able to organize around a pretty complex set of places, physical and virtual, where your data's going to lie. >> There's a lot of talk, too, about the data pipeline. The data pipeline used to be, you had an enterprise data warehouse, and the pipeline was, you'd go through a few people who would build some cubes, and then they'd hand off a bunch of reports. The data pipeline is getting much more complex. You've got the edge coming in, you've got the core, you've got the cloud, which can be on-prem or public. Talk about the evolution of the data pipeline and what that means for infrastructure and big data workloads. >> For a lot of our customers, we've got a pretty interesting business here at HPE. We do a lot with the Intelligent Edge, with our Edgeline servers and Aruba, where a lot of the data is sitting outside of the traditional data center.
Then we have what's going on in the core, where a lot of customers are moving from either a traditional EDW, or even Hadoop 1.0 if they started that transformation five to seven years ago, to a world where a lot of things are happening in real time, or a combination thereof. The data types are pretty dynamic. Some of the data is always getting processed out on the edge, and results are getting sent back to the core. We're also seeing a lot of folks move to real-time data analytics, or what some people call fast data, that sits in your core data center, utilizing things like Kafka and Spark. A lot of the techniques for persistent storage are brand new. What it boils down to is, it's an opportunity, but it's also very complex for our customers. >> What about some of the technical trends behind what's going on with big data? I mean, you've got sprawl, both data sprawl and workload sprawl. You've got developers that are dealing with a lot of complex tooling. What are you guys seeing there, in terms of the big mega-trends? >> As you know, HPE has quite a few customers in the mid-range and enterprise segments, and some of those customers are very tech-forward. A lot of them are moving from a Hadoop 1.0 or Hadoop 2.0 system to a set of essentially mixed workloads that are very multi-tenant. We see customers that have a mix of batch-oriented workloads, now introducing streaming types of workloads, and folks who are bringing in things like TensorFlow and GPGPUs and trying to apply AI and ML techniques in those clusters. What we're seeing right now is that this causes a lot of complexity, not only in the way you do your apps, but in the number of applications and the number of tenants who use that data. It's getting used all day long for various purposes, so what we're seeing is that it's grown up. It started as an opportunity, a science project, the POC. Now it's business-critical; it's very mission-critical for a lot of the services it drives. >> Am I correct that those diverse workloads used to require a bespoke set of infrastructure that was very siloed? I'm inferring that technology today will allow you to bring those workloads together on a single platform. Is that correct? >> A couple of things that we offer have been helping customers get off the complexity train while providing flexibility and elasticity. A lot of the workloads that we did in the past were either very vertically focused and integrated, one app, server, networking, storage, or, at the beginning of the analytics phase, really built around symmetrical clusters and scaling them out. Now we've got a very rich and diverse set of components and infrastructure that can essentially allow a customer to make a data lake that's very scalable: compute-oriented nodes, storage-oriented nodes, GPU-oriented nodes. So it's very flexible, and it helps the customers take complexity out of their environment.
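To make the "fast data" pattern Patrick mentions concrete, Kafka feeding Spark in the core data center, here is a minimal PySpark Structured Streaming sketch. The broker address, topic, and event schema are invented for illustration.

```python
# Minimal sketch: stream edge telemetry from Kafka and aggregate in real time.
from pyspark.sql import SparkSession
from pyspark.sql.functions import avg, col, from_json, window
from pyspark.sql.types import DoubleType, StringType, StructType, TimestampType

spark = SparkSession.builder.appName("fast-data-sketch").getOrCreate()

schema = (StructType()
          .add("sensor_id", StringType())
          .add("reading", DoubleType())
          .add("event_time", TimestampType()))

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1.example.com:9092")
          .option("subscribe", "edge-telemetry")  # assumed topic
          .load()
          .select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

# Rolling one-minute average per sensor: the kind of real-time view that
# used to require a separate bespoke silo next to the batch cluster.
averages = (events
            .withWatermark("event_time", "2 minutes")
            .groupBy(window(col("event_time"), "1 minute"), col("sensor_id"))
            .agg(avg(col("reading")).alias("avg_reading")))

query = (averages.writeStream
         .outputMode("update")
         .format("console")
         .start())
query.awaitTermination()
```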
>> In thinking about, when you talk to customers, what are they struggling with, specifically as it relates to infrastructure? Again, we talked about tooling; Hadoop is well-known for the complexity of its tooling. But specifically from an infrastructure standpoint, what are the big complaints that you hear? >> A couple of things. One is that my budget's flat for the next year or couple of years, right? We talked earlier in the conversation about how I have to modernize, virtualize, and containerize my existing apps, and that means I have to introduce new services as well, with a very different, DevOps-style mode of operations, all with the existing staff. That's the number one issue we hear from customers, so anything we can do to help increase the velocity of deployment through automation helps. We also hear now, frankly, that the battle is over whether I'm going to run these types of workloads on-prem versus off-prem. We have a set of technology as well as enabling services with Pointnext, you remember the acquisition we made of Cloud Technology Partners, to right-place where those workloads are going to go, become like a broker in that conversation, assist customers in making that transition, and then, ultimately, give them an elastic platform that's going to scale for a diverse set of workloads and is well-known, sized, and easy to deploy. >> As you get all this data, you know, Hadoop sort of blew up the data model; it said, "Okay, we'll leave the data where it is, we'll bring the compute there." You had a lot of skunkworks projects growing. What about governance, security, compliance? As you have data sprawl, how are customers handling that challenge? Is it a challenge? >> Yeah, it certainly is a challenge. I mean, we've just gone through it with GDPR being implemented; you've got to think about how that's going to fit into your workflow, and certainly security. The big thing that we see is that if the data's residing outside of your traditional data center, that's a big issue. For us, with Edgeline servers, certainly a lot of things are coming in over wireless, and there's a big buildout coming with the advent of 5G. That certainly is an area where customers are very concerned in terms of who has their data, who has access to it, how you can tag it, and how you can make sure it's secure. That's a big part of what we're trying to provide here at HPE. >> What specifically is HPE doing to address these problems? Products, services, partnerships, maybe you could talk about that a little bit. Maybe even start with your philosophy on infrastructure for big data and AI workloads. >> For us, over the last two years we've really concentrated on essentially two areas. We have the Intelligent Edge, which has been enabled by fantastic growth with our Aruba products in the networking space and our Edgeline systems: being able to take that type of compute and get it as far out to the edge as possible. The other piece is around making hybrid IT simple. In that area, we want to provide a very flexible yet easy-to-deploy set of infrastructure for big data and AI workloads. We have this concept of the Elastic Platform for Analytics; it helps customers deploy for a whole myriad of requirements: very compute-oriented, storage-oriented, GPUs, cold and warm data lakes, for that matter. And the third area we've really focused on is the ecosystem that we bring to our customers, which, as a portfolio company, is evolving rapidly. As you know, in this big data and analytics workload space, the software side is super dynamic. If we can bring a vetted, well-known ecosystem to our customers as part of a solution with advisory services, that's definitely one of the key pieces our customers love to come to HPE for.
>> What about partnerships around things like containers and simplifying the developer experience? >> I mean, we've been pretty public about some of our efforts in this area around OneSphere, and around the models for, certainly, advisory services in this area with some recent acquisitions. For us, it's all about automation, and we wanna be able to provide that experience to customers whether they want to develop those apps and deploy on-prem, you know, we love that, I think you guys tag it as true private cloud, but we know the reality is that most people are embracing a hybrid cloud model very quickly. So the ability to take those apps, develop them, put them on-prem, run them off-prem, is pretty key for OneSphere. >> I remember Antonio Neri, when you guys announced Apollo and you had the astronaut there. Antonio was just a lowly GM and VP at the time, and now he's, of course, CEO. Who knows what's in the future? But Apollo, generally, at the time it was like, okay, this is a high-performance computing system. We've talked about those worlds, HPC and big data, coming together. Where does a system like Apollo fit in this world of big data workloads? >> Yeah, so we have a very wide product line for Apollo; some of the systems are very tailored to specific workloads. If you take a look at the way people are deploying these infrastructures now, it's multi-tenant with many different workloads. We have some compute-focused systems, like the Apollo 2000. We have very balanced systems, like the Apollo 4200, that allow a very good mix of CPU and memory, and now customers are certainly moving to flash and storage-class memory for these types of workloads. And then the Apollo 6500 is one of the newer systems that we have: a big memory footprint and NVIDIA GPUs, allowing you to do very high calculation rates for AI and ML workloads. We take that and we aggregate it together. We've made some recent acquisitions, like Plexxi, for example. A big part of this is around simplification of the networking experience. You can probably see into the future: automation at the networking level, automation at the compute and storage level, and then a very large and scalable data lake for customers' data repositories. Object, file, HDFS, some pretty interesting trends in that space. >> Yeah, I'm actually really super excited about the Plexxi acquisition. I think it's because flash, it used to be that the bottleneck was the spinning disk; flash pushes the bottleneck largely to the network. Plexxi's gonna allow you guys to scale, and I think actually leapfrog some of the other hyperconverged players that are out there. So, super excited to see what you guys do with that acquisition. It sounds like your focus is on optimizing the design for I/O. I'm sure flash fits in there as well. >> And that's a huge accelerator, even when you take a look at our storage business, right? So, 3PAR, Nimble, all-flash, certainly moving to NVMe and storage-class memory for acceleration of other types of big data databases. Even though we're talking about Hadoop today, certainly SAP HANA, scale-out databases, Oracle, SQL, all these things play a part in the customer's infrastructure. >> Okay, so you were talking before a little bit about GPUs. What is this HPE Elastic Platform for big data analytics? What's that all about? >> I mean, a lot of the sizing and scalability falls on the shoulders of our customers in this space, especially in some of these new areas.
What we've done is, it's a product and a concept called the Elastic Platform for Analytics. With all those different components that I rattled off, all great systems in their own right, when it comes to very complex multi-tenant workloads, what we do is try to take the mystery out of it for our customers, so they can deploy it as a cookie-cutter module. We're even gonna get to a place pretty soon where we're able to offer that as a consumption-based service, so you don't have to choose between on-prem and off-prem to get an elastic type of acquisition experience. We're gonna provide that as well. It's not only a set of products; it's reference architectures. We do a lot of sizing with our partners, the Hortonworks, Clouderas, and MapRs of the world, and a lot of the things that are out in the open-source world. It's pretty good. >> We've been covering big data, as you know, for a long, long time. The early days of big data were like, "Oh, this is great, we're just gonna put white boxes out there and off-the-shelf storage!" Well, that changed as big data workloads became more enterprise, more mainstream; they needed to be enterprise-ready. But my question to you is, okay, I hear you. You got products, you got services, you got perspectives, a philosophy. Obviously, you wanna sell some stuff. What has HPE done internally with regard to big data? How have you transformed your own business? >> For us, we wanna provide a really rich experience, not just products. To do that, you need to provide a set of services and automation, and with products and solutions like InfoSight, we've been able to deliver what we call AI for the data center, or certainly the tagline of predictive analytics that Nimble's brought to the table for a long time. To provide that level of service, InfoSight, predictive analytics, AI for the data center, we're running our own big data infrastructure. It started a number of years ago, even on our 3PAR platforms and other products, where we had scale-up databases. We moved and transitioned to batch-oriented Hadoop. Now we're fully embedded with real-time streaming analytics that come in every day, all day long, from our customers' telemetry. We're using AI and ML techniques not only to improve what we've done, certainly automating the support experience and making it easy to manage the platforms, but now we're introducing things like learning engines and recommendation engines, taking the hands-on work of managing the products, automating it, and putting it into the products themselves. So, for us, it's been a multi-phase, multi-year transition that's brought in things like Kafka and Spark and Elasticsearch. We're using all these techniques in our systems to provide new services for our customers as well. >> Okay, great. You're practitioners, you got some street cred. >> Absolutely. >> Can I come back on InfoSight for a minute? It came through the acquisition of Nimble. It seems to us that you're a little bit ahead, and maybe you'd say a lot ahead, of the competition with regard to that capability. How do you see it? Where do you see InfoSight being applied across the portfolio, and how much of a lead do you think you have on competitors? >> I'm paranoid, so I don't think we ever have a good enough lead, right? You always gotta keep grinding on that front. But we think we have a really good product. You know, it speaks for itself.
A lot of customers love it. We've applied it to 3PAR, for example, so we came out with VMVision for 3PAR, which is based on InfoSight. We've got some things in the works for other product lines that are imminent. You can think about what we've done for Nimble and 3PAR; we can apply a similar type of logic to the Elastic Platform for Analytics, running at that type of cluster scale to automate a number of items that are pretty tedious for customers to manage. There's a lot of work going on within HPE to scale that as a service we provide with most of our products. >> Okay, so where can I get more information on your big data offerings and what you guys are doing in that space? >> Yeah, you can always go to hp.com/bigdata. We've got some really great information out there. We're in the run-up to our big end-user event that we do every June in Las Vegas: HPE Discover. We have about 15,000 of our customers and trusted partners there, and we'll be doing a number of talks. I'm doing some work there with a British telecom. We'll give some great talks, and those'll be available online virtually, so you'll hear not only what we're doing with our own InfoSight and big data services, but how other customers like BTE and 21st Century Fox are applying some of these techniques and making a big difference for their businesses as well. >> That's June 19th to the 21st, at the Sands Convention Center in between the Palazzo and the Venetian, so it's a good conference. Definitely check that out live if you can, or if not, you can watch online. Excellent, Patrick, thanks so much for coming on and sharing this big data evolution with us. We'll be watching. >> Yeah, absolutely. >> And thank you for watching, everybody. We'll see you next time. This is Dave Vellante for theCUBE. (fast techno music)
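
The predictive-analytics loop described in this interview, streaming telemetry in, anomaly flags and recommendations out, rests on a simple pattern. Here is a toy Python sketch of the idea, assuming per-device numeric readings; InfoSight's real models are, of course, far more sophisticated than this:

    # Toy telemetry anomaly detector: keep a running mean and variance
    # per device (Welford's algorithm) and flag readings more than three
    # standard deviations from that device's norm. Illustrative only --
    # not InfoSight's actual method.
    import math
    from collections import defaultdict

    class RunningStats:
        def __init__(self):
            self.n, self.mean, self.m2 = 0, 0.0, 0.0

        def update(self, x):
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (x - self.mean)

        def stddev(self):
            return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    stats = defaultdict(RunningStats)

    def ingest(device, value):
        """Return True if this reading looks anomalous for this device."""
        s = stats[device]
        flagged = (s.n > 30 and s.stddev() > 0
                   and abs(value - s.mean) > 3 * s.stddev())
        s.update(value)
        return flagged

    # After a warm-up of normal latencies, a spike stands out.
    for i in range(100):
        ingest("array-01", 5.0 + 0.1 * (i % 3))
    print(ingest("array-01", 50.0))  # True
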

Published Date : Jun 12 2018


Big Data Silicon Valley 2018 Recap


 

>> Dave: Good morning everybody and welcome to Big Data SV. >> Come down, hang out with us today as we have continued conversations. >> Will this trend, this big data trend, solve the problems that decision support and business intelligence couldn't solve? We're going to talk about that today. Gentlemen, welcome to theCUBE. (energetic rock music) >> Dave: We're setting up for the digital business era. >> What do people really want to do? It's big data analytics: I want to ingest a lot of information, I want to enrich it, I want to analyze it, I want to take actions, and then I want to go park it. >> Leveraging everything that is open source to build models and put models in production. >> We talk about it a little bit like it's Google Docs for your data. >> So I no longer have to send daily data dumps to partners. They can simply query the data themselves. >> We've taken the two approaches of enterprise analytics and self-service and tried to create a scenario where you kind of get the best of both worlds. >> The epicenter of this whole data management has to move to the cloud. >> It saves you a lot of time and effort. You can focus on more strategic projects. >> Do you agree it's kind of bifurcated? There are the Spotifys, the Ubers, and the Airbnbs that are crushing it, and then there are a lot of traditional enterprises that are still stovepiped and struggling. >> Marketing people, operational people, finance people, they need data to do their jobs. Their jobs are becoming more data-driven, but they're not necessarily data people. >> They're depending on the vendor landscape to provide them with an entry-level set of tools. >> Don't make me work harder and add new staff. Solve the problem. >> Yeah, it's all about solving problems. >> A lot more on machine learning now and artificial intelligence, and frankly a lot of discussion around ethics. >> Data governance, it is in fact a business imperative. >> Marketers want all the customer data they can get, right? But there are social security numbers, PII-- Who should be able to see and use what? Because if this data is used inappropriately, it can cause a lot of problems. >> Creating that visibility is very important. >> The biggest casualty is going to be their customer relationship if they don't do this, because most companies don't know their customers fully. >> The key is that digital transformation is really built on the concept of real time. >> If anybody deals with data that's in motion, you lose, because I'm analyzing it as it's happening, and you would be analyzing it afterward, at rest. >> Speed is so important these days, and the new companies that are grasping data aggressively, putting it somewhere where they can make decisions on it on a day-to-day basis, they're winning. >> Come on down, be part of our audience. We also have a great party tonight where you can network with some of our experts and analysts. (energetic rock music) >> Our expectation is that as the tooling gets better, we will see more people be able to present themselves as truly capable of doing this, and that will accelerate the process. >> To me, one of the first things a CDO has to do is understand how a company gets value out of its data. >> You can either run away from that data and say, look, I'm going to bury my head in the sand, I'm going to be a business, I'm just going to forget about that data stuff, and that's certainly a way to go. Right? It's a way to go away. >> It's easy for companies to get overwhelmed; you have to pick somewhere, right?
You don't have to go sit in the basement for a year building something that is "the thing," the unicorn in the business; it's about small, quick wins. >> We're not afraid of makin' mistakes. If we provision infrastructure and we don't get it right the first time, we just change it. >> That's something that we would just never be able to do previously in a data center. >> When companies get started with the right first project, they can build on that success and invest more, whereas if you're not experimenting and trying things and moving, you're never going to get there. >> Dave: Thanks for watching, everybody. This is theCUBE. We're live from Big Data SV. >> And we're clear. Thank you. (audience applauds)

Published Date : Mar 12 2018


Lewis Kaneshiro & Karthik Ramasamy, Streamlio | Big Data SV 2018


 

(upbeat techno music) >> Narrator: Live, from San Jose, it's theCUBE! Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to Big Data SV, everybody. My name is Dave Vellante and this is theCUBE, the leader in live tech coverage. You know, this is our 10th big data event. When we first started covering big data, back in 2010, it was Hadoop, and everything was a batch job. About four or five years ago, everybody started talking about real time and the ability to affect outcomes before you lose the customer. Lewis Kaneshiro is here. He's the CEO of Streamlio, and he's joined by Karthik Ramasamy, the chief product officer. They're both co-founders. Gentlemen, welcome to theCUBE. My first question is, why did you start this company? >> Sure, we came together around a vision that enterprises need to access the value around fast data. As you mentioned, enterprises are moving out of the slow data era and looking for fast data value, to deliver that back to their users or their use cases. And so, coming together around that idea of real-time action, what we realized is that enterprises can't all access this data with projects that are not meant to work together and that are very difficult, perhaps, to stitch together. So what we did was create an intelligent platform for fast data that's really accessible to enterprises of all sizes. We unify the core components needed to access fast data, which are messaging, compute, and stream storage, using best-of-breed open-source technology that came out of Twitter and Yahoo! >> It's a good thing; I was going to ask why the world needs another, you know, streaming platform, but Lewis kind of touched on it: because it's too hard. It's too complicated, so you guys are trying to simplify all that. >> Yep, the main reason we wanted to simplify it is that, based on all our experiences at Twitter and Yahoo!, companies in that kind of position can afford the talent and the expertise to build these real-time platforms, but normal enterprises don't have access to that expertise or the costs they would have to incur. So we wanted to take these open-source projects that Twitter and Yahoo! provide, combine them, and make sure you have a simple, easy, drag-and-drop kind of interface, so that it's easily consumable by any enterprise. Essentially, what we are trying to do is reduce the (mumbles) for enterprises, for real time, for all enterprises. >> Dave: Yeah, enterprises will pay up... >> Yes. >> For a solution. The companies that you used to work for would all gladly throw engineering at the problem. >> Yeah. >> Sure. >> To save time, but most organizations don't have the resources. Okay, so how would it work prior to Streamlio? Maybe take us through how a company would attack this problem, the complexities of what they have to deal with, and what life is like with you guys. >> So, the current state of the world is that it's a fragmented solution. The state of the world today is that you take multiple pieces of different projects and assemble them together so that you can do (mumbles), right?
The reason people end up doing that is that each of these big data projects was designed for a completely different purpose: messaging is one, compute is another, and the third one is storage. So, essentially, what we have done as a company is simplify this by integrating well-known, best-of-breed projects: for messaging we use Apache Pulsar, for compute we use Apache Heron, from Twitter, and similarly for storage, for real-time storage, we use Apache BookKeeper. And we unify them so that, under the hood, it may be three systems, but as a user, when you are using it, it functions as a single system. You install the system, ingest your data, express your computation, and get the results out, in one single system. >> So you've unified or converged these functions. If I understand it correctly, and we were talking off camera a little bit, the team, Lewis, that you've assembled actually developed a lot of these, or are hugely committed to these open-source projects, right? >> Absolutely, co-creators of each of the projects, and what that allows us to do is integrate each project at a deep level. For example, Pulsar is actually a pub/sub system that is built on BookKeeper, and BookKeeper, in our minds, is the purest best-of-breed stream-storage solution: fast and durable storage. That storage is also used in Apache Heron to store state. So, as you can see, enterprises, rather than stitching together multiple different solutions for queuing, streaming, compute, and storage, now have one option that they can install in a very small cluster, and operationally it's very simple to scale up. We simply add nodes if you get data spikes. And what this allows is for enterprises to access new and exciting use cases that really weren't possible before. For example, machine-learning model deployment in real time. I'm a data scientist, and what I found is that in data science you spend a lot of time training models in batch mode. It's a legacy type of approach, but once the model is trained, you want to put that model into production in real time, so that you can deliver that value back to a user in real time. Let's call it under a two-second SLA. So that has been a great use case for Streamlio, because we are a ready-made intelligent platform for fast data, for ML and AI deployment. >> And the use cases are typically stateful, and you're persisting data, is that right? >> Yes; it can be used for stateless use cases also, but the key advantage that we bring to the table is stateful storage. And since we ship along with the storage, (mumbles) stateful storage becomes much easier, because it can be used to store the intermediate state of the computation, or it can be used for staging (mumbles) data: when it spills over from memory, it's automatically stored to disk, and you can even keep the data for as long as you want, so that you can unlock the value later, after the fast data has been processed. You can access the lazy data later in time. >> So give us the run-down on the company: funding, you know, VCs, head count. Give us the basics. >> Sure, we raised a Series A from Lightspeed Venture Partners, led by John Vrionis and Sudip Chakrabarti. We've raised seven and a half million, and we emerged from stealth back in August.
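
To make the "ingest your data, express your computation, get the results out" flow concrete, here is a minimal sketch using the Apache Pulsar Python client. The service URL, topic, and payload are assumptions for illustration:

    # Minimal Apache Pulsar pub/sub sketch with the Python client
    # (pip install pulsar-client). URL, topic, and payload are
    # placeholder assumptions.
    import pulsar

    client = pulsar.Client("pulsar://localhost:6650")

    # Subscribe first so the subscription exists before publishing;
    # messages are durably stored in Apache BookKeeper under the hood.
    consumer = client.subscribe("persistent://public/default/events",
                                subscription_name="analytics")

    producer = client.create_producer("persistent://public/default/events")
    producer.send(b'{"device": "truck-42", "speed_kmh": 87}')

    msg = consumer.receive()
    print(msg.data())          # b'{"device": "truck-42", "speed_kmh": 87}'
    consumer.acknowledge(msg)  # acked data can eventually be trimmed

    client.close()
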
That funding allowed us to ramp up our team to 17 now, mainly engineers, in order to really have a very solid product. We launched post-revenue, and some of our customers are really looking at geo-replication across multiple data centers, so active-active geo-replication, which is an open-source feature in Apache Pulsar, has been a huge draw compared to some other solutions that are out there. As you can see, this theme of simplifying architecture is where Streamlio sits, so unifying queuing and streaming allows us to replace a number of different legacy systems. That's been one avenue to help growth. The other, obviously, is on the compute piece. As enterprises find new and exciting use cases to deliver back to their users, the compute piece needs to scale up and down. We also announced Pulsar Functions, which is stream-native compute that allows very simple function computation in native Python and Java: you spin up an Apache Pulsar cluster or the Streamlio platform, and you simply have compute functionality. That allows us to access edge use cases, so IoT is a huge, kind of exciting set of POCs for us right now, where we have connected-car examples that don't need a heavyweight scheduler or deployment at the edge. It's Pulsar and Pulsar Functions. What that allows us to do are things like fraud detection, anomaly detection at the edge, model deployment at the edge, interpolation, observability, and alerts. >> And so how do you charge for this? Is it usage-based? >> Sure. What we found is that enterprises are more comfortable on a per-node basis, simply because we have the ambition to really scale up and help enterprises use Streamlio as their fast data platform across the entire enterprise. We found that a per-data charge would actually limit that growth, so it's per-node, on a shared architecture. We also invested early in optimizing around Kubernetes, so as enterprises adopt Kubernetes, we are the simplest installation on Kubernetes: on-prem, multicloud, at the edge. >> I love it. I mean, for years we've been talking about the complexity headwinds in this big data space. We certainly saw that with Hadoop. You know, Spark was designed to solve some of those problems. Sounds like you're doing some really good work to take that further. Lewis and Karthik, thank you so much for coming on theCUBE. I really appreciate it. >> Thanks for having us, Dave. >> All right, thank you for watching. We're here at Big Data SV, live from San Jose. We'll be right back. (techno music)
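
For readers curious what the Pulsar Functions model described above looks like, here is a small Python sketch (the transcript notes Python is natively supported). The class name, event shape, and threshold are illustrative assumptions, not Streamlio code:

    # Sketch of a stream-native Pulsar Function for lightweight edge
    # logic, in the connected-car spirit described above. Input and
    # output topics are wired up at deploy time via pulsar-admin.
    import json
    from pulsar import Function

    class SpeedAlert(Function):
        def process(self, input, context):
            event = json.loads(input)  # e.g. {"vehicle": "t-42", "speed_kmh": 143}
            if event.get("speed_kmh", 0) > 120:
                # A returned value is published to the output topic.
                return json.dumps({"vehicle": event["vehicle"],
                                   "alert": "overspeed"})
            return None  # normal readings produce no output
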

Published Date : Mar 9 2018


John Furrier & Dave Vellante unpack the Russian Hack | Big Data SV 2018


 

>> Announcer: Live from San Jose. It's theCUBE. Presenting big data, Silicon Valley. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Hello everyone, I'm John Furrier, co-host of theCUBE. I'm here with Dave Vellante, my co-host, for an exclusive conversation around the role of data, data for good and bad. We always cover the role of data. We used to talk about AI and data for good, but in this exclusive interview we have some exclusive material about data for bad. Dave, we were talking about weaponizing data a year ago on SiliconANGLE and theCUBE, around how data is being weaponized, certainly in the elections. We know the Russians were involved. We know that with data you can buy journalists, you can create fake news, and behind every piece of click-bait and fake news is bad content. But on the other side of this there's good bait: good news. So the world's changin'. There needs to be a better way, there needs to be some action taken, because there's now evidence of the role the Russians had in using fake news and weaponizing it to sway the election and other things. So this is somethin' we've been talkin' about. >> Yeah, I mean, the signature of the hacks is pretty clear. When you talk to the experts, there is a distinct signature for when it's China or when it's Russia. Russia is very clever about the way they target somebody who's maybe a pawn, but they try to make him or her feel like a king, grab their credentials, and then work their way in. They've been doing this for decades, right? >> And the thing is, now it's not just state-sponsored; there are new groups out there enabled by open-source tools. We've reported on theCUBE that terrorist organizations and bad actors are taking open-source tools and techniques from nation-states and posing threats to democracy in the U.S. and other countries. This is a huge problem. >> And in a way it's harder than the nuclear problem. We had weapons pointed at each other, right? This is... The United States has a lot to lose. If we go on the offense, others can attack us and attack our systems, which are pretty mature. So, recently we talked to Garry Kasparov. I had an exclusive interview with him. He's very outspoken. Kasparov is the greatest chess player in history, by most accounts, and he is a political activist and an author. And he had a number of things to say about this. Let's listen to him, it's about a two-minute clip, and then we'll come back and talk about it. Watch this. >> Garry: Knowing Vladimir Putin, the KGB mentality, and the way he has been approaching global problems, I had no doubt that the question was not if Putin would attack somewhere, but when and where. And the attack on U.S. democracy was a surprise here, but it was not a surprise for us, because we could see how they built these capabilities for more than a decade. They had been creating a fake-news industry in Russia to deal with the Russian opposition in 2004, 2005. Then they used it against neighboring countries like Estonia in 2007. Then they moved to eastern Europe and then through western Europe. So when they ended up attacking the United States, they had almost a decade of experience. And it's quite unfortunate that, while there was some information about these attacks, the previous administration decided just to take it easy. And the result is that we have this case of interference; I hope there will be more indictments.
I hope we'll get to the bottom of that, because we know they are still pretty active in Europe. And they will never cease there-- >> Dave: Germany, France-- >> Garry: Exactly. But it's... I call Putin the merchant of doubt, because, unlike the Soviet propaganda machine, he's not selling one ideology. All he wants is to spread chaos. So that's why it's not about, oh, this is the only right teaching. No, no, no. It's wrong, it's wrong, everything... Yeah, maybe there are 10 different ways of telling the truth. Truth is relative. And that's a very powerful message, because it's spreading these doubts. And he's very good at creating these confusions and actually bringing people to fight each other. And I have to say he's succeeded-- >> Dave: Our president has taken a page out of that, unfortunately. But I also think the big issue we face as a country, in the United States, is 2020. The election in 2020 is going to be about who leverages social media and the weaponization of social media. And the Russian attackers, you talk to the black hats: very sophisticated, very intriguing how they come in, they find the credentials-- >> Garry: But look, we know, Jesus, every expert in this industry knows that if you are trying to defend yourself, if you are on the defense all the time, you will lose. It's a losing proposition. So the only way to deter the aggression is to make sure that there will be counterattacks, that there will be devastating blows to those who are attacking the United States. And you need the political will, because the technology is here; America is still the leading power in the world. But the political will, unfortunately-- >> Dave: However, I would say it's different than with nuclear warheads. Robert Gates was on theCUBE, and when I asked him about offense versus defense, he said the only thing about the United States is we have a lot to lose. So we have to be careful (laughter) about how aggressive we can be. >> Garry: No, exactly. That is just, it's, yes. It's a great area of uncertainty: what can you lose if you show strength? But I can tell you exactly how you are going to lose everything, if you are not-- >> Dave: Vigilant. >> Garry: If you are not vigilant. If you have no deterrent. If you are not sending the right signal to the Putins of this world that aggression against America will have a price that you cannot bear. >> So John, pretty unequivocal comments from Garry Kasparov. A lot of people don't believe that you can actually manipulate social media that way. You've been in social for a long time, since the beginning days. Maybe you could explain: how would a country, or state-sponsored terrorism, go about manipulating individuals? >> You know, Dave, I've been involved in internet infrastructure from the beginning days of Web 1.0 and through search engines. I'm a student of the data. I've seen the data; I've seen the data we have from our media company; I've seen the data on Facebook. And here's the deal: there are bad actors doin' fake news, controlling everything, creating bad outcomes. It's important for everyone to understand that there's an opposite end of the spectrum, the exact opposite of the bad: there's a good version. So what we can learn from this is that there's a positive element here, if we can believe it, which is actually a way to make it work for good. And that is trust, high-quality data, reputation, and context. That is a very hard problem. Facebook is tryin' to solve it.
You know, we're workin' on solving that. But here's the anatomy of the hack: if you control the narrative, you can control the meme. If you can control the meme, you can control the idea. If you can control the idea, you can control the belief system. And if you can control the belief system, you can control the population. That is exactly what happened with the election. That is what's happening now in social networks, and that's why so many people are turning away from social networks. This is hackable; you can actually hack people's brains and outcomes, because by controlling the narrative, the meme, the idea, the belief system, you can impact the population. That has absolutely been done. >> Without firin' a shot. >> Without firing a shot. These are the new cold social-network wars that are goin' on. And again, that has been identified, but there's an opposite effect. The opposite effect is having a trust system, a shortcut to trust; there will be a Google of this in our future: what Google did for search engines, something will do for social networks. Whoever can nail trust, reputation, and context, what is real and what is not, will ultimately have all the users goin' to their doorstep. This is the opportunity for news organizations and for platforms, and it's all going to be driven by new infrastructure and new software. This is something we can learn from. But there is a way to hack; it's been done. I've just laid it out. That's what's happening. >> Will blockchain solve, or play a role in solving, this problem of reputation, in your opinion? >> Well, you know that I believe centralized is bad, 'cause you can hack a centralized database and its data. Ownership is huge. I personally believe that blockchain and this notion of decentralized data ownership will ultimately give power back to the people, and that decentralized applications and cryptocurrency lead a path there. It's not yet proven; there's no clear visibility yet. But many believe that the wallet is the new browser, and that cryptocurrency can put the power with the people, so that new data can emerge: to vet a person who says they're something they're not, or news that says it's something it's not. This is trust. This is something that is not yet available. That's what I'm sayin'. You can't get it with Google, you can't get it with Facebook, you can't get it on these platforms. So the world has to change at an infrastructure level. That's the opportunity for blockchain, aside from all the questions like who's going to provide the power for the miners, and a variety of other technical issues. But conceptually there is a path there. That's a new democracy. This is a global phenomenon. It's a societal change. This is so cutting edge, but it's very promising at the same time. >> This is super important, because I can't tell you how many times you've received an email from one political persuasion or the other that lays out emphatically that this individual did that or... and you do some research and find out it's fake news. It happens all the time. >> There's no context in these platforms. Facebook optimizes its data for advertising, and you're going to see data optimized for user control, community control, community curation. More objective, not subjective, data. This is the new algorithm; this is where machine learning and AI will make a difference. This is the new trust equation that will emerge. This is a phenomenal opportunity for entrepreneurs.
If you're in the media business and you're not thinking about this, you will be out of business. That's our opinion. >> Excellent, John. Well, thanks for your thoughts and for sharing with us how these hacks are done. This is real. The midterm elections, and 2020, are really going to be won or lost on social media. Appreciate that. >> And Facebook's fumbling, and they're going to try to do good. We'll see what they do. >> Alright. >> Alright. >> That's a wrap. Good job. >> Thanks for watching.

Published Date : Mar 9 2018


Jaspreet Singh, Druva & Jake Burns, Live Nation | Big Data SV 2018


 

>> Narrator: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back, everyone, we're here live at San Jose for Big Data SV, Big Data Silicon Valley. I'm John Furrier, co-host of theCUBE. We're here with two great guests: Jaspreet Singh, founder and CEO of Druva, and Jake Burns, VP of Cloud Services at Live Nation Entertainment. Welcome to theCUBE. So what's going on with Cloud? Apps are out there, backup, recovery, what's going on? >> So, we went all in with AWS; in late 2015 and through 2016 we moved all of our corporate infrastructure into AWS, and I think we're a little bit unique in that situation. So, in terms of our posture, we're 100% Cloud. >> John: Jaspreet, what's going on with you guys in the Cloud? Because we've talked about this before: with a lot of the apps in the cloud, backup is really important. What's the key thing you guys are doing together with Live Nation? >> Sure. I think the notion of data is now pretty much everywhere. Data used to be captured and controlled in the data center; now it's getting decentralized into apps and ecosystems, software and services deployed either at the edge or in the Cloud. As the data gets more and more decentralized, the notion of data management, be it backup, be it discovery, anything, has to get more and more centralized. And we strongly believe the epicenter of this whole data management has to move to the Cloud. So, Druva is a SaaS-based provider of data management, and we work with Live Nation to protect the apps not just in the data center but also at the edge and in the Cloud: the applications deployed in the Cloud, be it Live Nation or Ticketmaster. >> And what are some of the workloads you guys are backing up? That's with Druva. >> Yeah, it's pretty much all corporate IT applications. You know, typical things you'd find in any IT shop, really. So, you know, we have our financial systems, we have some of our smaller ticketing systems, and, you know, corporate websites, things of that nature. We have something like 120 applications running, and it's really kind of one of everything. >> We were talking before we came on camera about the history of computing, and the Cloud has obviously changed the game. How would you compare the Cloud as a trend relative to operationalizing the role of data? And obviously GDPR and ransomware; these are things that, now that the perimeter is gone, are worries. So how do you guys look at the Cloud? Jake, I'll start with you. Can you compare and contrast where we have come from and where we are going, the role of the Cloud: significant, primary, expanding? And how would you talk to someone who says, hey, I'm still in the data center world, what's going on with Cloud? >> Well, yeah, it's significant and it's expanding, both. And, you know, it's really transforming the way we do business. Just from a high level, things like shortening the time to market for applications: going from three to six months just to get a proof of concept started, to today, you know, in the Cloud, being able to innovate by trying things. We try 20 different things, decide what works and what doesn't work, and at very low cost. So it allows us to really do things that just weren't possible before. Also, we move more quickly because, you know, we're not afraid of making mistakes.
If we provision infrastructure and we don't get it right the first time, we just change it. You know, that's something that we would just never be able to do previously in the data center. So, to answer your question, everything is different. >> And the as-a-service model's been kind of key. Is the consumption on your end different, I mean radically different? Like, give an example of how much time is saved compared to the traditional approaches. >> Oh, for sure. You know, the role of IT has completely changed, because instead of worrying about nuts and bolts and servers and storage arrays and data centers, we can really focus on the things that are important to the business, the things delivering results for the business. So, bringing value, bringing applications online, and trying things that are going to help us do business, rather than focusing on all the minutiae. All that stuff's now been outsourced to Cloud providers. So, really, we have a similar head count and staff, but we're focused on things that bring value rather than things that are just kind of frivolous. >> Jaspreet, you guys have been a very successful startup, growing rapidly. The Cloud's been a good friend; the trend is your friend with the Cloud. What's different operationally that you guys are tapping into? What's that tailwind for Druva that's making you guys successful? Is it the ease of use? Is it the ease of consumption? Is it the tech? What's the secret to success with Druva? >> Sure. So, we believe cloud is a very big business-transformation trend more than a technology trend. It's how you consume a service with a fixed SLA, a fixed service agreement, across the globe. So, it's ease of consumption, it's simplicity of use, it's orchestration, it's cost control, all those things. So, our promise to our customers is that the complexity of data management, backups, archives, data protection, which is a risk-mitigation project, you know, can be completely abstracted by a simple service. For example, you know, Live Nation consumes Druva's service through Amazon Marketplace. So, think about consuming a critical service like data management through the simplicity of a marketplace, pay-as-you-go, as you consume the service, across the globe: in the US, in Australia, in Europe. And it also helps vendors like us innovate better, because we have a controlled environment to understand how different customers are using the service, and we're able to orchestrate a better security posture, better threat prevention, better cost control, DevOps. So, it improves the posture of the service being offered and helps the customer consume it. >> You both are industry veterans by today's standards, unless you're, like, 24, doing some of the cryptocurrency stuff and don't carry the old IT baggage. How would you guys view the multi-Cloud conversation? Because we hear that all the time. Multi-Cloud has come up so many times. What does it mean? Jake, what does multi-Cloud actually mean? Is it the same workload across multiple Clouds? Is it the fact that there are multiple Clouds? Certainly there will be multiple Clouds. So help us digest what that even means these days. >> Yeah, that's a great question, and it's a really interesting topic. Multi-Cloud is one of those things where, you know, there are so many benefits to using more than one Cloud provider, but there are also a lot of pitfalls.
So, people really underestimate the difference in the technology, and the complexity of managing the technology, when you change Cloud providers. I'm talking primarily about infrastructure-as-a-service providers like Amazon Web Services. So, you know, I think there are a lot of good reasons to be multi-Cloud: to get the best features out of different providers, to not have, you know, the risk of having all your data in one place with one vendor. But it needs to be done in such a way that you don't take that hit in overhead and complexity, and, you know, I think that's kind of a prohibitive barrier for most enterprises. >> And what are the big pitfalls that you see? Is it mainly underestimating the stack complexity between them, or is it more just operational questions? I mean, what are the pitfalls that you've observed? >> Yeah, so, moving from a typical IT data-center environment to a public Cloud provider like AWS, you're essentially asking all your technical staff to start speaking a new language. Now, if you were to introduce a second Cloud provider to that environment, you're asking them to learn a third language as well. And that's a lot to ask. So, you really have two scenarios where you can make that work today without using a third party. One is to ask all of your staff to know both, and that's just not feasible. The other is to have two tech teams, one for each Cloud platform, and that's really not something businesses want to do. So, I think the real answer is to rely on a third party that can come in and abstract one of those Cloud complexities, well, one of those Cloud providers, out, so you don't have to directly manage it. And in that way, you can get the benefit of being multi-Cloud, that data protection of being multi-Cloud, without having to introduce that complexity to your environment. >> To provide some abstraction layer. Some sort of software approach. >> Yeah. For example, if you have your primary systems in AWS and you use software like Druva Phoenix to back up your data and put that data into a second Cloud provider, you don't have to have an account with that second Cloud provider, and you don't have the complexity associated with it, which I think is very... >> And that's where you're looking for differentiation. We look at vendors and say, hey, don't make me work harder. >> Right. >> And add new staff. Solve the problem. >> Yeah, it's all about solving problems, right? And that's why we're doing this. >> So, Druva, talk about this thing, because we talked about it earlier. To me, it could be, oh, we're on Azure; well, they have Office 365, so of course they're going to have Microsoft. A lot of people have a lot going on in AWS. So, maybe we're not there yet, at the world where you can actually provision the same workload across Clouds; it would be nice to have that someday, if it were seamless. But at the end of the day, an enterprise might have Office 365 and some Azure, but mostly Amazon over here, where I'm doing a lot of development and DevOps, and I'm on-prem. How do you talk to that? Because you've got to back up Office 365, you've got to do the on-prem thing, you've got to do the Amazon thing. How do you guys solve that problem? What's the conversation? >> Absolutely. I think over time we believe best of breed will win. So, people will deploy different types of cloud for different workloads, be it hosted IaaS or platforms like PaaS.
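
Jake's idea of "abstracting one of those Cloud providers out" can be pictured as a thin storage interface with provider-specific implementations behind it. Here is a conceptual Python sketch under hypothetical names; it is not how Druva Phoenix is actually built:

    # Conceptual multi-cloud abstraction: callers target one
    # BackupTarget interface; provider-specific classes hide each SDK.
    from abc import ABC, abstractmethod

    class BackupTarget(ABC):
        @abstractmethod
        def put(self, key: str, data: bytes) -> None: ...

        @abstractmethod
        def get(self, key: str) -> bytes: ...

    class S3Target(BackupTarget):
        def __init__(self, bucket):
            import boto3  # AWS SDK
            self.bucket, self.s3 = bucket, boto3.client("s3")

        def put(self, key, data):
            self.s3.put_object(Bucket=self.bucket, Key=key, Body=data)

        def get(self, key):
            return self.s3.get_object(Bucket=self.bucket,
                                      Key=key)["Body"].read()

    class AzureBlobTarget(BackupTarget):
        def __init__(self, container, conn_str):
            from azure.storage.blob import BlobServiceClient  # Azure SDK
            svc = BlobServiceClient.from_connection_string(conn_str)
            self.client = svc.get_container_client(container)

        def put(self, key, data):
            self.client.upload_blob(name=key, data=data, overwrite=True)

        def get(self, key):
            return self.client.download_blob(key).readall()

    def backup(target: BackupTarget, key: str, payload: bytes) -> None:
        """Application code never touches a cloud SDK directly."""
        target.put(key, payload)
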
When enterprises deploy multiple types of services and software like that, I think it's hard to control where the data will go. What we can orchestrate, or anybody can orchestrate, is centralizing the data management part of it. So, Druva has the best posture, the best coverage, across multiple heterogeneous Clouds, you know: services like Office 365, Box, or Salesforce; PaaS platforms like S3 or DynamoDB, through our product called Apollo; or hosted platforms like what Live Nation is using, through our Phoenix product line. So getting that breadth of coverage and consistency of policies on a single platform is what will let enterprises adopt what's best out there without worrying about how to build the abstraction for data management. >> Jake, what's the biggest thing you see for people who are moving to the Cloud for the first time? What are they struggling with? Is it the idea that there's no perimeter? Is it staff training? I mean, what are some of the critical things they should think about as they move from test/dev and start to put production in the Cloud? >> Yeah, there are so many of them. But first, really, it's just getting buy-in, you know, from your technical staff, because in an enterprise environment, you bring in a Cloud provider and it's very easily framed as if we're just being outsourced, right? So, I think getting past that barrier first, and really getting through to folks and letting them know that this is good for you, this is not bad for you. You're going to be learning a new skill, a very valuable skill, and you're going to be more effective at your job. So, I think that's the first thing. After that, once you start moving to the Cloud, the thing that becomes apparent very quickly is cost control. The thing with public Cloud is, before, you had this really kind of narrow range of what IT could cost with the traditional data center; now we have this huge range. And yes, it can be cheaper than it was before, but it can also be far more expensive than it was before. >> So, service sprawl, or just not paying attention? Both? >> Well, you're essentially giving your engineers a blank check. So, you need to have some governance, and, you know, you really need to think about things you didn't have to think about before. You're paying for consumption, so you really have to watch your consumption. >> So, take me through the mental model of deduplication in the Cloud, because I'm trying to visualize it, grok it a little bit. Okay, so, the Cloud is out there, data's everywhere. Do I move the compute to the data? How do the backup and recovery and data management work? And does dedup change with Cloud? Because some people think, I've got my dedup already and I'm on-premises, I've been doing these old solutions. How does dedup specifically change in the Cloud, or does it? >> I know scale changes. You're looking at, you know, the best dedup systems: if you look historically, they were 100-terabyte, 200-terabyte dedup indexes, Data Domain. The scale changes; customers expect massive scale in the Cloud. Our largest customer has 10 petabytes in a single dedup index. That's a 100x scale difference compared to what traditional systems could do. Number two, you can create a quality of service that is not really bound by a fixed, you know, algorithm like variable-length or whatever. So, you can optimize dedup very precisely for the right workload.
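
The dedup index Jaspreet goes on to describe reduces to a simple mechanism: chunk the data, hash each chunk, store each unique chunk once, and keep a per-object recipe of hashes. Here is a toy fixed-size-chunk sketch in Python; real systems use variable-length, content-defined chunking, and the chunk size stands in for the per-workload tuning he alludes to:

    # Toy deduplicating store: fixed-size chunks, content-addressed by
    # SHA-256, each unique chunk stored once. Illustrative only.
    import hashlib

    class DedupStore:
        def __init__(self, chunk_size=4096):
            self.chunk_size = chunk_size
            self.chunks = {}  # digest -> chunk bytes (the dedup index)

        def write(self, data):
            """Store data; return its recipe (ordered chunk digests)."""
            recipe = []
            for i in range(0, len(data), self.chunk_size):
                chunk = data[i:i + self.chunk_size]
                digest = hashlib.sha256(chunk).hexdigest()
                self.chunks.setdefault(digest, chunk)  # duplicates are free
                recipe.append(digest)
            return recipe

        def read(self, recipe):
            return b"".join(self.chunks[d] for d in recipe)

    store = DedupStore()
    r1 = store.write(b"A" * 4096 + b"B" * 4096)  # two unique chunks stored
    r2 = store.write(b"B" * 4096 + b"A" * 4096)  # same chunks, zero new data
    assert store.read(r1) == b"A" * 4096 + b"B" * 4096
    print(len(store.chunks))  # 2
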
The right dedupe for the right workload. So, you may dedupe Office 365 differently than your VMware instances, your Oracle databases, or your endpoint workloads. So the as-a-service business model helps you create a custom, tailored solution for the right data and brings the scale. You don't have the complexity of scale, but you get the benefit of scale, all while, you know, simply managing the cloud. >> Jake, what's it like working with Druva? What's the benefit that they bring to you guys? >> Yeah, so, specifically around backups for our enterprise systems, you know, that's a difficult challenge to solve natively in the Cloud, especially if you're going to be limited to using Cloud-native tools. So it's really a perfect use case for a third-party provider. You know, people don't think about this much, but in the old days, in the data center, our backups went offsite into a vault. They were on tapes. It was very difficult for us to lose those, or for them to be erased, accidentally or even intentionally. Once you go into the Cloud, especially if you're all in with the Cloud like we are, everything is easier. And so accidents are easier also. You know, deleting your data is easier. So what we really want, and what a lot of enterprises want... >> And security, too, is a potential concern. >> Absolutely, yeah. And so, what we want is to get some of that benefit back that we had from the inefficiency we had beforehand. We love all the benefits of the Cloud, but we want to have our data protected also. So, this is a great role for a company like Druva to come in and offer a product like Phoenix and say, you know, we're going to handle your backups for you, essentially. We're going to put it in a safe place. We're going to secure it for you, and we're going to make sure it stays secure for you. And doing it software-as-a-service, like Druva does with Phoenix, I think is the absolute right way to go. It's exactly what you need. >> Well, congratulations Jake Burns, Vice President of Cloud Services >> Thank you. >> at Live Nation Entertainment. Jaspreet Singh, CEO of Druva, great to have you on. Congratulations on your success. >> Thank you. >> Inside the tornado called Cloud computing. A lot more stuff coming. More CUBE coverage coming up after this short break. Be right back. (electronic music)
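The dedupe index discussed in this segment can be sketched in a few lines. The Python below is a minimal illustration assuming fixed-size chunks; production systems like the ones mentioned use variable-length segmentation and distributed, petabyte-scale indexes, and all names here are invented.

    import hashlib

    CHUNK_SIZE = 4096  # fixed-size chunks; real systems vary chunk boundaries

    class DedupStore:
        """Content-addressed store: an identical chunk is kept only once."""

        def __init__(self):
            self.index = {}  # chunk fingerprint -> chunk bytes

        def write(self, data: bytes) -> list:
            """Store data; return the recipe (fingerprint list) that rebuilds it."""
            recipe = []
            for i in range(0, len(data), CHUNK_SIZE):
                chunk = data[i:i + CHUNK_SIZE]
                fp = hashlib.sha256(chunk).hexdigest()
                self.index.setdefault(fp, chunk)  # only new chunks consume space
                recipe.append(fp)
            return recipe

        def read(self, recipe: list) -> bytes:
            """Reassemble the original data from its recipe."""
            return b"".join(self.index[fp] for fp in recipe)

    store = DedupStore()
    first = store.write(b"same data " * 1000)
    second = store.write(b"same data " * 1000)  # adds zero new chunks
    assert store.read(first) == store.read(second)

Tuning the chunking and fingerprinting per workload, Office 365 versus VMware versus databases, is the "right dedupe for the right workload" point made above.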

Published Date : Mar 9 2018


Peter Burris Big Data Research Presentation


 

(upbeat music) >> Announcer: Live from San Jose, it's theCUBE, presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> What am I going to spend the next 15, 20 minutes or so talking about? I'm going to answer three things. Our research has gone deep into where are we now in the big data community. I'm sorry, where is the big data community going, number one. Number two is how are we going to get there, and number three, what do the numbers say about where we are? So those are the three things. Now, since we want to get out of here on time, I'm going to fly through some of these slides, but again, there's a lot of opportunity for additional conversation, because we're all about having conversations with the community. So let's start here. The first thing to know, when we think about where this is all going, is that it's inextricably bound up with digital transformation. Well, what is digital transformation? We've done a lot of research on this. It was Peter Drucker who famously said, many years ago, that the purpose of a business is to create and keep a customer. That's what a business is. Now, what's the difference between a business and a digital business? What's the difference between Sears Roebuck and Amazon? It's data. A digital business uses data as an asset to create and keep customers. It infuses data in operations differently to create more automation. It infuses data in engagement differently to catalyze superior customer experiences. It reformats and restructures its concept of value proposition and product to move from a product to a services orientation. The role of data is the centerpiece of digital business transformation, and in many respects that is where we're going: an understanding and appreciation of that. Now, we think there are going to be a number of strategic capabilities that will have to be built out to make that possible. First off, we have to start thinking about what it means to put data to work. The whole notion of an asset is that an asset is something that can be applied to a productive activity. Data can be applied to a productive activity. Now, there are a lot of very interesting implications that we won't get into now, but essentially, if we're going to treat data as an asset and think about how we could put more data to work, we're going to focus on three core strategic capabilities to make that possible. One, we need to build a capability for collecting and capturing data. That's a lot of what IoT is about. It's a lot of what mobile computing is about. There are going to be a lot of implications around how to ethically and properly do some of those things, but a lot of that investment is about finding better and superior ways to capture data. Two, once we are able to capture that data, we have to turn it into value. That, in many respects, is the essence of big data: how we turn data into data assets, in the form of models, in the form of insights, in the form of any number of other approaches to thinking about how we're going to appropriate value out of data. But it's not enough to create value out of it and have it sit there as potential value. We have to turn it into kinetic value, to actually do the work with it, and that is the last piece. We have to build new capabilities for how we're going to apply data to perform work better, to act based on data.
Now, we've got a concept we're researching that we call systems of agency, which is the idea that there are going to be a lot of new approaches, new systems with a lot of intelligence and a lot of data, that act on behalf of the brand. I'm not going to spend a lot of time going into this, but remember that word, because I will come back to it. Systems of agency are about how you're going to apply data to perform work, with automation, augmentation, and actuation on behalf of your brand. Now, all of this is going to happen against the backdrop of cloud optimization. I'll explain what we mean by that right now. Very importantly, increasingly how you create value out of data, how you create future options on the value of your data, is going to drive your technology choices. For the first 10 years of the cloud, the presumption was that all data was going to go to the cloud. We think a better way of thinking about it is: how is the cloud experience going to come to the data? We've done a lot of research on the cost of data movement, both in terms of the actual out-of-pocket costs and the potential uncertainty, the transaction costs, etc., associated with data movement. And that's going to be one of the fundamental elements of how we think about the future of big data and how digital business works: how we think about data movement. I'll come to that in a bit. But our proposition is that, increasingly, we're going to see architectural approaches that focus on how we're going to move the cloud experience to the data. We've got this notion of true private cloud, which is effectively the idea of the cloud experience on or near premises. That doesn't diminish the role that the cloud's going to play in the industry, or say that Amazon and AWS and Microsoft Azure and all the other options are not important. They're crucially important, but it means we have to start thinking architecturally about how we're going to create value out of data, and recognize that we have to start envisioning how our organization and infrastructure are going to be set up so that we can use data where it needs to be, or where it's most valuable, and often that's close to the action. So if we think about that very quickly, because it's a backdrop for everything, increasingly we're going to start talking about the idea of: where's the workload going to go? Where's the workload going to be, against this kind of backdrop of the divorce of workload from infrastructure? We believe, and our research pretty strongly shows, that a lot of workloads are going to go to true private cloud, but a lot of big data is moving into the cloud. This is a prediction we made a few years ago, and it's clearly happening and underway, and we'll get into what some of the implications are. So again, when we say that a lot of the big data elements, a lot of the process of creating value out of data, is going to move into the cloud, that doesn't mean that all the systems of agency that build on or rely on that data, the inference engines, etc., are also in a public cloud. A lot of them are going to be distributed out to the edge, out to where the action needs to be, because of latency and other types of issues. This is a fundamental proposition, and I know I'm going fast, but hopefully I'm being clear. All right, so let's now get to the second part. This is kind of where the industry's going. Data is an asset.
Invest in strategic business capabilities to create those data assets, appreciate the value of those assets, and utilize the cloud intelligently to generate and ensure increasing returns. So the next question is, well, how will we get there? Now, right now, not too far from here... Neil Raden, for example, was on the show floor yesterday. Neil made the observation that, as he wandered around, he only heard the word big data two or three times. The concept of big data is not dead. Whether the term is or is not is somebody else's decision. Our perspective, very simply, is that the notion is bifurcating. And it's bifurcating because we see different strategic imperatives happening at two different levels. On the one hand, we see infrastructure convergence: the idea that increasingly we have to think about how we're going to bring together and federate data, both from a systems and a data management standpoint. And on the other hand, we're going to see application specialization. That's going to have an enormous implication over the next few years, if only because there just aren't enough people in the world who understand how to create value out of data. And there's going to be a lot of effort made over the next few years to find new ways to go from that one expertise group to billions of people, billions of devices, and those are the two dominant considerations in the industry right now. How can we converge data physically and logically, and on the other hand, how can we liberate more of the smarts associated with this very, very powerful approach, so that more people get access to the capacities, the capabilities, and the assets that are being generated by that process? Now, we've done at Wikibon, I don't know, 18, 20, 23 predictions overall on the changes being wrought by digital business. Here I'm going to focus on four of them that are central to our big data research. We have many more, but I'm just going to focus on four. The first one: when we think about infrastructure convergence, we worry about hardware. Here's a prediction about what we think is going to happen with hardware, and our observation, which we believe pretty strongly, is that future systems are going to be built on the concept of how you increase the value of data assets. The technologies are all in place. Simpler parts that more successfully bind compute, storage, and network are going to play together. Why? Because increasingly that's the fundamental constraint: how do I make data available to other machines, actors, sources of change, sources of process within the business? Now, we are watching before our very eyes new technologies that allow us to take these simple piece parts and weave them together in very powerful fabrics or grids, what we call UniGrid, so that there is almost no latency between data that exists within one of these, call it a molecule, and anywhere else in that grid or lattice. Now again, these are not systems that are five years away. All the piece parts are here today, and there are companies that are actually delivering them. So if you take a look at what Micron has done with Mellanox and other players, that's an example of one of these true private cloud oriented machines in place. The bottom line, though, is that there is a lot of room left in hardware. A lot of room.
This is what cloud suppliers are building and are going to build, but increasingly, as we think about true private cloud, enterprises are going to look at this as well. So: future systems for improving data assets. The capacity of this type of system, with low latency amongst any source of data, means that we can now think about data not as a set of sources and sinks, each individually having some control over its own data, woven together by middleware and applications, but literally as networks of data. As we start to think about distributing data, and distributing the control and authority associated with that data, more broadly across systems, we now have to think about what it means to create networks of data. Because that, in many respects, is how these assets are going to be forged. I haven't even mentioned the role that security is going to play in all of this, by the way, but fundamentally that's how it's likely to play out. We'll have a lot of different sources, but from a business standpoint, we're going to think about how those sources come together into a persistent network that can be acted upon by the business. One of the primary drivers of this is what's going on at the edge. Marc Andreessen famously said that software is eating the world. Well, our observation is: great, but if software's eating the world, it's eating it at the edge. That's where it's happening. Secondly, there's this notion of agency zones. I said I'd bring that word up again. How systems act on behalf of a brand, or on behalf of an institution or business, is very, very crucial, because the time necessary to do the analysis, perform the intelligence, and then take action is a real constraint on how we do things. And our expectation is that we're going to see what we call an agency zone, or a hub zone, or a cloud zone, defined by latency, and by how we architect data to get the data that's necessary to perform that piece of work into the zone where it's required. Now, the implication of this is that none of this is going to happen if we don't use AI and related technologies to increasingly automate how we handle infrastructure. And technologies like blockchain have the potential to provide an interesting way of imagining how these networks of data actually get structured. It's not going to solve everything. There are some people who think the blockchain is kind of everything that's necessary, but it will be a way of describing a network of data. So we see those technologies in the ascension. But what does it mean for DBMS? In the old world, the old way of thinking, the database manager was the control point for data. In the new world, these networks of data are going to exist beyond a single DBMS, and in fact, over time, that concept of federated data actually has the potential to become real. When we have these networks of data, we're going to need people to act upon them, and that's essentially a lot of what the data scientist is going to be doing: identifying the outcome, identifying the data that's required, and weaving that data through the construction, management, and manipulation of pipelines, to ensure that the data as an asset can persist for the purposes of solving a near-term problem, or over whatever duration is required to solve a longer-term problem.
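The "agency zone defined by latency" idea above lends itself to a small worked sketch. The Python below is purely illustrative, with invented zone latencies and task budgets, showing the placement rule the talk implies: work runs in the most centralized zone whose round trip still fits inside the action's time budget, and everything tighter than that lands at the edge.

    from dataclasses import dataclass

    @dataclass
    class Task:
        name: str
        response_budget_ms: float  # how quickly the system must act on the data

    # Illustrative round-trip latencies from the point of action (invented
    # numbers), ordered from most to least centralized.
    ZONES = [
        ("public cloud", 80.0),
        ("true private cloud", 20.0),
        ("edge", 2.0),
    ]

    def place(task: Task) -> str:
        """Choose the most centralized zone whose round trip fits the budget."""
        for zone, rtt_ms in ZONES:
            if rtt_ms <= task.response_budget_ms:
                return zone
        return "edge"  # budgets tighter than any round trip must run at the edge

    print(place(Task("brake actuation", 5.0)))    # -> edge
    print(place(Task("fleet dashboard", 100.0)))  # -> public cloud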
Data scientists remain very important, but we're going to see, as a consequence of improvements in tooling capable of doing these things, an increasing recognition that there's a difference between one data scientist and another. There are going to be a lot of folks who participate in the process of manipulating, maintaining, and managing these networks of data to create these business outcomes, but we're going to see specialization in those ranks as the tooling is more targeted to specific types of activities. So the data scientist will remain an important job, though it's going to lose a little bit of its luster as it becomes clear what it means. Some data scientists will probably become more, let's call them data network administrators, or networks-of-data administrators. And very importantly, as I said earlier, there are just not enough of these people on the planet, and so increasingly, when we think again about digital business and the idea of creating data assets, a central challenge is going to be how to turn all the data that can be captured into assets that can be applied to a lot of different uses. There are two fundamental changes on the horizon to the way we are currently conceiving of the big data world. One is, well, it's pretty clear that Hadoop can only go so far. Hadoop is a great tool for certain types of activities and certain numbers of individuals. So Hadoop solves problems for an important but relatively limited subset of the world. Some of the new data science platforms that I just talked about are going to help, with a degree of specialization that hasn't been available before in the data world, but they also will only take it so far. The real way that we see the work the big data community is performing turned into sources of value that extend into virtually every single corner of humankind is going to be through these cloud services that are being built, and increasingly through packaged applications. A lot of computer science still exists between what I just said and when this actually happens. But in many respects, that's the challenge of the vendor ecosystem: how to reconstruct the idea of packaged software, which has historically been built around operations and transaction processing, with a known data model, a known process, and some technology challenges. How do we reapply that to a world where we don't know exactly what the process is, because the data tells us in the moment what actions are going to take place? It's a very different way of thinking about application development, a very different way of thinking about what's important in IT, and a very different way of thinking about how business is going to be constructed and how strategy is going to be established. Packaged applications are going to be crucially important. So, in the last few minutes here, what are the numbers? So this is kind of the basis for our analysis: digital business, the role of data as an asset, having an enormous impact on how we think about hardware, how we think about database management or data management, how we think about the people involved in this, and ultimately how we think about how we're going to deliver all this value out to the world. And the numbers are starting to reflect that. So keep four numbers in mind as I go through the next two or three slides.
A hundred and three billion, 68%, 11%, and 2017. So of all the numbers that you will see, those are four of the most important. So let's start by looking at the total marketplace. This is the growth of the hardware, software, and services pieces of the big data universe. Now, we have a fair amount of additional research that breaks all these down into tighter segments, especially on the software side. But the key number here is that we're talking about big numbers: 103 billion dollars over the course of the next 10 years. And let's be clear, that 103 billion dollars actually has a dramatic amplification effect on the rest of the computing industry, because a lot of the pricing models, especially in software, are tied back to open source, which has its own issues. And very importantly, the services business is going to go through an enormous amount of change over the next five years as service companies better understand how to deliver some of these big-data-rich applications. The second point to note here is that it was in 2017 that the software market surpassed the hardware market in big data. Again, for the first number of years we focused on buying the hardware and the system software associated with that, and the software value became something that we hoped to discover. So I was having a conversation here in theCUBE with the CEO of Transwarp, which is a very interesting Chinese big data company, and I asked, what's the difference between how you do things in China and how we do things in the US? He said, well, in the US you guys focus on proof of concept. You spend an enormous amount of time asking: does the hardware work? Does the database software work? Does the data management software work? In China, we focus on the outcome. That's what we focus on. Here you have to placate the IT organization to make sure that everybody in IT is comfortable with what's about to happen. In China, we're focused on the business people. This is the first year that software is bigger than hardware, and it's only going to get bigger over time. It doesn't mean, again, that hardware is dead or unimportant. It's going to remain very important, but it does mean that the locus of the industry is moving. Now, when we think about what the market shares look like, it's a very fragmented market. 68% of the market is still other. This is a highly immature market that's going to go through a number of changes over the next few years, partly catalyzed by that notion of infrastructure convergence. So in four years, our expectation is that that 68% is going to start going down pretty fast as we see greater consolidation. Now, IBM is the biggest player, on the basis of the fact that they operate in all these different segments. They operate in the hardware, software, and services segments, but especially because they're very strong within the services business. The last one I want to point your attention to is this one. I mentioned earlier that our expectation is that the market increasingly is going to move to a packaged application orientation, or packaged services orientation, as a way of delivering expertise about big data to customers. Splunk is the leading software player right now. Why? Because that's the perspective they've taken.
Now, perhaps it's for a limited subset of individuals or markets or sectors, but Splunk takes a packaged application, weaves these technologies together, and applies them to an outcome. And we think this presages more of that kind of activity over the course of the next few years. Oracle: kind of a different approach, and we'll see how that plays out over the course of the next five years as well. Okay, so that's where the numbers are. Again, there are a lot more numbers and a lot of people you can talk to. Let me give you some action items. First one: if data were a core asset, how would IT, how would your business, be different? Stop and think about that. If it wasn't your buildings that were the asset, it wasn't the machines that were the asset, it wasn't your people by themselves who were the asset, but data was the asset, how would you reinstitutionalize work? That's what every business is starting to ask, even if they don't ask it in the same way. And our advice is: then do it, because that's the future of business. Not that data is the only asset, but data is a recognized central asset, and that's going to have enormous impacts on a lot of things. The second point I want to leave you with: tens of billions of users, and I'm including people and devices, are dependent on thousands of data scientists. That's an impedance mismatch that cannot be sustained. Packaged apps and these cloud services are going to be the way to bridge that gap. I'd love to tell you that it's all going to be about tools, that we're going to have hundreds of thousands or millions or tens of millions of data scientists suddenly emerge out of the woodwork. It's not going to happen. The third thing is we think that big businesses, enterprises, have to master what we call the big tech inflection. The first 50 years were about known process and unknown technology. How do I take an accounting package, and do I put it on a mainframe, a minicomputer, client/server, or the web? Unknown technology. Well, increasingly today, all of us have a pretty good idea what the base technology is going to be. Does anybody doubt it's going to be the cloud? We've got a pretty good idea what the base technology is going to be. What we don't know is: what are the new problems that we can attack, that we can address with data-rich approaches, as we turn those systems into actors on behalf of our business and customers? So, I'm a couple minutes over, I apologize. I want to make sure everybody can get over to the keynotes if you want to. Feel free to stay; theCUBE's going to be live at 9:30, if I've got that right. So it's actually pretty exciting; if anybody wants to see how it works, feel free to stay. Georgia's here, Neil's here, I'm here. I mentioned Greg Terrio, Dave Vellante, John Greco; I think I saw Sam Kahane back in the corner. Any questions, come and ask us, we'll be more than happy. Thank you very much... oh, David Vellante. >> David: I have a question. >> Yes. >> David: Do you have time? >> Yep. >> David: So you talk about data as a core asset. If you look at the top five companies by market cap in the US, Google, Amazon, Facebook, etc., they're data companies; they've got data at the core, which is kind of what your first bullet here describes. How do you see traditional companies closing that gap, where humans, buildings, etc. are at the core, as we enter this machine intelligence era? What's your advice to the traditional companies on how they close that gap? >> All right.
So the question was: the most valuable companies in the world are companies that are well down the path of treating data as an asset. How does everybody else get going? Our observation is, you go back to: what's the value proposition? What actions are most important? What data is necessary to perform those actions? Can changing the way the data is orchestrated, organized, and put together change the cost of performing that work by changing transaction costs? Can you introduce a new service along the same lines? And then architect your infrastructure and your business to make sure that the data is near the action, in time for the action, so that it's absolute genius to your customer. So it's a relatively simple thought process. That's how Amazon thought. Apple increasingly thinks like that, where they design the experience and then think about what data is necessary to deliver that experience. It's a simple approach, but it works. Yes, sir. >> Audience Member: On the slide that you had a few slides ago, the market share, the big spenders, you asked the question, do any of us doubt that cloud is the future? I'm with Snowflake. I don't see many of those large vendors in the cloud, and I was wondering if you could speak to what you're seeing in terms of emerging vendors in that space. >> What a great question. So the question was: when you look at the companies that are catalyzing a lot of the change, you don't see a lot of the big companies in the leadership. And someone from Snowflake just asked, well, who's going to lead it? That's a big question that has a lot of implications, but at this point in time it's very clear that the big companies are suffering a bit from, trying to remember what to call it, the RCA syndrome. I think Clay Christensen talked about this: the innovator's dilemma. So RCA was one of the first creators; they created the transistor, and they held a lot of the original patents on it. They put that incredible new technology, back in the forties and fifties, under the control of the people who ran the vacuum tube business. When was the last time anybody bought RCA stock? The same problem exists today. Now, how is that going to play out? Are we going to see, as we've always seen, a lot of new vendors emerge out of this industry and grow into big vendors with IPO-related exits to try to scale their business? Or are we going to see a whole bunch of gobbling up? That's what I'm not clear on, but it's pretty clear at this point in time that a lot of the technology, a lot of the science, is being done in smaller places. The moderating feature of that is the services side, because there are limited groupings of expertise, and the companies that today are able to attract that expertise, the Googles, the Facebooks, the AWSs, the Amazons, are doing so in support of a particular service. IBM and others are trying to attract that talent so they can apply it to customer problems. We'll see over the next few years whether the IBMs and the Accentures and the big service providers are able to attract the kind of talent necessary to diffuse that knowledge into the industry faster. So it's the rate at which the idea of internet-scale computing, the idea of big data being applied to business problems, can diffuse into the marketplace through services. If it can diffuse faster, that will have an accelerating impact for smaller vendors, as it has in the past.
But it may also, again, have a moderating impact, because a lot of that expertise that comes out of IBM, IBM is going to find ways to drive into product faster than it ever has before. So it's a complicated answer, but that's our thinking at this point in time. >> Dave: Can I add to that? >> Yeah. (audience member speaking faintly) >> I think that's true now, but I think the real question, not to argue with Dave, but this is part of what we do... The real question is: how is that knowledge going to diffuse into the enterprise broadly? Because Airbnb, I doubt, is going to get into the business of providing services. (audience member speaking faintly) So I think that the whole concept of community, partnership, and ecosystem is going to remain very important, as it always has, and we'll see how fast those service companies that are dedicated to diffusing knowledge into customer problems actually move. Our expectation is that as the tooling gets better, we will see more people able to present themselves as truly capable of doing this, and that will accelerate the process. But the next few years are going to be really turbulent, and we'll see which way it actually ends up going. (audience member speaking faintly) >> Audience Member: So I'm with IBM, and I can tell you 100% for sure that we are. I hired literally 50 data scientists in the last three months to go out and do exactly what you're saying: sit down with clients and help them figure out how to do data science in the enterprise. And so we are in fact scaling it. We're getting people that have done this at Google, Facebook. Not a whole lot of those, 'cause we want to do it with people that have actually done it in legacy Fortune 500 companies, right? Because there's a little bit of a difference there. >> So. >> Audience Member: So we are doing exactly what you said, and Microsoft is doing the same thing, Amazon is actually doing the same thing too, Domino Data Lab. >> They don't like talking about it too much, but they're doing it. >> Audience Member: But all the big players in the data science platform game are doing this at a different scale. >> Exactly. >> Audience Member: IBM is doing it on a much bigger scale than anyone else. >> And that will have an impact on how the market ultimately gets structured and who the winners end up being. >> Audience Member: To add to that, a lot of people thought, you mentioned the Red Hat of big data, a lot of people thought Cloudera was going to be the Red Hat of big data, and look at what's happened to their business. (background noise drowns out other sounds) They're getting surrounded by the cloud. They're looking at, how can we get closer to companies like AWS? That was a wild card that wasn't expected. >> Yeah, but look, at the end of the day, Red Hat isn't even the Red Hat of open source. So the bottom line is, the thing to focus on is how this knowledge is going to diffuse. That's the thing to focus on. And there are a lot of different ways; some of it's going to diffuse through tools. If it diffuses through tools, it increases the likelihood that we'll have more people capable of doing this, and IBM and others can hire more, Citibank can hire more. That's an important participant, an important play. So you have something to say about that, but it also says we're going to see more of the packaged applications emerge, because that facilitates the diffusion.
This is not... we haven't figured out, nobody knows exactly, the exact shape it's going to take. But that's the centerpiece of our big data research: how is that diffusion process going to happen and accelerate, what's the resulting structure going to look like, and ultimately, how are enterprises going to create value with whatever results? Yes, sir. (audience member asks question faintly) So the recap of the question is: you see more people coming in and promising the moon but being incapable of delivering, partly because the technology is uncertain, and for other reasons. So here's our approach, or here's our observation. We actually did a fair amount of research on this. When you take an approach to doing big data that's optimized for the cost of procurement, i.e., let's get the simplest combination of infrastructure, the simplest combination of open-source software, the simplest contracting, to create that proof of concept, then you can stand things up very quickly if you have enough expertise, and you can create that proof of concept, but the process of turning that into an actual production system extends dramatically. And that's one of the reasons why the Clouderas did not take over the universe. There are other reasons. As George Gilbert's research has pointed out, Cloudera is spending 53, 55% of their money right now just integrating all the stuff that they bought into the distribution five years ago, which is a real great recipe for creating customer value. The bottom line, though, is that if we focus on the time to value in production, we end up taking a different path. We don't focus as much on whether the hardware is going to work, whether the network is going to work, whether the storage can be integrated, how it's going to impact the database, what that's going to mean to our Oracle license pool, and all the other things that people tend to think about if they're focused on the technology. And so, as a consequence, you get better time to value if you focus on bringing the domain expertise, working with the right partner, and working with the appropriate approach: what's the value proposition, what actions are associated with that value proposition, what data is needed to perform those actions, how can I take transaction costs out of performing those actions, where does the data need to be, and what infrastructure do I require? So we have to focus on time to value, not time to procure. And that's not what a lot of professional IT-oriented people are doing, because many of them, I hate to say it, still acquire new technology with the promise of helping the business but with a stronger focus on what it's going to mean to their careers. All right, I want to be really respectful of everybody's time. The keynotes start in about five minutes, which means you've just got time. If you want to stay, feel free to stay. We'll be here, and we'll be happy to talk, but I think that's pretty much going to close our presentation broadcast. Thank you very much for being an attentive audience, and I hope you found this useful. (upbeat music)

Published Date : Mar 9 2018


Jacques Nadeau, Dremio | Big Data SV 2018


 

>> Announcer: Live from San Jose, it's theCUBE, presenting Big Data Silicon Valley. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to Big Data SV in San Jose. This is theCUBE, the leader in live tech coverage. My name is Dave Vellante, and this is day two of our wall-to-wall coverage. We've been here most of the week, and had a great event last night; about 50 or 60 of our CUBE community members were here. We had a breakfast this morning where the Wikibon research team laid out its big data forecast, the eighth big data forecast and report that we've put out, so check that out online. Jacques Nadeau is here. He is the CTO and co-founder of Dremio. Jacques, welcome to theCUBE, thanks for coming on. >> Thanks for having me here. >> So we were talking a little bit about what you guys do. Three-year-old company. Well, let me start: why did you co-found Dremio? >> So, it was a very simple thing I saw. Over the last ten years or so, we saw a regression in the ability for people to get at data. You see all these really cool technologies that came out to store data: data lakes, you know, SQL systems, all these different things that make developers very agile with data. But what we were also seeing was a regression in the ability for analysts and data consumers to get at that data, because the systems weren't designed for analysts; they were designed for data producers and developers. And we said, you know what, there needs to be a way to solve this. We need to empower people to be self-sufficient again at the data consumption layer. >> Okay, so you solve that problem how? You called it a self-service data platform. >> Yeah, a self-service data platform, and the idea is pretty simple. It's that, no matter where the data is physically, people should be able to interact with a logical view of it. And so we talk about it a little bit like it's Google Docs for your data. People can go into the system, see the different data sets that are available to them, collaborate around those, and create changes to those that they can then share with other people in the organization, always dealing with the logical layer. And then, behind the scenes, we have physical capabilities to interact with all the different systems we work with. But that's something that business users shouldn't have to think as much about. If you think about how people interact with data today, it's very much about copies. Every time you want to do something, typically you're going to make a copy. I want to reshape the data, I make a copy. I want to make it go faster, I make a copy. And those copies are very, very difficult for people to manage, and they end up mixing the business meaning of the data with the physical concerns, the copies made to make things faster or whatever. And so our perspective is that, if you can separate the physical concerns from the logical, then business users have a much better likelihood of being able to do something self-service. >> So you're essentially virtualizing my corpus of data, independent of location, is that right, I mean-- >> It's part of what we do, yeah. It's part of what we do. So, the way we look at it, there are kind of several different components to making something self-service. It starts with, yeah, virtualizing or abstracting away the details of the physical, right?
But then, on top of that, you expose a very user-friendly interface that allows people to catalog and understand the different things, you know, search for the things they want to interact with, and then curate things, even if they're non-technical users, right? So the goal is... if you talk to even the large internet companies in the Valley, it's very hard to hire the amount of data engineering you need to satisfy all the requests of your end users of data. And so the goal of Dremio is basically to provide tools that offer a non-technical experience for getting at the data. So that's the start of it, but then the second step is, once you've got access to this thing and people can collaborate and deal with the data, you've got these huge volumes of data, right? It's big data, and so how do you make that go faster? And then we have components that deal with speed and acceleration. >> So maybe talk about how people are leveraging this capability, this platform. What's the business impact? What have you seen there? >> So a lot of people have this problem: they have data all over the place, and they're trying to figure out, "How do I expose this to my end users?" And those end users might be analysts, they might be data scientists, they might be product managers who are trying to figure out how their product is working. And so, what they're doing today is typically trying to build systems internally to provide these capabilities. So, for example, we're working with a large auto manufacturer. They've got a big initiative where they're trying to make the huge amounts of data they have, across all sorts of different parts of the organization, available to different data consumers. Now, of course, there are a bunch of security concerns you need to address around that, but they just want to make the data more accessible. And so, what they're doing is using Dremio to catalog all the data below, expose that to the different users, apply lots of different security rules around it, and then create a bunch of reflections, which make things go faster as people interact with them. >> Well, what about the governance factor? I mean, you heard this in the Hadoop world years ago: "Ah, we're going to harden Hadoop," and really, there was no governance, and it became more and more important. How do you guys handle that? Do you partner with people? Is it up to the customer to figure that out? Do you provide that? >> It's several different things, right? It's a complex ecosystem, so it's a combination of things. You start with partnering with different systems to make sure that you integrate well with them: the different things that control credentials inside those systems, all the way down to, what are the file system permissions? What are the permissions inside of something like Hive and the metastore there? And then other systems on top of that, like Sentry or Ranger, are also exposing different credentialing, right? And so we work hard to integrate with those things. On top of that, Dremio also provides a full security model inside of the virtual space where we work.
And so people can control the permissions, the ability to access or edit any object inside of Dremio, based on user roles and LDAP and those kinds of things. So it's kind of multiple layers that have to be working together. >> And tell me more about the company. So, founded three years ago, I think a couple of raises... >> Yep. >> Who's backing you? >> Yeah, so we founded just under three years ago. We had great initial investors in Redpoint and Lightspeed, two great initial investors, and we raised about 15 million on that round. And then we actually just closed a B round in January of this year, and we added Norwest to the portfolio there. >> Awesome, so you're now in the mode of... I mean, they always say, you know, software is such a capital-efficient business, but you see software companies raising, you know, 900 million dollars, and so, presumably, that's to compete, to go to market, and to differentiate with your messaging and branding. Is that sort of the phase that you're in now? You've developed a product, it's technically sound, it's proven in the marketplace, and now you're scaling the go-to-market, is that right? >> That's exactly right. So we've had a lot of early successes, a lot of Fortune 100 companies using Dremio today. For example, we're working with TransUnion. We're working with Intel. We actually have a great relationship with OVH, which is the third-largest hosting company in the world. Daimler is another one. So we're working with a lot of great companies, seeing great early success with the product at those companies, and really looking to say, "Hey, we're out here." We've got a booth for the first time at Strata here, and we're letting people know about, sort of, a better way, or an easier way, for people to deal with data. >> Yeah. >> A happier way. >> I mean, it's a crowded space, right? There are a lot of tools out there, a lot of companies. I'm interested in how you differentiate. Obviously simplification is a part of that, the breadth of your capabilities. But maybe, in your words, you could share with me how you differentiate from the competition and how you break out from the noise. >> Yeah, you're absolutely right, it's a very crowded space. Everybody's using the same words, and that makes it very hard for people to understand what's going on. And so, what we've found is very simple: typically, in the first meeting we have with a customer, within the first 10 minutes we'll demo the product. Because so many technologies are technologies, not products, and so you have to figure out how to use them, how you would customize them for your particular use case. And what we've found with our product is, by making it very, very simple, the light goes on in a very short amount of time. We also do things on our website so that you can see, in a couple of minutes or even less, little animations that give you a sense of what it's about. But really, it's just, "Hey, this is a product," and there's this light bulb that goes on, which is great. And you figure this out over the course of working with different customers, right?
But there's this light bulb that goes on for people who are so confused by all the things that are going on, and if we can just sit down with them and show them the product for a few minutes, all of a sudden they're like, "Wait a minute, I can use this," right? So you're frequently talking to buyers who are not the most technical parts of the organization initially, and most of the technologies they look at are very difficult to understand; they have to look to others to even understand how those would fit into their architecture. With Dremio, we have customers that have installed it and, within an hour or two, started to see real value. And that sort of excitement happens even in the demo, with most people. >> So you kind of have this bifurcated market. Since the big data meme, everybody says they're data-driven, and you've got a bifurcated market in that you've got the companies that are data-driven and you've got companies who say they're data-driven but really aren't. Who are your customers? Are they in both? Are they predominantly on the data-driven side? Are they predominantly in the trying-to-be-data-driven camp? >> Well, I would say that they all would say that they're data-driven. >> Yeah, everyone. Who's going to say, "Well, we're not data-driven"? >> Yeah, yeah. So I would say >> We're dead. >> I would say that everybody has data, and they've got some places where they're using it well and other places where they feel like they're not using it as well as they should. And so, I mean, the reason that we exist is to make it easier for people to get value out of data; if they were getting all the value they think they could get out of data, then we probably wouldn't exist, and they would be fully data-driven. So I think it's a journey for everybody, and people are responding well to us, in part, because we're helping them down that journey. >> Well, the reason I asked that question is that we go to a lot of shows, and everybody likes to throw out the digital transformation buzzword and then use Uber and Airbnb as examples, but if you dig deeper, you see that data is at the core of those companies, and they're now beginning to apply machine intelligence and leverage all this data that they've built up, this data architecture they built up over the last five or 10 years. And then you've got this set of companies where all the data lives in silos, and I can see you guys being able to help them. At the same time, I can see you helping the disruptors. So how do you see that, in terms of your role in affecting either digital transformations or digital disruptions? >> Well, I'd say that in either case, we believe in a very simple thing, which is, going back to what I said at the beginning, that I see this regression in terms of data access, right? And what happens is that, if you have a tightly-coupled system between two layers, it becomes very difficult to accommodate two different sets of needs. And so, the change over the last 10 years was the rise of the developer as the primary person controlling data, and that brought a huge number of great things with it, but analysis was not one of them. And there are tools that try to make that better, but that's really the problem. And so our belief is very simple: a new tier needs to be introduced between the consumers and the producers of data.
And so that tier may interact with different systems, and it may be more or less complex for certain organizations, but the tier is necessary in all organizations, because the analysts shouldn't be shaken around every time the developers change how they're doing data. >> Great. John Furrier has a saying that "data is the new development kit," you know. He said that, I don't know, eight years ago, and it's really kind of turned out to be the case. Jacques Nadeau, thanks very much for coming on theCUBE. Really appreciate your time. >> Yeah. >> Great to meet you. Good luck, and keep us informed, please. >> Yes, thanks so much for your time, I've enjoyed it. >> You're welcome. Alright, thanks for watching everybody. This is theCUBE. We're live from Big Data SV. We'll be right back. (bright music)
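As a loose sketch of two ideas from this conversation, logical views instead of physical copies, plus role-based control on the logical objects, consider the following Python. All class and field names are invented for illustration; this is not Dremio's actual API or implementation.

    from dataclasses import dataclass, field
    from typing import Callable

    Row = dict  # one record from some physical system

    @dataclass
    class VirtualDataset:
        """A logical view: a transformation over a source, never a managed copy."""
        name: str
        source: Callable[[], list]  # pulls rows from the physical system on demand
        transform: Callable[[Row], Row] = lambda r: r
        allowed_roles: set = field(default_factory=set)

        def query(self, role: str) -> list:
            if role not in self.allowed_roles:
                raise PermissionError(f"role {role!r} cannot read {self.name!r}")
            # Evaluated against the source at query time; no physical copy to manage.
            return [self.transform(row) for row in self.source()]

    # Physical source stand-in (could be a database table, a file, a data lake...)
    def sales_table() -> list:
        return [{"region": "west", "amount": 132.0, "card_number": "4111..."}]

    # Analysts get a curated view with the sensitive column stripped out.
    sales_curated = VirtualDataset(
        name="sales_curated",
        source=sales_table,
        transform=lambda r: {k: v for k, v in r.items() if k != "card_number"},
        allowed_roles={"analyst", "admin"},
    )

    print(sales_curated.query("analyst"))  # [{'region': 'west', 'amount': 132.0}]

The point of the design is that curation and permissions live on the logical object, so changing the physical systems underneath does not shake the analysts, which is the "new tier" argument made above.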

Published Date : Mar 9 2018


Steve Wilkes, Striim | Big Data SV 2018


 

>> Narrator: Live from San Jose it's theCUBE. Presenting Big Data Silicon Valley. Brought to you by SiliconANGLE Media and its ecosystem partners. (upbeat music) >> Welcome back to San Jose everybody, this is theCUBE, the leader in live tech coverage and you're watching BigData SV, my name is Dave Vellante. In the early days of Hadoop everything was batch oriented. About four or five years ago the market really started to focus on real time and streaming analytics to try to really help companies affect outcomes while things were still in motion. Steve Wilkes is here, he's the co-founder and CTO of a company called Striim, a firm that's been in this business for around six years. Steve, welcome to theCUBE, good to see you. Thanks for coming on. >> Thanks Dave, it's a pleasure to be here. >> So tell us more about that, you started about six years ago, a little bit before the market really started talking about real time and streaming. So what led you to that conclusion, that you should co-found Striim way ahead of its time? >> It's partly our heritage. So the four of us that founded Striim, we were executives at GoldenGate Software. In fact our CEO Ali Kutay was the CEO of GoldenGate Software. So when we were acquired by Oracle in 2009, after having to work for Oracle for a couple years, we were trying to work out what to do next. And GoldenGate was replication software, right? So it's moving data from one place to another. But customers would ask us in customer advisory boards, that data seems valuable, it's moving. Can you look at it while it's moving and analyze it while it's moving, get value out of that moving data? And so that was kind of set in our heads. And then we were thinking about what to do next, that was kind of the genesis of the idea. So the concept around Striim when we first started the company was we can't just give people streaming data, we need to give them the ability to process that data, analyze it, visualize it, play with it and really truly understand the data. As well as being able to collect it and move it somewhere else. And so the goal from day one was always to build a full end-to-end platform that did everything customers needed to do for streaming integration and analytics out of the box. And that's what we've done after six years. >> I've got to ask a really basic question, so you're talking about your experience at GoldenGate, moving data from point A to point B, and somebody said well why don't we put that to work. But was it change data or was it static data? Why couldn't I just analyze it in place? >> GoldenGate works on change data. >> Okay so that's why, there were changes going through. Why wait until it hits its target, let's do some work in real time and learn from that, get greater productivity. And now you guys have taken that to a new level. That new level being what? Modern tools, modern technologies? >> A platform built from the ground up to be inherently distributed, scalable, reliable, with exactly-once processing guarantees. And to be a complete end-to-end platform. There's a recognition that the first part of being able to do streaming data integration or analytics is that you need to be able to collect the data, right? And while change data capture from databases is the way to get data out of databases in a streaming fashion, you also have to deal with files and devices and message queues and anywhere else the data can reside. So you need a large number of different data collectors that all turn the enterprise data sources into streaming data.
And similarly if you want to store data somewhere you need a large collection of target adapters that deliver to things. Not just on-premises but also in the cloud. So things like Amazon S3 or the cloud databases like Redshift and Google BigQuery. So the idea was really that we wanted to give customers everything they need, and that everything they need isn't trivial. It's not just, well we take Apache Kafka and then we stuff things into it and then we take things out. Pretty often, for example, you need to be able to enrich data, and that means you need to be able to join streaming data with additional context information, reference data. And that reference data may come from a database or from files or somewhere else. So you can't call out to the database and maintain the speeds of streaming data. We have customers that are doing hundreds of thousands of events per second. So you can't call out to a database for every event and ask for records to enrich it with. And you can't even do that with an external cache because it's just not fast enough. So we built in an in-memory data grid as part of our platform. So you can join streaming data with the context information in real time without slowing anything down. So when you're thinking about doing streaming integration, it's more than just moving data around. It's the ability to process it and get it in the right form, to be able to analyze it, to be able to do things like complex event processing on that data. And also to be able to visualize it and play with it is an essential part of the whole platform. >> So I wanted to ask you about end-to-end. I've seen a lot of products from larger, maybe legacy companies that will say it's end-to-end, but what it really is, is cobbled-together pieces that they bought in and then, this is our end-to-end platform, but it's not unified. Or I've seen others, "Well we've got an end-to-end platform," oh really, can I see the visualization? "Well we don't have visualization, "we use this third party for visualization." So convince me that you're end-to-end. >> So our platform, when you start with it you go into a UI, you can start building data flows. Those data flows start from connectors, we have all the connectors that you need to get your enterprise data. We have wizards to help you build those. And so now you have a data stream. Now you want to start processing that, we have SQL-based processing so you can do everything from filtering, transformation, aggregation, enrichment of data. If you want to load reference data into memory you use a cache component to drag that in, configure that. You now have data in memory you can join with your streams. If you want to now take the results of all that processing and write it somewhere, use one of our target connectors, drag that in so you've got a data flow that's getting bigger and bigger, doing more and more processing. So now you're writing some of that data out to Kafka, oh I'm going to also add in another target adapter, write some of it into Azure Blob Storage and some of it's going to Amazon Redshift. So now you have a much bigger data flow. But now you say okay, well I also want to do some analytics on that. So you take the data stream, you build another data flow that is doing some aggregation over windows, maybe some complex event processing, and then you use that dashboard builder to build a dashboard to visualize all of that. And that's all in one product. So it literally is everything you need to get value immediately.
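To make the enrichment pattern Wilkes describes concrete, here is a minimal Python sketch of joining an event stream against reference data held in memory, so there is no per-event database round trip. The event fields, the device table, and the toy list standing in for a stream are all hypothetical; this illustrates the general pattern, not Striim's actual engine or adapters.

```python
import json

# Reference data loaded once into memory; this dict stands in for the
# in-memory data grid described above, which a real deployment would
# keep in sync with the source database.
device_info = {
    "dev-001": {"site": "plant-a", "model": "TX-9"},
    "dev-002": {"site": "plant-b", "model": "TX-7"},
}

def enrich(event: dict) -> dict:
    """Attach reference context to a raw event without any I/O."""
    context = device_info.get(event["device_id"], {})
    return {**event, **context}

def process(stream):
    for raw in stream:
        enriched = enrich(json.loads(raw))
        # Downstream this would go to Kafka, a warehouse, a dashboard...
        print(enriched)

# Toy events standing in for a high-volume stream (hypothetical fields).
process([
    '{"device_id": "dev-001", "temp": 71.2}',
    '{"device_id": "dev-002", "temp": 68.9}',
])
```

Because the lookup is an in-process dictionary access rather than a database or external-cache call, this shape can keep pace with hundreds of thousands of events per second, which is the point of doing the join in memory.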
And you're right, the big vendors, they have multiple different products and they're very happy to sell you consulting to put them all together. Even if you're trying to build this from open source, and you know, organizations try and do that, you need five or six major pieces of open source, a lot of supporting libraries, and a huge team of developers to just build a platform that you can start to build applications on. And most organizations aren't software platform companies, they're finance companies, oil and gas companies, healthcare companies. And they really want to focus on solving business problems and not on reinventing the wheel by building a software platform. So we can just go in there and say look: value immediately. And that really, really helps. >> So what are some of your favorite use cases, examples, maybe customer examples that you can share with me? >> So one of the great examples, one of my customers, they have a lot of data in an HP NonStop system. And they needed to be able to get visibility into that immediately. And this was like order processing, supply chain, ERP data. And it would've taken a very large amount of time to do analytics directly on the HP NonStop. And finding resources to do that is hard as well. So they needed to get the data out and they needed to get it into the appropriate place. And they recognized that you use the right technology to ask the right question. So they wanted some of it in Hadoop so they could do some machine learning on that. They wanted some of it to go into Kafka so they could get real-time analytics. And they wanted some of it to go into HBase so they could query it immediately and use that for reference purposes. So they utilized us to do change data capture against the HP NonStop, deliver that data stream out immediately into Kafka and also push some of it into HDFS and some of it into HBase. So they immediately got value out of that, because then they could also build some real-time analytics on it. It would send out alerts if things were taking too long in their order processing system. And it allowed them to get visibility directly into their process that they couldn't get before, with far fewer resources and more modern technologies than they could have used before. So that's one example. >> Can I ask you a question about that? So you talked about Kafka, HBase, you talk about a lot of different open source projects. You've integrated those, or you've got entries and exits into those? >> So we ship with Kafka as part of our product. It's an optional messaging bus. So, our platform has two different ways of moving data around. We have a high-speed, in-memory-only message bus, and that works at almost network speed and it's great for a lot of different use cases. And that is what backs our data streams. So when you build a data flow, you have streams in between each step, and that is backed by an in-memory bus. Pretty often though, in use cases, you need to be able to potentially rewind data for recovery purposes or have different applications running at different speeds, and that's where a persistent message bus like Kafka comes in. But you don't want to use a persistent message bus for everything because it's doing IO and it's slowing things down. So you typically use that at the beginning, at the sources, especially things like IoT where you can't rewind into them. Things like databases and files, you can rewind into them and replay and recover, but IoT sources, you can't do that.
So you would push that into a Kafka-backed stream and then subsequent processing is in-memory. So we have that as part of our product. We also have Elastic as part of our product for results storage. You can switch to other results storage but that's our default. And we have a few other key components that are part of our product, but then on the periphery, we have adapters that integrate with a lot of the other things that you mentioned. So we have adapters to read and write HDFS, Hive, HBase, across Cloudera, Hortonworks, even MapR. So we have the MapR versions of the file system and MapR Streams and MapR-DB, and then there's lots of other more proprietary connectors, like CDC from Oracle, and SQL Server, and MySQL and MariaDB. And then database connectors for delivery to virtually any JDBC-compliant database. >> I took you down a tangent before you had a chance. You were going to give us another example. We're pretty much out of time but if you can briefly share either that or the last word, I'll give it to you. >> I think the last word would be that that is one example. We have lots and lots of other types of use cases that we do, including things like migrating data from on-premises to the cloud, being able to distribute log data, and being able to analyze that log data, being able to do in-memory analytics and get real-time insights immediately and send alerts. It's a very comprehensive platform, but each one of those use cases is very easy to develop on its own and you can do them very quickly. And of course as the use case expands within a customer, they build more and more, and so they end up using the same platform for lots of different use cases within the same account. >> And how large is the company? How many people? >> We are around 70 people right now. >> 70 people and you're looking for funding? What round are you in? Where are you at with funding and revenue and all that stuff? >> Well I'd have to defer to my CEO for those questions. >> All right, so you've been around for what, six years you said? >> Yeah, we have a number of rounds of funding. We had initial seed funding, then we had the investment by Summit Partners that carried us through for a while. Then subsequent investment from Intel Capital, Dell EMC, Atlantic Bridge. And that's where we are right now. >> Good, excellent. Steve, thanks so much for coming on theCUBE, really appreciate your time. >> Great, it's awesome. Thank you Dave. >> Great to meet you. All right, keep it right there everybody, we'll be back with our next guest. This is theCUBE. We're live from BigData SV in San Jose. We'll be right back. (techno music)
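The HP NonStop example above, one change stream feeding Kafka, HDFS, and HBase at once, can be sketched in a few lines of Python. The sink functions below are stubs and the change records are invented; a real deployment would use the platform's target adapters rather than hand-written writers.

```python
from typing import Callable, Dict, List

# Stub sinks; in practice these would be Kafka producers, HDFS writers,
# and HBase clients supplied by the platform's target adapters.
def kafka_sink(change: Dict) -> None:
    print("kafka <-", change)   # real-time analytics consumers

def hdfs_sink(change: Dict) -> None:
    print("hdfs  <-", change)   # batch storage for machine learning

def hbase_sink(change: Dict) -> None:
    print("hbase <-", change)   # immediate keyed lookups

SINKS: List[Callable[[Dict], None]] = [kafka_sink, hdfs_sink, hbase_sink]

def fan_out(cdc_stream) -> None:
    """Deliver every change record to every configured target."""
    for change in cdc_stream:
        for sink in SINKS:
            sink(change)

# Toy change-data-capture records standing in for the order feed.
fan_out([
    {"op": "INSERT", "table": "orders", "id": 42, "status": "NEW"},
    {"op": "UPDATE", "table": "orders", "id": 42, "status": "SHIPPED"},
])
```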

Published Date : Mar 9 2018


Praveen Kankariya, Impetus | Big Data SV 2018


 

>> Narrator: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley. Brought to you by SiliconANGLE Media, and its ecosystem partners. (electronica flourish) >> We're back at Big Data SV. This is theCUBE, the leader in live tech coverage. My name is Dave Vellante. Praveen Kankariya is here. He's the CEO of a company called Impetus. The company's been around the Big Data space since before Hadoop, even. Praveen, thanks for being back in theCUBE, good to see you. >> Thank you, Dave. >> So, as I said in the open, you've seen a lot. You kind of really got into the Big Data space in 2007, seen it blow through the Hadoop, you know, sort of batch world into the real time world, seen the data management headwinds. From your perspective, you know, what kind of problems are you solving today in the Big Data world? >> So I can go into the details of what we are doing, but at a high level, we are helping companies converge to a singular, enterprise-wide data model. 'Cause I think that is a crisis in the Fortune 500 today, and there'll be haves and have-nots. >> Dave: What do you mean a crisis? >> I routinely run into companies who do not have their data model stitched. So they know the same customer, they know me by five different handles, and they don't have it figured out, that I'm the same guy. So, that I think is a major problem. So I think the C-suite is, they would not like to hear this, but they are flying partially blind. >> I have a theory on this, but I want to hear yours-- >> Sure. >> Why is that such a big problem? >> So, the most efficient business in the world is a one-man business, because everything is flowing in the same brain. The moment you hire your first employee, you start having communication breakdowns. And now these companies have hundreds and thousands of employees. Hundreds of thousands of employees. There's a lot of breakdown. There are airlines that, when I'm upgraded to first class, are offering me an economy-plus seat when I go to check in. That's ... they're turning me off, and they're losing an opportunity, a real opportunity to upsell something else to me. So. >> Okay, well, so let's bring this into the world of digital transformation. Everybody talks about those buzzwords, so let's try to put some sort of meat on that bone. If you look at the top five companies by market cap, Amazon, Apple, Facebook, Google. I'm missing somebody. Anyway, they're big. 500 billion, 700 billion dollars. They're all sort of what we would call data-driven. What does that mean? Data is at the core of their enterprise. A lot of the companies you're talking about, human expertise is the core of their enterprise, and they've got data that's sort of in silos, surrounding it. >> Praveen: Yes, yes. >> Is that an accurate description? >> That's-- And how can you help close that gap? >> So they have data in silos, and even that data in silos is not being used at velocity, with velocity. That data is, you know, it's taking much longer for them to even clean up that data, get access to that data, derive insights from that data. >> Dave: Right. >> So there's a lot of sluggishness, overall. >> Dave: So how do you help? >> How do we help? Great question. We help in many different ways. So we actually, so my company provides solutions. So we have some, a few products of our own, and then we work with all kinds of product companies. But we're about solving a problem, so when the customers we engage with, we actually solve a problem, so that there's a business outcome before we walk out.
That's the big difference. We're not here to just sell the next sexy platform, or this or that, you know. We're not just here to excite the developers. >> So, maybe you could give me some of your favorite examples of where you've helped some of your clients. >> So there's one fairly large company, it's a household name around the world. And we have helped them create a single source of truth using a Big Data infrastructure. This has about six and a half thousand feeds of data coming in, continuously. Some continuously, some every few minutes, every few hours, whatnot. But then all their data is stitched together, and it's got guardrails, there's full governance. So, and now this platform is available to every business unit, to run their own applications. There's a set of APIs where they go in and develop their own applications. So shadow IT is being promoted in this environment. It's not being looked down upon. >> So it's not sitting in one box, presumably, it's distributed throughout the organization? >> It is distributed. And you know, there are some, you know, as long as you stay within the governance structure, you can derive, you know, somebody wants a graph database, they can derive a graph database from this massive, fully-connected data set, which is an enterprise-wide data set. >> Don't you see as some of the challenges, as well as cultural, there are some industries that might say, or some executives that say, "Well, you know my industry, "healthcare is an example, really hasn't been disrupted. "We're maybe insulated from that." I feel as though that's somewhat risky thinking, and it's easy to maybe sit back and say, "Well, I'm going to wait, see what happens." What are your thoughts on that? >> Look at the data. The week Jeff Bezos announced that he is tying up with JPMC and Warren Buffett, some of the largest healthcare companies, and I'm talking of Fortune 10 companies, they lost about 20% of their market cap that week. So, you don't have to listen to me. Listen to the markets. >> Well, that's true. We see what happens in grocery, see what happens in... We haven't really seen, as I say, the disruption in healthcare, financial services, but it's all data, and that changes the equation. So why, let's see, not why. How, when, if you get to this, so it sounds like step one is to get that sort of single data model across the organization, but there's other steps. You got to figure out how to monetize the data, not necessarily by selling it, but how data contributes to the monetization of the company. You got to make it accessible, you got to make it of high quality, you've got to get the right skill sets. So there's a lot to it, and more than just the technology. Maybe you could talk about that. >> So the way, I would like to preach, if I'm allowed to-- >> Dave: Please, it's theCUBE... (laughs) >> No, no, I mean, I don't mean here, but if any CEO was listening to me, what I would like to tell them is, just create a vision of your ultimate connected data model. And then start looking at how you converge toward that vision. It may not happen in one day, one week, one year. It's going to take time, and you know, every business is in flight, so they have to operate continuously, but they have to keep gravitating. And the biggest casualty is going to be their customer relationship if they don't do this. Because most companies don't know their customers fully.
I mean, that little example of the airline which was showing me, flashing an ad for economy seats, premium economy seats when I'm already in first class, they don't know me. Some part of that company doesn't know me. So they're not able to service me well. Here now they lost an opportunity to monetize, but I think from another perspective, they lost an opportunity to really offer me something which would've made my flight way more comfortable. >> Well. >> So. >> Then you wonder, if that's the dynamic that you encountered, what's the speed to market, the agility of that organization? They're hampered in their ability to, whether it's roll out new apps, identify new data sources, create new products for the customers. Have you seen, what kind of impacts have you seen within your customers? You gave the example before, of that sort of single data model, the single version of the truth. What business impacts have you been able to affect for your customers? >> So, there, I mean I can go on giving you anecdotes from my observations, my front row observations into these companies. >> Yeah, it'd be good to have some kind of proof points, right? Our audience would love to hear that. >> So, you know there's a company not too far from here. They've stitched every click stream, right to product usage data. To support data, to every marketing email opened. And they can tell who's buying, what happened, what is their support experience, who's upgrading, who's upgrading faster because they had a positive support experience, or not. So everything is tied. Any direction you want to look into your customer space, you can go and get visibility from every perspective you can think of. That's customer 360. We worked with a credit card company where they had a massive rules engine, which had been developed over generations to report fraud, to catch fraud, while a transaction's being processed. We actually, once they got all their data together, we could apply a massive machine learning engine. And we started learning from customers' own behavior, so we completely discarded the rules engine, and now we have a learning system which is flagging fraudulent transactions. So they managed to cut down their false positives tremendously, and in turn reduced inconvenience. It used to be embarrassing for me to give out a card and get it declined in front of a customer. >> So, as I said at the top, you've seen sort of the evolution of this whole Big Data meme before it was called Big Data. What are the things that may be exciting you? We seem to be entering a new era we call digital. There's a cognitive era, AI, machine intelligence. What do you see that's exciting, and real? >> So number one, so I like to divide this space into two parts, the whole space of data analytics. There's the data plumbing, which we call data management, and whatnot. I have to plumb all my data together. Only then can I feed this data into my AI models. Now I can do it in my silos today, but for me to do it at a global level for my entire corporation, I need it all stitched together. And then, of course, these models are very real. My son, my 22-year-old son is using TensorFlow for some little startup that he's cooking. And it took him just a month to pick it up and start applying it. So why can't our large companies do so? And in turn, bring down the cost of services, cost of products, the velocity of delivering those things to us, and make life better. >> So, the barriers to technology deployment are getting lower.
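The credit card example above, replacing a hand-built rules engine with a model learned from customers' own behavior, can be sketched with off-the-shelf tooling. Everything below is illustrative: the features, threshold, and tiny training set are invented, and a real fraud system would train on the full, stitched transaction history rather than five rows.

```python
from sklearn.ensemble import RandomForestClassifier

# Invented features per transaction:
# [amount_usd, distance_from_home_km, transactions_in_last_hour]
X_train = [
    [25.0,     2, 1],
    [40.0,     5, 2],
    [900.0,  400, 6],   # historically confirmed fraud
    [15.0,     1, 1],
    [1200.0, 950, 8],   # historically confirmed fraud
]
y_train = [0, 0, 1, 0, 1]   # 1 = fraud

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Score a transaction while it is being processed. Flagging only
# high-probability cases is what cuts false positives relative to
# blanket, hand-written rules.
new_tx = [[850.0, 700, 5]]
fraud_prob = model.predict_proba(new_tx)[0][1]
if fraud_prob > 0.8:
    print(f"flag for review (p={fraud_prob:.2f})")
else:
    print(f"approve (p={fraud_prob:.2f})")
```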
>> And this is all feasible, Dave, right now. >> Yeah. >> You know, I mean, this is all, this is a dream 10 years ago. If somebody had said, you know, for an old corporation to stitch all its data, "What're you talking about? "It's not going to happen." But now, this is possible, and it's feasible. It's not going to require, make a massive hole in their budgets. >> But don't you think it's also table stakes to compete in over, the next 10 years? >> It is, there is table stakes. It's actually kind of late, from my perspective. If I had to go invest in the market, I mean, I would invest in companies who have their data act together. >> Yeah, yeah. So, what's the, how do you tell, when a company has its data act together? When you walk into a prospect, how do you know, what do you see, what're the characteristics of somebody who has that act together? >> It's hard for me to give you a few characteristics, but you know, you can tell what is the mandate they're operating under, if there are clear mandates. Because, for most companies, this is lost because of turf battle. This whole battle is lost due to turf issues. And the moment you see senior executives working together, with a massive willingness to bring everything together. You know, they'll have different turfs, and they're willing to contribute data, and bring it together. That's a phenomenally positive sign, because once that happens, then every large company has the wherewithal to go hire 50 data scientists, or work with all kinds of companies, including mine, to get data science help. >> Yeah, it comes back to the culture, doesn't it? >> Yes, absolutely. >> All right, Praveen, we have to leave it right there. Thanks very much for coming back in theCUBE. >> Thank you Dave, thank you. Thank you for the opportunity. >> You're very welcome. All right, keep it right there, everybody. This is theCUBE. We're live from the Forager in San Jose, Big Data SV. We'll be right back. (electronica flourish)

Published Date : Mar 9 2018


David Abercrombie, Sharethrough & Michael Nixon, Snowflake | Big Data SV 2018


 

>> Narrator: Live from San Jose, it's theCUBE. Presenting Big Data, Silicon Valley. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Hi, I'm George Gilbert, and we are broadcasting from the Strata Data Conference, we're right around the corner at the Forager Tasting Room & Eatery. We have this wonderful location here, and we are very lucky to have with us Michael Nixon, from Snowflake, which is a leading cloud data warehouse. And David Abercrombie from Sharethrough, which is a leading ad tech company. And between the two of them, they're going to tell us some of the most advanced use cases we have now for cloud-native data warehousing. Michael, why don't you start with giving us some context for how on a cloud platform one might rethink a data warehouse? >> Yeah, thank you. That's a great question, because let me first answer it from the end-user, business value perspective. When you run a workload on a cloud, there's a certain level of expectation you want out of the cloud. You want scalability, you want unlimited scalability, you want to be able to support all your users, you want to be able to support the data types, whatever they may be, that come into your organization. So, there's a level of expectation that one should expect from a service point of view once you're in a cloud. So, a lot of the technologies that were built up to this point have been optimized for on-premises types of data warehousing, where perhaps that level of service and concurrency and unlimited scalability was not really expected but, guess what? Once it comes to the cloud, it's expected. So those on-premises technologies aren't suitable in the cloud, so for enterprises and, I mean, companies, organizations of all types from finance, banking, manufacturing, ad tech as we'll have today, they want that level of service in the cloud. And so, those technologies will not work, and so it requires a rethinking of how those architectures are built. And it requires being built for the cloud. >> And just to, alright, to break this down and be really concrete, some of the rethinking. We separate compute from storage, which is a familiar pattern that we've learned in the cloud, but we also then have to have this sort of independent elasticity between-- >> Yes. Storage and the compute, and then Snowflake's taken it even a step further, where you can spin out multiple compute clusters. >> Right. >> Tell us how that works and why that's so difficult and unique. >> Yeah, you know, that's taking us under the covers a little bit, but what makes our infrastructure unique is that we have a three-layer architecture. We separate, just as you said, storage from the compute layer, from the services layer. And that's really important because, as I mentioned before, you want unlimited capacity, unlimited resources. So, if you scale compute, in today's world of on-premises MPP, what that really means is that you have to bring the storage along with the compute, because compute is tied to the storage, so when you scale the storage along with the compute, usually that involves a lot of burden on the data warehouse manager, because now they have to redistribute the data, and that means redistributing keys, managing keys if you will. And that's a burden. And in reverse, if all you wanted to do was increase storage but not the compute, because compute was tied to storage, why do you have to buy these additional compute nodes, and that might add to the cost when, in fact, all you really wanted to pay for was additional storage?
So, by separating those, you keep them independent, and so you can scale storage apart from compute, and then, once you have your compute resources in place, the virtual warehouses that you're talking about that have completed the job, you spun them up, it's done its job, and you take it down, guess what? You can release those resources, and of course, in releasing those resources, basically you can cut your cost as well because, for us, it's pure usage-based pricing. You only pay for what you use, and that's really fantastic. >> Very different from the on-prem model which, as you were saying, tied compute and storage together, so. >> Yeah, let's think about what that means architecturally, right? So if you have an on-premises data warehouse, and you want to scale your capacity, chances are you'll have to have that hardware in place already. And having that hardware in place already means you're paying that expense, and so you may pay for that expense six months prior to needing it. Let's take a retailer example. >> Yeah. >> You're gearing up for a peak season, which might be Christmas, and so you put that hardware in place sometime in June. You always put it in in advance, because why? You have to bring up the environment, so you have to allow time for implementation or, if you will, deployment to make sure everything is operational. >> Okay. >> And then what happens is, when that peak period comes, you can't expand beyond that capacity. But what happens once that peak period is over? You paid for that hardware, but you don't really need it. So, our vision is, or the vision we believe you should have when you move workloads to the cloud is, you pay for those resources when you need them. >> Okay, so now, David, help us understand, first, what was the business problem you were trying to solve? And why was Snowflake, you know, sort of uniquely suited for that? >> Well, let me talk a little bit about Sharethrough. We're ad tech; at the core of our business we run an ad exchange, where we're doing programmatic trading with the bids, with the real-time bidding spec. The data is very high in volume, with 12 billion impressions a month, that's a lot of bids that we have to process, a lot of bid requests. The way it operates, the bids and the bid responses in programmatic trading are encoded in JSONs, so our ad exchange is basically exchanging messages in JSON with our business partners. And the JSONs are very complicated, there's a lot of richness and detail, such that the advertisers can decide whether or not they want to bid. Well, this data is very complicated, very high-volume. And advertising, like any business, we really need to have good analytics to understand how our business is operating, how our publishers are doing, how our advertisers are doing. And it all depends upon this very high-volume, very complex JSON event data stream. So, Snowflake was able to ingest our high-volume data very gracefully. The JSON parsing techniques of Snowflake allow me to expose the complicated data structure in a way that's very transparent and usable to our analysts. Our use of Snowflake has replaced clunkier tools where the analysts basically had to be programmers, writing programs in Scala or something to do an analysis. And now, because we've transparently and easily exposed the complicated structures within Snowflake in a relational database, they can use good old-fashioned SQL to run their queries; literally, an afternoon analysis is now a five-minute query. >> So, let me, as I'm listening to you describe this.
We've had various vendors telling us about these workflows in the sort of data prep and data science tool chain. It almost sounds to me like Snowflake is taking semi-structured or complex data and it's sort of unraveling it and, normalizing is kind of an overloaded term, but it's making it business-ready, so you don't need as much of that manual data prep. >> Yeah, exactly, you don't need as much manual data prep, or you don't need as much expertise. For instance, Snowflake's JSON capabilities, in terms of drilling down the JSON tree with dot path notation, or expanding nested objects, is very expressive, very powerful, but still your typical analyst or your BI tool certainly wouldn't know how to do that. So, in Snowflake, we sort of have our cake and eat it too. We can have our JSONs with their full richness in our database, but yet we can simplify and expose the data elements that are needed for analysis, so that an analyst, their first day on the job, they can get right to work and start writing queries. >> So let me ask you about, a little more about the programmatic ad use case. So if you have billions of impressions per month, I'm guessing that means you have quite a few times more, in terms of bids, and then there's the, you know, once you have, I guess, a successful one, you want to track what happens. >> Correct. >> So tell us a little more about that, what that workload looks like, in terms of, what analytics you're trying to perform, what you're tracking? >> Yeah, well, you're right. There are different steps in our funnel. The impression request expands out by a factor of a dozen as we send it to all the different potential bidders. We track all that data, the responses come back, we track that, we track our decisions and why we selected the bidder. And then, once the ad is shown, of course there's various beacons and tracking things that fire. We have to track all of that data, and the only way we could make sense out of our business is by bringing all that data together. And in a way that is reliable, transparent, and visible, and also has data integrity. That's another thing I like about the Snowflake database, it's a good old-fashioned SQL database where I can declare my primary keys, I can run QC checks, I can ensure high data integrity that is demanded by BI and other sorts of analytics. >> What would be, as you continue to push the boundaries of the ad tech service, what's some functionality that you're looking to add, and Snowflake as your partner, either that's in there now that you still need to take advantage of, or things that you're looking to in the future? >> Well, moving forward, of course, it's very important for us to be able to quickly gauge the effectiveness of new products. The ad tech market is fast-changing, there's always new ways of bidding, new products that are being developed, new ways for the ad ecosystem to work. And so, as we roll those out, we need to be able to quickly analyze, you know, "Is this thing working or not?" You know, kind of an agile environment, pivot or prove it. Does this feature work or not? So, having all the data in one place makes that possible for that very quick assessment of the viability of a new feature, new product. >> And, dropping down a little under the covers for how that works, does that mean, like, you still have the base JSON data that you've absorbed, but you're going to expose it with different schemas or access patterns? >> Yeah, indeed.
For instance, we make use of the SQL schemas, roles, and permissions internally, where we can have the different teams have their own domain of data that they can expose internally. And looking forward, there's the data sharehouse feature of Snowflake that we're looking to implement with our partners, where, rather than sending them data, like a daily dump of data, we can give them access to their data in our database through this top layer that Michael mentioned, the services layer, which essentially allows me to create a view and grant select on it to another customer. So I no longer have to send daily data dumps to partners or have some sort of API for getting data. They can simply query the data themselves, so we'll be implementing that feature with our major partners. >> I would be remiss in not asking at a data conference like this, now that there's the tie-in with Qubole and Spark integration and machine learning, is there anything along that front that you're planning to exploit in the near future? >> Well, yeah, at Sharethrough we're very experimental, playful, we're always examining new data technologies and new ways of doing things, but now with Snowflake as sort of our data warehouse of curated data, I've got two petabytes of data with referential integrity, and that is reliable. We can move forward into our other analyses and other uses of data knowing that we have captured every event exactly once, and we know exactly where it fits in a business context, in a relational manner. It's clean, good data integrity, reliable, accessible, visible, and it's just plain old SQL. (chuckles)
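As a concrete illustration of that "plain old SQL" over JSON: below is a sketch of how an analyst might query nested bid-request documents in Snowflake. The table and field names are invented and the connection parameters are placeholders; the colon/dot path syntax, casts, and LATERAL FLATTEN follow Snowflake's documented semi-structured query features.

```python
import snowflake.connector

# Placeholder credentials; a real session would use the analyst's own.
conn = snowflake.connector.connect(
    user="ANALYST", password="...", account="myorg-myaccount"
)
cur = conn.cursor()

# `payload` is assumed to be a VARIANT column holding one bid-request
# JSON document per row, with `payload:bids` an array of bid objects.
cur.execute("""
    SELECT
        payload:exchange::STRING   AS exchange,
        payload:bid.price::FLOAT   AS bid_price,
        b.value:bidder_id::STRING  AS bidder_id
    FROM bid_requests,
         LATERAL FLATTEN(input => payload:bids) b
    WHERE payload:bid.price::FLOAT > 1.0
    LIMIT 10
""")
for row in cur.fetchall():
    print(row)
conn.close()
```

The dot-path expressions drill into the nested JSON directly, and FLATTEN turns a nested array into rows, which is what lets a first-day analyst work in ordinary SQL instead of writing Scala.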

Published Date : Mar 9 2018

SUMMARY :

Brought to you by SiliconANGLE Media some of the most advance these cases we have now a certain level of expectation you want out of the cloud. concrete, some of the rethinking. Storage and the compute, and then Snowflake's taken it and unique. that have completed the job, you spun them up, Very different from the on-prem model where, as you and you want to scale your capacity, chances are You have to bring up the environment, so you have to allow You paid for that hardware, but you don't really need it. of richness and detail, such that the advertisers can So, let me, as I'm listening to you describe this. of drilling down the JSON tree with dot path notation, I'm guessing that means you have quite a few times more, I like about the Snowflake database analyze, you know, "Is this thing working or not?" the service layer, essentially allows me to create and that is reliable. and then now you have more you can now move those workloads that you're accustomed to at the Strata Data Conference, thanks.

SENTIMENT ANALYSIS :

ENTITIES

EntityCategoryConfidence
DavidPERSON

0.99+

George GilbertPERSON

0.99+

David AbercrombiePERSON

0.99+

Michael NixonPERSON

0.99+

MichaelPERSON

0.99+

JuneDATE

0.99+

twoQUANTITY

0.99+

SiliconANGLE MediaORGANIZATION

0.99+

San JoseLOCATION

0.99+

ScalaTITLE

0.99+

firstQUANTITY

0.99+

Silicon ValleyLOCATION

0.99+

five-minuteQUANTITY

0.99+

SnowflakeTITLE

0.99+

ChristmasEVENT

0.98+

Strata Data ConferenceEVENT

0.98+

three-layerQUANTITY

0.98+

first dayQUANTITY

0.98+

a dozenQUANTITY

0.98+

two petabytesQUANTITY

0.97+

SharethroughORGANIZATION

0.97+

JSONTITLE

0.97+

SQLTITLE

0.96+

one placeQUANTITY

0.95+

six monthsQUANTITY

0.94+

Forager Tasting Room & EateryORGANIZATION

0.91+

todayDATE

0.89+

SnowflakeORGANIZATION

0.87+

SparkTITLE

0.87+

12 billion impressions a monthQUANTITY

0.87+

Machine LearningTITLE

0.84+

Big DataORGANIZATION

0.84+

billions of impressionsQUANTITY

0.8+

CuBOLTITLE

0.79+

Big Data SV 2018EVENT

0.77+

onceQUANTITY

0.72+

theCUBEORGANIZATION

0.63+

JSONsTITLE

0.61+

timesQUANTITY

0.55+

Satyen Sangani, Alation | Big Data SV 2018


 

>> Announcer: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. (upbeat music) >> Welcome back to theCUBE, I'm Lisa Martin with John Furrier. We are covering our second day of our event Big Data SV. We've had some great conversations, John, yesterday, today as well. Really looking at Big Data, digital transformation, Big Data, plus data science, lots of opportunity. We're excited to welcome back to theCUBE an alum, Satyen Sangani, the co-founder and CEO of Alation. Welcome back! >> Thank you, it's wonderful to be here again. >> So you guys finished up your fiscal year at the end of December 2017; we're in the first quarter of 2018. You guys had some really strong results, really strong momentum. >> Yeah. >> Tell us what's going on at Alation, how are you pulling this momentum through 2018? >> Well, I think we have had an enterprise-focused business historically, because we solve a very complicated problem for very big enterprises, and so, in the last quarter we added customers like American Express, PepsiCo, Roche. And with huge expansions from our existing customers, some of whom, over the course of a year, I think went 12x from an initial base. And so, we found some just incredible momentum in Q4 and for us that was a phenomenal cap to a great year. >> What about the platform you guys are doing? Can you just take a minute to explain what Alation does again, just to refresh where you are on the product side? You mentioned some new accounts, some new use cases. >> Yeah. >> What's the update? Take a minute, talk about the update. >> Absolutely, so, you certainly know, John, but Alation's a data catalog, and a data catalog essentially, you can think of it as Yelp or Amazon for data and information inside of the enterprise. So if you think about how many different databases there are, how many different reports there are, how many different BI tools there are, how many different APIs there are, how many different algorithms there are, it's pretty dizzying for the average analyst. It's pretty dizzying for the average CIO. It's pretty dizzying for the average chief data officer. And particularly, inside of Fortune 500s where you have hundreds of thousands of databases, you have a situation where people just have too much noise, not enough signal. And so what we do is we provide this Yelp for that information. You can come to Alation as a catalog. You can do a search on "revenue 2017." You'll get all of the reports, all of the dashboards, all of the tables, all of the people that you might need to be able to find. And that gives you a single place of reference, so you can understand what you've got and what can answer your questions. >> What's interesting is, first of all, I love data. We're data driven, we're geeks on data. But when I start talking to folks that are outside the geek community or nerd community, you say data and they go, "Oh," because they cringe and they say, "Facebook." They see the data issues there. GDPR, data nightmare, where's it stored, you got to manage it. And then, people are actually using data, so they're realizing how hard (laughs) it is. >> Yeah. >> How much data do we have? So it's kind of like a trough of disillusionment, if you will. Now they got to get their hands on it. They've got to put it to work. >> Yeah. >> And they know that. So, it's now becoming really hard (laughs) in their mind. This is business people. >> Yeah. >> They have data everywhere.
How do you guys talk to that customer? Because, if you don't have quality data, if you don't have data you can trust, if you don't have the right people, it's hard to get it going. >> Yeah. >> How do you guys solve that problem and how do you talk to customers? >> So we talk a lot about data literacy. There is a lot of data in this world and that data is just emblematic of all of the stuff that's going on in this world. There's lots of systems, there's lots of complexity and the data, basically, just is about that complexity. Whether it's weblogs, or sensors, or the like. And so, you can either run away from that data, and say, "Look, I'm going to not, "I'm going to bury my head in the sand. "I'm going to be a business. "I'm just going to forget about that data stuff." And that's certainly a way to go. >> John: Yeah. >> It's a way to go away. >> Not a good outlook. >> I was going to say, is that a way of going out of business? >> Or, you can basically train, it's a human resources problem fundamentally. You've got to train your people to understand how to use data, to become data literate. And that's what our software is all about. That's what we're all about as a company. And so, we have a pretty high bar for what we think we do as a business and we're this far into that. Which is, we think we're training people to use data better. How do you learn to think scientifically? How do you go use data to make better decisions? How do you build a data driven culture? Those are the sorts of problems that I'm excited to work on. >> Alright, now take me through how you guys play out in an engagement with the customer. So okay, that's cool, you guys can come in, we're getting data literate, we understand we need to use data. Where are you guys winning? Where are you guys seeing some visibility, both in terms of the traction of the usage of the product, the use cases? Where is it kind of coming together for you guys? >> Yeah, so we literally, we have a mantra. I think any early stage company basically wins because they can focus on doing a couple of things really well. And for us, we basically do three things. We allow people to find data. We allow people to understand the data that they find. And we allow them to trust the data that they see. And so if I have a question, the first place I start is, typically, Google. I'll go there and I'll try to find whatever it is that I'm looking for. Maybe I'm looking for a Mediterranean restaurant on 1st Street in San Jose. If I'm going to go do that, I'm going to do that search and I'm going to find the thing that I'm looking for, and then I'm going to figure out, out of the possible options, which one do I want to go to. And then I'll figure out whether or not the one that has seven ratings is the one that I trust more than the one that has two. Well, data is no different. You're going to have to find the data sets. And inside of companies, there could be 20 different reports and there could be 20 different people who have information, and so you're going to trust those people through having context and understanding. >> So, trust, people, collaboration. You mentioned some big brands that you guys added towards the end of calendar 2017. How do you facilitate these conversations with maybe the chief data officer. As we know, in large enterprises, there's still a lot of ownership over data silos. >> Satyen: Yep. >> What is that conversation like, as you say on your website, "The first data catalog designed for collaboration"? 
How do you help these organizations as large as Coca-Cola understand where all the data are and enable the human resources to extract value, and find it, understand it, and trust it? >> Yeah, so we have a very simple hypothesis, which is, look, people fundamentally have questions. They're fundamentally curious. So, what you need to do as a chief data officer, as a chief information officer, is really figure out how to unlock that curiosity. Start with the most popular data sets. Start with the most popular systems. Start with the business people who have the most curiosity and the most demand for information. And oh, by the way, we can measure that. Which is the magical thing that we do. So we can come in and say, "Look, "we look at the logs inside of your systems to know "which people are using which data sets, "which sources are most popular, which areas are hot." Just like a social network might do. And so, just like you can say, "Okay, these are the trending restaurants," we can say, "These are the trending data sets." And that curiosity allows people to know, what data should I document first? What data should I make available first? What data do I improve the data quality over first? What data do I govern first? And so, in a world where you've got tons of signal, tons of systems, it's totally dizzying to figure out where you should start. But what we do is, we go to these chief data officers and say, "Look, we can give you a tool and a catalyst so "that you know where to go, "what questions to answer, who to serve first." And you can use that to expand to other groups in the company. >> And this is interesting, a lot of people, you mentioned social networks, use data to optimize for something, and in the case of Facebook, they use my data to target ads for me. You're using data to actually say, "This is how people are using the data." So you're using data for data. (laughs) >> That's right. >> So you're saying-- >> Satyen: We're measuring how you can use data. >> And that's interesting because, I hear a lot of stories like, we bought a tool, we never used it. >> Yep. >> Or people didn't like the UI, just kind of falls on the side. You're looking at it and saying, "Let's get it out there and let's see who's using the data." And then, are you doubling down? What happens? Do I get a little star, do I get a reputation point, am I being flagged to HR as a power user? How are you guys treating that gamification in this way? It's interesting, I mean, what happens? Do I become like-- >> Yeah, so it's funny because, when you think about search, how do you figure out that something's good? So what Google did is, they came along and they said, "We've got PageRank." What we're going to do is we're going to say, "The pages that are the best pages are the ones "that people link to most often." Well, we can do the same thing for data. The data sources that are the most useful are the ones that are used most often. Now on top of that, you can say, "We're going to have experts put ratings," which we do. And you can say people can contribute knowledge and reviews of how this data set can be used. And people can contribute queries and reports on top of those data sets. And all of that gives you this really rich graph, this rich social graph, so that now when I look at something it doesn't look like Greek.
It looks like, "Oh, well I know Lisa used this data set, "and then John used it "and so at least it must answer some questions "that are really intelligent about the media business "or about the software business. "And so that can be really useful for me "if I have no clue as to what I'm looking at." >> So the problem that you-- >> It's on how you demystify it through the social connections. >> So the problem that you solve, if what I hear you correctly, is that you make it easy to get the data. So there's some ease of use piece of it, >> Yep. >> cataloging. And then as you get people using it, this is where you take the data literacy and go into operationalizing data. >> Satyen: That's right. >> So this seems to be the challenge. So, if I'm a customer and I have a problem, the profile of your target customer or who your customers are, people who need to expand and operationalize data, how would you talk about it? >> Yeah, so it's really interesting. We talk about, one of our customers called us, sort of, the social network for nerds inside of an enterprise. And I think for me that's a compliment. (John laughing) But what I took from that, and when I explained the business of Alation, we start with those individuals who are data literate. The data scientists, the data engineers, the data stewards, the chief data officer. But those people have the knowledge and the context to then explain data to other people inside of that same institution. So in the same way that Facebook started with Harvard, and then went to the rest of the Ivies, and then went to the rest of the top 20 schools, and then ultimately to mom, and dad, and grandma, and grandpa. We're doing the exact same thing with data. We start with the folks that are data literate, we expand from there to a broader audience of people that don't necessarily have data in their titles, but have curiosity and questions. >> I like that on the curiosity side. You spent some time up at Strata Data. I'm curious, what are some of the things you're hearing from customers, maybe partners? Everyone used to talk about Hadoop, it was this big thing. And then there was a creation of data lakes, and swampiness, and all these things that are sort of becoming more complex in an organization. And with the rise of myriad data sources, the velocity, the volume, how do you help an enterprise understand and be able to catalog data from so many different sources? Is it that same principle that you just talked about in terms of, let's start with the lowest hanging fruit, start making the impact there and then grow it as we can? Or is an enterprise needs to be competitive and move really, really quickly? I guess, what's the process? >> How do you start? >> Right. >> What do people do? >> Yes! >> So it's interesting, what we find is multiple ways of starting with multiple different types of customers. And so, we have some customers that say, "Look, we've got a big, we've got Teradata, "and we've got some Hadoop, "and we've got some stuff on Amazon, "and we want to connect it all." And those customers do get started, and they start with hundreds of users, in some case, they start with thousands of users day one, and they just go Big Bang. And interestingly enough, we can get those customers enabled in matters of weeks or months to go do that. We have other customers that say, "Look, we're going to start with a team of 10 people "and we're going to see how it grows from there." And, we can accommodate either model or either approach. 
From our perspective, you just have to have the resources and the investment corresponding to what you're trying to do. If you're going to say, "Look, we're going to have two dollars of budget, and we're not going to have the human resources and the stewardship resources behind it," it's going to be hard to do the Big Bang. But if you're going to put the appropriate resources up behind it, you can do a lot of good. >> So, you can really facilitate the whole go big or go home approach, as well as the let's start small, think fast approach. >> That's right, and we always, actually ironically, recommend the latter. >> Let's start small, think fast, yeah. >> Because everybody's got a bigger appetite than they do the ability to execute. And what's great about the tool, and what I tell our customers and our employees all day long is, there's only one metric I track. So year over year, for our business, we basically grow in accounts, net of churn, by 55%. Year over year, and that's actually up from the prior year. And so from my perspective-- >> And what does that mean? >> So what that means is, the same customer gave us 55 cents more on the dollar than they did the prior year. Now that's best in class for most software businesses that I've heard. But what matters to me is not so much that growth rate in and of itself. What it means to me is this, that nobody's come along and said, "I've mastered my data. I understand all of the information side of my company. Every person knows everything there is to know." That's never been said. So if we're solving a problem where customers are saying, "Look, we get, and we can find, and understand, and trust data, and we can do that better this year than we did last year, and we can do it even more with more people," we're going to be successful. >> What I like about what you're doing is, you're bringing an element of operationalizing data for literacy and for usage. But you're really bringing this notion of a humanizing element to it. Where you see it in security, you see it in emerging ecosystems. Where there's a community of data people who know how hard it is and was, and it seems to be getting easier. But the tsunami of new data coming in, IoT data, whatever, and new regulations like GDPR. These are all more surface area problems. But there's a community coming together. How have you guys seen your product create community? Have you seen any data on that, 'cause it sounds like, as people get networked together, the natural outcome of that is possibly the usage you attract. But is there a community vibe that you're seeing? Is there an internal collaboration where they sit, they're having meet ups, they're having lunches. There's a social aspect and a human aspect. >> No, it's human, and it's amazing. So in really subtle but really, really powerful ways. So one thing that we do for every single data source or every single report that we document, we just put who the top users of this particular thing are. So really subtly, day one, you're like, "I want to go find a report. I don't even know where to go inside of this really mysterious system." Post-Alation, you're able to say, "Well, I don't know where to go, but at least I can go call up John or Lisa," and say, "Hey, what is it that we know about this particular thing?" And I didn't have to know them. I just had to know that they had this report and they had this intelligence. So by just discovering people and who they are, you pick up on what people know.
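Sangani's "PageRank for data" idea — rank data assets by how often they actually show up in query logs — is simple to prototype. The sketch below illustrates the technique only, not Alation's implementation; the log format and table names are hypothetical stand-ins.

import re
from collections import Counter

def rank_tables(query_log_lines, top_n=5):
    # Tables referenced in FROM/JOIN clauses count as "used";
    # the most-used tables are the trending data sets.
    pattern = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)
    counts = Counter()
    for line in query_log_lines:
        counts.update(t.lower() for t in pattern.findall(line))
    return counts.most_common(top_n)

# Hypothetical log excerpt: one executed query per line.
log = [
    "SELECT * FROM sales.orders JOIN sales.customers ON ...",
    "SELECT region, SUM(amt) FROM sales.orders GROUP BY region",
    "SELECT * FROM hr.employees",
]
print(rank_tables(log))
# [('sales.orders', 2), ('sales.customers', 1), ('hr.employees', 1)]

A real catalog would fold in the expert ratings and contributed reviews Sangani mentions, but usage counts alone already give the "trending data sets" signal he describes.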
>> So people are the new Google results, so you mentioned Google PageRank, which is web pages and relevance. You're taking a much more people approach to relevance. >> Satyen: That's right. >> To the data itself. >> That's right, and that builds community in very, very clear ways, because people have curiosity. Other people are the mechanism by which they satisfy that curiosity. And so that community builds automatically. >> They pay it forward, they know who to ask for help. >> That's right. >> Interesting. >> That's right. >> Last question, Satyen. The tag line, first data catalog designed for collaboration, is there a customer that comes to mind for you as really one that articulates that point exactly? Where Alation has come in and really kicked open the door, in terms of facilitating collaboration. >> Oh, absolutely. I was literally this morning talking to one of our customers, Munich Reinsurance, the largest reinsurance company in the world. Their chief data officer said, "Look, three years ago, we started with 10 people working on data. Today, we've got hundreds. Our aspiration is to get to thousands." We have three things that we do. One is, we actually discover insights. It's actually the smallest part of what we do. The second thing that we do is, we enable people to use data. And the third thing that we do is, drive a data driven culture. And for us, it's all about scaling knowledge, to centers in China, to centers in North America, to centers in Australia. And they've been doing that at scale. And they go to each of their people and they say, "Are you a data black belt, are you a data novice?" It's kind of like skiing. Are you a blue diamond or a black diamond? >> Always ski in pairs (laughs) >> That's right. >> And they do ski in pairs. And what they end up ultimately doing is saying, "Look, we're going to train all of our workforce to become better, so that in three to 10 years, we're recognized as one of the most innovative insurance companies in the world." Three years ago, that was not the case. >> Process improvement at a whole other level. My final question for you is, for the folks watching or the folks that are going to watch this video, that could be a potential customer of yours, what are they feeling? If I'm the customer, what smoke signals am I seeing that say, I need to call Alation? What are some of the things that you've found that would tell a potential customer that they should be talkin' to you guys? >> Look, I think that they've got to throw out the old playbook. And this was a point that was made by some folks at a conference that I was at earlier this week. But they basically were saying, "Look, the DLNA's playbook was all about providing the right answer." Forget about that. Just allow people to ask the right questions. And if you let people's curiosity guide them, people are industrious, and ambitious, and innovative enough to go figure out what they need to go do. But if you see this as a world of control, where I'm going to just figure out what people should know and tell them what they're going to know, that's going to be a pretty poor career to choose, because data's all about, sort of, freedom and innovation and understanding. And we're trying to push that along. >> Satyen, thanks so much for stopping by >> Thank you. >> and sharing how you guys are helping organizations, enterprises unlock data curiosity. We appreciate your time. >> I appreciate the time too. >> Thank you. >> And thanks John! >> And thank you.
>> Thanks for co-hosting with me. For John Furrier, I'm Lisa Martin, you're watching theCUBE live from our second day of coverage of our event Big Data SV. Stick around, we'll be right back with our next guest after a short break. (upbeat music)

Published Date : Mar 9 2018

Ian Swanson, DataScience.com | Big Data SV 2018


 

(royal music) >> Announcer: John Cleese. >> There's a lot of people out there who have no idea what they're doing, but they have absolutely no idea that they have no idea what they're doing. Those are the ones with the confidence and stupidity who finish up in power. That's why the planet doesn't work. >> Announcer: Knowledgeable, insightful, and a true gentleman. >> The guy at the counter recognized me and said... Are you listening? >> John Furrier: Yes, I'm tweeting away. >> No, you're not. >> I tweet, I'm tweeting away. >> He is kind of rude that way. >> You're on your (bleep) keyboard. >> Announcer: John Cleese joins the Cube alumni. Welcome, John. >> John Cleese: Have you got any phone calls you need to answer? >> John Furrier: Hold on, let me check. >> Announcer: Live from San Jose, it's the Cube, presenting Big Data Silicon Valley, brought to you by Silicon Angle Media and its ecosystem partners. (busy music) >> Hey, welcome back to the Cube's continuing coverage of our event, Big Data SV. I'm Lisa Martin with my co-host, George Gilbert. We are down the street from the Strata Data Conference. This is our second day, and we've been talking all things big data, cloud data science. We're now excited to be joined by the CEO of a company called Data Science, Ian Swanson. Ian, welcome to the Cube. >> Thanks so much for having me. I mean, it's been an awesome two days so far, and it's great to wrap up my trip here on the show. >> Yeah, so, tell us a little bit about your company, Data Science, what do you guys do? What are some of the key opportunities for you guys in the enterprise market? >> Yeah, absolutely. My company's called datascience.com, and what we do is we offer an enterprise data science platform where data scientists get to use all the tools they love in all the languages, all the libraries, leveraging everything that is open source to build models and put models in production. Then we also provide IT the ability to be able to manage this massive stack of tools that data scientists require, and it all boils down to one thing, and that is, companies need to use the data that they've been storing for years. It's about, how do you put that data into action. We give the tools to data scientists to get that data into action. >> Let's drill down on that a bit. For a while, we thought if we just put all our data in this schema-on-read repository, that would be nirvana. But it wasn't all that transparent, and we recognized we have to sort of go in and structure it somewhat, help us take the next couple steps. >> Ian: Yeah, the journey. >> From these partially curated data sets to something that turns into a model that is actionable. >> That's actually been the theme in the show here at the Strata Data Conference. If we went back years ago, it was, how do we store data. Then it was, how do we not just store and manage, but how do we transform it and get it into a shape in which we can actually use it. The theme of this year is how do we get it to that next step, the next step of putting it into action. To layer onto that, data scientists need to access data, yes, but then they need to be able to collaborate, work together, apply many different techniques, machine learning, AI, deep learning, these are all techniques of a data scientist to be able to build a model. But then there's that next step, and the next is, hey, I built this model, how do I actually get it in production? How does it actually get used? Here's the shocking thing.
I was at an event where there were 500 data scientists in the audience, and I said, "Stand up if you worked on a model for more than nine months and it never went into production." 90% of the audience stood up. That's the last mile that we're all still working on, and what's exciting is, we can make it possible today. >> Wanting to drill down into the sort of, it sounds like there's a lot of choice in the tools. But typically, to do a pipeline, you either need well established APIs that everyone understands and plugs together with, or you need an end to end sort of single vendor solution that becomes the sort of collaboration backbone. How are you organized, how are you built? >> This might be self-serving, but datascience.com, we have an enterprise data science platform, we recommend a unified platform for data science. Now, that unified platform needs to be highly configurable. You need to make it so that in that workbench, you can use any tool that you want. Some data scientists might want to use a hammer, others want to be able to use a screwdriver over here. The power is how configurable it is, how extensible it is, how much open source you can adopt. The amazing trend that we've seen has been from proprietary solutions, going back decades, to now the rise of open source. Dozens if not hundreds of new machine learning libraries are being released every single day. We've got to give those capabilities to data scientists and make them scale. >> OK, so I think it's pretty easy to see how you would incorporate new machine learning libraries into a pipeline. But then there's also the tools for data preparation, and for like feature extraction and feature engineering, you might even have some tools that help you with figuring out which algorithm to select. What holds all that together? >> Yeah, so orchestrating the enterprise data science stack is the hardest challenge right now. There has to be a company like us that is the glue, that is not just, do these solutions work together, but also, how do they collaborate, what is that workflow? What are those steps in that process? There's one thing that you might have left out, and that is, model deployment, model interpretation, model management. >> George: That's the black art, yeah. >> That's where this whole thing is going next. That was the exciting thing that I heard in terms of all these discussions with business leaders throughout the last two days: model deployment, model management. >> If I can kind of take this to maybe shift the conversation a little bit to the target audience. Talked a lot about data scientists and needing to enable them. I'm curious about, we just talked with, a couple of guests ago, about the chief data officer. How, you work with enterprises, how common is the chief data officer role today? What are some of the challenges they've got that datascience.com can help them to eliminate? >> Yeah, the CIO and the chief data officer, we have CIOs that have been selecting tools for companies to use, and now the chief data officer is sitting down with the CEO and saying, "How do we actually drive business results?" We work very closely with both of those personas. But on the CDO side, it's really helping them educate their teams on the possibilities of what could be realized with the data at hand, and making sure that IT is enabling the data scientists with the right tools.
We supply the tools, but we also like to go in there with our customers and help coach, help educate on what is possible, and that helps with the CDO's mission. >> A question along that front. We've been talking about sort of empowering the data scientist, and really, from one end of the modeling life cycle all the way to the end, the deployment, which is currently the hardest part and least well supported. But we also have tons of companies that don't have data science trained people, or who are only modestly familiar. Where do, what do we do with them? How do we get those companies into the mainstream in terms of deploying this? >> I think whether you're a small company or a big company, digital transformation is the mandate. Digital transformation is not just, how do I make a taxi company become Uber, or how do I make a speaker company become Sonos, the smart speaker, it's how do I exploit all the sources of my data to get better and improved operational processes, new business models, increased revenue, reduced operation costs. You could start small, and so we work with plenty of smaller companies. They'll hire a couple data scientists, and they're able to do small quick wins. You don't have to go sit in the basement for a year having something that is the thing, the unicorn in the business, it's small quick wins. Now we, my company, we believe in writing code, in trained, educated data scientists. There are solutions out there that you throw data at, you push a button, it gets an output. It's this magic black box. There's risk in that. Model interpretation, what are the features it's scoring on, there's risk, but those companies are seeing some level of success. We firmly believe, though, in hiring a data science team that is trained, you can start small, two or three, and get some very quick wins. >> I was going to say, those quick wins are essential for survivability, like digital transformation is essential, but it's also, I mean, about survival at a minimum, right? >> Ian: Yes. >> Those quick wins are presumably transformative to an enterprise being able to sustain, and then eventually, or ideally, be able to take market share from their competition. >> That is key for the CDO. The CDO is there pitching what is possible, he's pitching, she's pitching the dream. In order to be able to help visualize what that dream and the outcome could be, we always say, start small, quick wins, then from there, you can build. What you don't want to do is go nine months working on something and you don't know if there's going to be an outcome. A lot of data science is trial and error. This is science, we're testing hypotheses. There's not always an outcome there, so small quick wins is something we highly recommend. >> A question, one of the things that we see more and more is the idea that actionable insights are perishable, and that latency matters. In fact, you have a budget for latency, almost, like in that short amount of time, the more sort of features that you can dynamically feed into a model to get a score, are you seeing more of that? How are the use cases that you're seeing, how's that pattern unfolding? >> Yeah, so we're seeing more streaming data use cases. We work with some of the biggest technology companies in the world, so IoT, connected services, streaming real time decisions that are happening. But then, also, there are so many use cases around the org, that could be marketing, finance, HR related, not just tech related.
On the marketing side, imagine if you're in customer service, and somebody calls you, and you know instantly the lifetime value of that customer, and it kicks off a totally new talk track, maybe gets escalated immediately to a new supervisor, because that supervisor can handle this top tier customer. These are decisions that can happen in real time leveraging machine learning models, and these are things that, again, are small quick wins, but massive, massive impact. It's about the decision process now. That's digital transformation. >> OK. Are you seeing patterns in terms of how much horsepower customers are budgeting for the training process, creating the model? Because we know it's very compute intensive, like, even Intel, some people call it, like, high performance compute, like a supercomputer type workload. How much should people be budgeting? Because we don't see any guidelines or rules of thumb for this. >> I still think the boundaries are being worked out. There's a lot of great work that Nvidia's doing with GPUs, we're able to do things faster on compute power. But even if we just start from the basics, if you go and talk to a data scientist at a massive company where they have a team of over 1,000 data scientists, and you say to do this analysis, how do you spin up your compute power? Well, I go walk over to IT and I knock on the door, and I say, "Set up this machine, set up this cluster." That's ridiculous. A product like ours is able to instantly give them the compute power, scale it elastically with our cloud service partners or work with on-prem solutions to be able to say, get the power that you need to get the results in the time that's needed, quick, fast. In terms of the boundaries of the budget, that's still being defined. But at the end of the day, we are seeing return on investment, and that's what's key. >> Are you seeing a movement towards a greater scope of integration for the data science tool chain? Or is it that at the high end, where you have companies with 1,000 data scientists, they know how to deal with specialized components, whereas, when there's perhaps a smaller pool of expertise, the desire for end to end integration is greater? >> I think there's this kind of thought that is not necessarily right, and that is, if you have a bigger data science team, you're more sophisticated. We actually see the same sophistication level in a 1,000 person data science team, in many cases, as in a 20 person data science team, and sometimes the inverse, I mean, it's kind of crazy. But it's, how do we make sure that we give them the tools so they can drive value. Tools need to include collaboration and workflow, not just hammers and nails, but how do we work together, how do we scale knowledge, how do we get it in the hands of the line of business so they can use the results. It's that that is key. >> That's great, Ian. I also like that you really kind of articulated that starting small with quick wins can make a massive impact. We want to thank you so much for stopping by the Cube and sharing that, and what you guys are doing at Data Science to help enterprises really take advantage of the value that data can really deliver. >> Thanks so much for having datascience.com on, really appreciate it. >> Lisa: Absolutely. George, thank you for being my co-host. >> You're always welcome. >> We want to thank you for watching the Cube. I'm Lisa Martin with George Gilbert, and we are at our event Big Data SV on day two. Stick around, we'll be right back with our next guest after a short break. (busy music)
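Both threads of this conversation — the "last mile" of model deployment and the real-time lifetime-value scoring example — come down to the same generic pattern: a trained model wrapped in a callable scoring service. Below is a minimal sketch with scikit-learn and Flask; it is not datascience.com's platform, and the features, threshold, and endpoint name are hypothetical stand-ins.

import numpy as np
from flask import Flask, request, jsonify
from sklearn.linear_model import LinearRegression

# Toy lifetime-value model; in practice the model is trained and
# validated offline, then loaded here from a registry or artifact store.
X = np.array([[12, 50.0], [24, 120.0], [48, 300.0], [60, 400.0]])  # tenure months, avg monthly spend
y = np.array([1_200.0, 4_000.0, 15_000.0, 26_000.0])               # observed lifetime value
model = LinearRegression().fit(X, y)

app = Flask(__name__)

@app.route("/score", methods=["POST"])
def score():
    # Expects JSON like {"tenure_months": 48, "avg_monthly_spend": 310.0}
    event = request.get_json()
    ltv = float(model.predict([[event["tenure_months"], event["avg_monthly_spend"]]])[0])
    # Routing decision of the kind described above: escalate high-value callers.
    action = "escalate_to_senior_rep" if ltv >= 10_000.0 else "standard_queue"
    return jsonify({"lifetime_value": ltv, "action": action})

if __name__ == "__main__":
    app.run(port=8080)  # POST caller features to http://localhost:8080/score

Once a model sits behind an endpoint like this, the deployment, monitoring, and model-management questions raised in the interview become routine operational concerns rather than a nine-month dead end.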

Published Date : Mar 8 2018


Ziya Ma, Intel | Big Data SV 2018


 

>> Live from San Jose, it's theCUBE! Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to theCUBE. Our continuing coverage of our event, Big Data SV. I'm Lisa Martin with my co-host George Gilbert. We're down the street from the Strata Data Conference, hearing a lot of interesting insights on big data. Peeling back the layers, looking at opportunities, some of the challenges, barriers to overcome, but also the plethora of opportunities that enterprises can take advantage of. Our next guest is no stranger to theCUBE, she was just on with me a couple days ago at the Women in Data Science Conference. Please welcome back to theCUBE, Ziya Ma. Vice President of the Software and Services Group and the Director of Big Data Technologies at Intel. Hi Ziya! >> Hi Lisa. >> Long time, no see. >> I know, it was just really two to three days ago. >> It was, well and now I can say happy International Women's Day. >> The same to you, Lisa. >> Thank you, it's great to have you here. So as I mentioned, we are down the street from the Strata Data Conference. You've been up there over the last couple days. What are some of the things that you're hearing with respect to big data? Trends, barriers, opportunities? >> Yeah, so first it's very exciting to be back at the conference again. The one biggest trend, or one topic that's hit really hard by many presenters, is the power of bringing big data systems and data science solutions together. You know, we're definitely seeing in the last few years the advancement of big data and the advancement of data science, or, you know, machine learning and deep learning, truly pushing forward business differentiation and improving our life quality. So that's definitely one of the biggest trends. Another thing I noticed is there was a lot of discussion on big data and data science getting deployed into the cloud. What are the learnings, what are the use cases? So I think that's another noticeable trend. And also, there were some presentations on doing the data science or having the business intelligence on the edge devices. That's another noticeable trend. And of course, there was discussion on security and privacy for data science and big data, so that continued to be one of the topics. >> So we were talking earlier, 'cause there's so many concepts and products to get your arms around. If someone is looking at AI and machine learning on the back end, you know, we'll worry about edge intelligence some other time, but we know that Intel has the CPU with the Xeon and then this lower power one with Atom. There's the GPU, there's ASICs, FPGAs, and then there are these software layers, you know, with higher abstraction layer, higher abstraction level. Help us put some of those pieces together for people who are like saying, okay, I know I've got a lot of data, I've got to train these sophisticated models, you know, explain this to me. >> Right, so Intel is a real solution provider for data science and big data. So at the hardware level, and George, as you mentioned, we offer a wide range of products from general purpose like Xeon to targeted silicon such as FPGAs and ASIC chips like Nervana. And also we provide adjacencies like networking hardware, non-volatile memory, and mobile. You know, those are the other adjacent products that we offer. Now on top of the hardware layer, we deliver a fully optimized software solution stack from libraries, frameworks, to tools and solutions.
So that we can help engineers or developers to create AI solutions with greater ease and productivity. For instance, we deliver the Intel-optimized Math Kernel Library. Leveraging the latest instruction sets gives significant performance boosts when you are running your software on Intel hardware. We also deliver frameworks like BigDL for Spark and big data customers who are looking for deep learning capabilities. We also optimize some popular open source deep learning frameworks like Caffe, like TensorFlow, MXNet, and a few others. So our goal is to provide all the necessary solutions so that in the end our customers can create the applications, the solutions that they really need to address their biggest pain points. >> Help us think about the maturity level now. Like, we know that the very most sophisticated internet service providers have been sort of all over this machine learning now for quite a few years. Banks, insurance companies, people who've had this. Statisticians and actuaries who have that sort of skillset are beginning to deploy some of these early production apps. Where are we in terms of getting this out to the mainstream? What are some of the things that have to happen? >> To get it to mainstream, there are so many things we could do. First I think we will continue to see the wide range of silicon products, but then there are a few things Intel is pushing. For example, we're developing this Nervana Graph compiler that will encapsulate the hardware integration details and present a consistent API for developers to work with. And this is one thing that we hope that we can eventually help the developer community with. And also, we are collaborating with the end user. Like, from the enterprise segment. For example, we're working with the financial services industry, we're working with the manufacturing sector and also customers from the medical field. And online retailers, trying to help them to deliver or create the data science and analytics solutions on Intel-based hardware or Intel-optimized software. So that's another thing that we do. And we're seeing actually very good progress in this area. Now we're also collaborating with many cloud service providers. For instance, we work with some of the top seven cloud service providers, both in the U.S. and also in China, to democratize not only our hardware, but also our libraries and tools, BigDL, MKL, and other frameworks and libraries, so that our customers, including individuals and businesses, can easily access those building blocks from the cloud. So definitely we're working from different angles. >> So last question in the last couple of minutes. Let's kind of vibe on this collaboration theme. Tell us a little bit about the collaboration that you're having with, you mentioned customers in some highly regulated industries, as an example. But a little bit to understand what's that symbiosis? What is Intel learning from your customers that's driving Intel's innovation of your technologies and big data? >> That's an excellent question. So Lisa, maybe I can start by sharing a couple of customer use cases, what kinds of solutions we help our customers address. I think it's always wise not to start a conversation with the customer on the technology that you deliver. You want to understand the customer's needs first. And then so that you can provide a solution that really addresses their biggest pain point rather than simply selling technology.
So for example, we have worked with an online retailer to better understand their customers' shopping behavior and to assess their customers' preferences and interests. And based upon that analysis, the online retailer made different product recommendations and maximized its customers' purchase potential. And it drove up the retailer's sales. You know, that's one type of use case that we have worked on. We also have partnered with customers from the medical field. Actually, today at the Strata Conference we actually had somebody highlighting, we had a joint presentation with UCSF where we helped the medical center to automate the diagnosis and grading of meniscus lesions. Until now, that has all been done manually by the radiologist, but now that entire process is automated. The result is much more accurate, much more consistent, and much more timely. Because you don't have to wait for the availability of a radiologist to read all the 3D MRI images. And that can all be done by machines. You know, so those are the areas where we work with our customers, understand their business need, and give them the solution they are looking for. >> Wow, the impact there. I wish we had more time to dive into some of those examples. But we thank you so much, Ziya, for stopping by twice in one week to theCUBE and sharing your insights. And we look forward to having you back on the show in the near future. >> Thanks, so thanks Lisa, thanks George for having me. >> And for my co-host George Gilbert, I'm Lisa Martin. We are live at Big Data SV in San Jose. Come down, join us for the rest of the afternoon. We're at this cool place called Forager Tasting and Eatery. We will be right back with our next guest after a short break. (electronic outro music)
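The Math Kernel Library point Ma makes is easy to see firsthand, since common NumPy distributions (for example, those shipped through Anaconda's defaults channel) are often linked against MKL. A small sketch for checking that linkage and timing a kernel MKL accelerates; the matrix size is arbitrary and the observed speedup will vary by machine and build:

import time
import numpy as np

# Show which BLAS/LAPACK backend this NumPy build is linked against;
# an MKL-linked build typically mentions "mkl" in this output.
np.__config__.show()

# Time a large matrix multiply, the kind of kernel MKL accelerates heavily.
n = 2000
a = np.random.rand(n, n)
b = np.random.rand(n, n)
start = time.perf_counter()
c = a @ b
print(f"{n}x{n} matmul took {time.perf_counter() - start:.3f}s")

Comparing the same script under an MKL-linked build and a generic BLAS build is the simplest way to observe the instruction-set gains described above.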

Published Date : Mar 8 2018
