AWS Heroes Panel | AWS Startup Showcase S2 E2 | Data as Code

>>Hi, everyone. Welcome to theCUBE's presentation of the AWS Startup Showcase. The theme of this episode is data as code, and this is season two, episode two of the ongoing series covering exciting startups from the ecosystem in cloud and the future of data analytics. I'm your host, John Furrier. We've got a great featured panel here with AWS Heroes: Lynn Langit, the CEO of Lynn Langit Consulting; Peter Hanssens, founder of Cloud Shuttle; and Alex DeBrie, principal of DeBrie Advisory. Great to see all of you here remotely, and I look forward to seeing you in person at the next re:Invent or other event.
>>Thanks for having us.
>>So Lynn, you're doing a lot of work in healthcare. Peter, you're in the middle of all the action with data as code. Alex, you're in deep on the databases. We've got a good roundup of topics here, ranging from healthcare to getting under the hood on databases. So Alex, we'll start with you. What are you working on right now? What trends do you see in the database space?
>>Yeah, sure. I do a lot of consulting work with different people, often with DynamoDB or general serverless technology. If you want to talk about trends I'm seeing right now, I would say it's a lot more serverless-native or cloud-native databases, where you're seeing these cool databases come out that really take advantage of this new cloud environment, right? You have scalability, you have the elasticity of the cloud. So you're not in instance-based environments anymore. You're paying for capacity, you're paying for throughput. You're able to scale up and down. You're not managing individual instances.
So there's a lot of cool stuff we're seeing with this new generation of infrastructure, and databases in particular are taking advantage of this new cloud world.
>>And you're really deep into the database side in terms of cloud native impact, diversity of database types, and when to use certain databases. That's also a big deal.
>>Yeah, absolutely. I totally agree. I love seeing the different types of databases, and AWS has this whole purpose-built database strategy. I think that makes a lot of sense. I wouldn't want to go too far with it. I would think more about purpose-built categories: specialize in an OLTP database within your organization, whether that's DynamoDB or DocumentDB or a relational database like Aurora. But then also choose some sort of analytics database, whether it's Druid or Redshift or Athena. And then, if you have some specialized needs, say you want to show some real-time stuff to your users, check out Rockset. If you want to do some graph analytics or fraud detection, check out TigerGraph. There's a lot of cool stuff that we're seeing from the Startup Showcase here.
>>Looking forward to unpacking that. Lynn, you've been involved in a lot of healthcare action with cloud ops, and the pandemic pushed hard on everybody. What are you working on?
>>Yeah, it's all COVID data all the time. Before the pandemic, I was supporting research groups for cancer genomics, which I still do, but what's impactful is the explosive data volumes. There's big data and then there's genomic data. I've worked with clients that have broken data centers, broken public cloud provider data centers, because of the daily volume they're putting in. So there's this volume aspect. And then there's collaboration, particularly around COVID research because of the pandemic.
And so you have this explosive volume, and you have this need for computational complexity. That means cloud. The challenge is, you know, put the pedal to the metal. You've got all these bioinformatics researchers who are used to a single machine, and suddenly they have to deal with distributed compute. So it's a wild time to be in this space.
>>What's the big change that you've seen with the pandemic, in cloud genomics specifically?
>>The amount of data that is being put into the public cloud. Previously, people would have their data on their local capacity, and then they would publish their paper, and the data may or may not become available for reproducing the research. To accelerate drug discovery and even variant identification, the data sets are being pushed to public cloud repositories, which brings a whole new set of concerns. You're not only dealing with volume and cost, but security. Federated security is non-trivial and not well understood by this domain. So there's so much work available here.
>>Awesome. Peter, you're doing a lot with the data-as-a-platform view and platform engineering. Data as code is something that's being kicked around. What are you working on, and how does platform engineering change as data becomes so much more prevalent in its value proposition?
>>Yeah. So I'm the founder of Cloud Shuttle, and we built this consultancy around the challenges that a lot of companies have with getting their data sorted, getting it organized, and getting it ready for other use cases such as analytics, machine learning, AI workloads, and the like.
Typically a platform engineering team looks after the organization of a company's infrastructure, making sure that it's coherent across the company. A data platform engineering team does something similar: they make sure that data teams have a solid foundation to build upon and that everything's quite predictable. What that enables is faster velocity and the ability to use data as code as a way of specifying and onboarding data, translating it, transforming it out into its specific domains, and then on to data products.
>>I have to ask you while you're here. There's a big trend around data meshes right now. We've had a lot on theCUBE about it. What are the practical ways people are using data mesh? First of all, is it relevant, and how are people looking at this data mesh conversation?
>>I think it becomes more and more relevant the bigger the organization you're dealing with. Oftentimes in the enterprise, you've got projects with timelines of five to 10 years, often outlasting technology life cycles. The technology that you're building on is probably irrelevant by the time you complete it. What we're seeing is that data engineering teams, and data teams more broadly, are this organizational bottleneck. Data mesh is all about breaking down that bottleneck and decentralizing the work, shifting it back onto development teams, who oftentimes have more of the context than a centralized data engineering team. And we're seeing a lot of velocity increases as a result of that.
>>It's interesting. There are so many different aspects of how data is changing the world. Lynn talks about the volume with the cloud and genomics. We're hearing about data engineering at a platform level. You're talking about slicing and dicing and real-time information. You mentioned Rockset, Alex.
So I'd like to ask each of you to answer this next question, which is: how have team dynamics changed with data engineering? Because every single company is impacted. If you're a researcher, Lynn, you're pumping more data into the cloud, and that's got a little bit of data engineering to it. Do they even understand that? Is it impacting them? How has data changed the responsibilities or roles in this new emerging area of data engineering, or whatever you want to call it? Lynn, we'll start with you. What impact do you see?
>>Well, DevOps becomes DataOps and MLOps, and this is a whole emergent area of work. It starts with an understanding of container technologies, which in verticals like fintech is a given, right? But in bioinformatics, building an appropriately optimized Docker container is something I'm still working on with customers, because they have the concept that a Docker container is just a virtual machine, which obviously it isn't, or shouldn't be. So you have, as I mentioned previously, this humongous skill gap. Concepts like CI/CD, which are prevalent in ad tech and fintech, aren't available yet for most of my customers. So those are the things that I'm building. The whole ops space is a wide open area, and really it's a question of practicality. I have a lot of experience with data lakes, containerizing, and using the data lake platform. But a lot of my customers are going to move to interim PaaS-based solutions. If they're using Spark, for example, they might use a managed Spark solution as an interim step up to the cloud before they build their own containers, because the amount of knowledge needed to do that effectively is non-trivial.
>>Peter, you mentioned data lakes and onboarding data into lakehouse architectures, for instance, something that you're familiar with.
This is not obvious to some verticals and obvious to others. What do you see as the data engineering impact from a personnel standpoint, and ultimately on how things get built?
>>Are you directing that to me?
>>Peter.
>>Yeah. So I think, first and foremost, the workload that data engineering teams are dealing with is ever increasing. Usually there's a 10x ratio of software engineers to data engineers within a business, and usually double the number of analysts to data engineers again. So they're fighting an ever-increasing backlog of tasks to do and tickets to churn through. What we're seeing is that data engineering teams are becoming data platform engineering teams, where they're building capability instead of constantly spinning on a hamster wheel, if you will. With that in mind, when onboarding data into a lakehouse architecture or a data lake, where data engineering teams are getting wins is in developing a very good baseline of structure: the categorization, the data tagging, whether this data is of a particular domain, whether it contains some PII data, for instance, and then the security aspects, and also the mechanisms with which to do the data transformations.
>>Alex, on the database side, those are known personas in an enterprise: the DBAs, the database team. But now the scale is so big, and there's so much going on in databases. How does data engineering impact organizations from your standpoint?
>>Yeah, absolutely. I think gone are the days when you have a single relational database that serves operational queries for your users and can also serve analytics queries for your internal teams. It's now split up into those purpose-built databases, like we've said.
But now you've got two different teams managing it, and they're designing their data models for different things. OLTP might have a more denormalized model, something that works for very fast operations and is optimized for that. But now you need to suck that data out and get it elsewhere so that your PM or your business analyst or whoever can crunch through some of it, and now it needs to be in a more normalized format. How do you bridge that gap? That's a tough one. I think you need to build empathy on each side for what the other side is doing, and build the tools to say, hey, OLTP team, this is going to help you if we know what users are actually doing, and if you can get us the data in the right format, then we can analyze it on the back end. So I think building empathy across those teams is helpful.
>>Lynn, I want to come back to you. You mentioned health and bioinformatics. It's interesting: I look at the database world and the solutions that are out there. A lot of companies that build data solutions don't have a data problem. They're not swimming in a lot of data. But then you look at the field that you're working in right now, with genomics and health and quantum, and they're dealing with data all the time. So you have people who deal with a lot of data all the time who are breaking new ground, and people who don't have that experience who are now becoming data-full, right? So either it's a first-time problem, or they've always been swimming in a ton of data. So it's either "what's the new playbook?" or "wow, I've never had to deal with a lot of data before." What's your take?
>>It's interesting, because bioinformatics hires grad students.
So grad students use their R scripts with a file on their laptop. And so getting those folks to understand distributed, container-based computing is, like I said, a non-trivial problem. What's been really interesting with the money pouring into COVID research is that when I first started, some of the workflows would take literally 500 hours, and that was just okay. Coming out of fintech, I was blown away. Fintech is like, could that please take a millisecond rather than a second? Right. And so what has now happened, which makes it even more fun to work in this domain, is that the research dollars have really gone up because of the pandemic. And so there's this blending of people like me, with more of a big data background, coming into bioinformatics and working side by side.
So it's this interesting sort of translation, because you have the whole taxonomy of bioinformatics, with genomics and sequencers and all the weird file types that you get. And then you have the whole taxonomy of DevOps and DataOps, containers and Kubernetes and all that. And you're trying to get that into pipelines that can actually be efficient, given the constraints. Of course, on the tech side, we always want to make it super optimized. I had a customer where we got it down from 500 hours to minutes, but they wanted to stay with the PaaS solution because it was easier for them. Going from 500 hours to five hours was good enough, but the techies want to get it down to five minutes.
>>We've seen this movie before: DevOps, edge and operations, the IoT world seeing the convergence of cultures. Now you have data and old-school operations kind of coming together. So this supports the thesis that data as code is the next infrastructure as code. What's the reaction there for you guys?
What do you think about that? What does data as code mean? If infrastructure as code was cloud and DevOps, what is data as code? What does that mean?
>>I could take it if you like. I think data teams have long been this bottleneck within the organization, and there's this dark matter of untapped energy and potential waiting to be unleashed. Data teams, with the advent of open source projects like dbt, have been slowly embracing software development lifecycle practices. And this is really driving a big, steep increase in their velocity. This is only going to increase and improve as we see data teams embrace data as code. I think the future is bright for data, so I'm very excited.
>>Lynn, Alex, reaction? I mean, agility: data as code is a developer concept, the CI/CD pipeline. You mentioned it. New operational workflows coming into traditional operations. Reaction?
>>Yeah. I mean, I think Peter's right on there. Some of those tools we're seeing come in from software, like dbt, are basically giving you that infrastructure as code, but applied to the data realm. There have also been a few Git-for-data type things, Pachyderm, I believe, is one, and a few others, where you bring that in. And you also see a lot of immutability concepts flowing into the data realm. So just seeing some of those software engineering concepts come over to the data world has been pretty interesting.
>>Well, literally just versioning data sets, and the identification of what's in a data set and what's not in a data set. Some of this is around ethical AI as well, which is a whole area that has come out of research groups, mostly AI research groups, but it's being applied to medical data, and obviously needs to be. So this metadata and versioning around data sets is really, I think, a very of-the-moment area.
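The versioning and immutability ideas the panel mentions can be illustrated with a minimal sketch. This is not how dbt or Pachyderm work internally; it only assumes that a data set is a list of records and that a version can be derived from a hash of its content. The function name `dataset_version` and the sample records are hypothetical.

```python
import hashlib
import json


def dataset_version(rows):
    """Return a short, deterministic version ID for a list of dict records.

    Hypothetical helper: the ID depends only on the content, not on
    row order or key order, so identical data always gets the same ID.
    """
    digest = hashlib.sha256()
    # Serialize each record with sorted keys, then sort the serialized
    # records, so neither key order nor row order changes the hash.
    for line in sorted(json.dumps(row, sort_keys=True) for row in rows):
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()[:12]


# Illustrative records only; not taken from any real data set.
v1 = dataset_version([{"id": 1, "gene": "BRCA1"}, {"id": 2, "gene": "TP53"}])
v1_reordered = dataset_version([{"gene": "TP53", "id": 2}, {"id": 1, "gene": "BRCA1"}])
v2 = dataset_version([{"id": 1, "gene": "BRCA1"}, {"id": 2, "gene": "EGFR"}])

print(v1 == v1_reordered)  # same content in a different order: same version
print(v1 == v2)            # one value changed: different version
```

Storing an ID like this next to a published result is one simple way to record exactly which data a result was computed from, which is the reproducibility concern raised in the discussion.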
>>Yeah, I think you guys are bringing up a really good direction that's happening in data. It's something you're seeing on the software side with open source and now DevOps, and now it's going to data: the supply chain challenge. We've been talking about it here on theCUBE. In this episode, you know, we've seen the Ukraine war, but also some open source malware hitting data sets. Is data secure? What is that going to look like? So you're starting to get into: what's the supply chain? Is it verified data sets? If data sets have to be managed, a whole other level of data supply chain comes up. What do you guys think about that?
>>I'll jump in. Oh, sorry, I'll jump in again. I think some of the compliance requirements around financial data are going to be applied to other types of data, probably health data. So immutability and reproducibility that are legally required. Also, some of the privacy requirements that originated in Europe with GDPR are going to be replicated for more and more types of data. And again, I'm always going to speak for health, but there are other types as well, coming out of personal devices and that kind of stuff. So I think this idea of data as code goes down to versioning and controlling, and that's a real succinct way to say it. We didn't used to think about that. We just put it in our relational database and we were good to go. But versioning and controlling in the global ecosystem is kind of where I'm focusing my efforts.
>>It brings up a good question. If data is going to be part of the development process, it has to be addressable, which means horizontally scalable. That means it has to be accessible and open. How do you make that work and not foreclose it with a lot of restrictions?
>>I think the use of data catalogs and appropriate tagging and categorization. I think everyone's heard of the term data swamp, and I think that came about because everyone saw S3 and thought, oh wow, infinite storage, we'll just throw whatever in there for as long as we want. And at times, with the proliferation of S3 buckets and the like, we've seen security perhaps not maintained as well as it could have been. I think that's where data platform engineering teams have really come to the fore, creating a governed set of buckets with something like Lake Formation on top. But I think that's where we need to see a lot more work, with appropriate tags and also the automatic publishing of metadata into data catalogs, so that folks can easily search and address particular data sets and also control the access. For instance, if you've got some PII data, perhaps only your marketing folks should be looking at email addresses and the like, not your finance folks. So I think there's a lot to be leveraged there in Lake Formation and other solutions.
>>Alex, let's back up and talk about what's in it for the customer, right? Let's zoom back. The reality is, I've just got to make sure my data is secure, always on, and not going to be hackable. And I've just got to get my data available and deliver performance. Then I've got to start thinking about, okay, how do I intersect it? So what should teams be thinking about right now as they look at all their data options or databases across their enterprise?
>>Yeah, it's a good question. I think Peter made some good points there, and you can think of history as ebbing and flowing between centralization and decentralization a lot of the time.
When storage was expensive, data was going to be centralized and maintained by the people in charge of it. But then S3 comes along and really decreases the cost of storage. Now we can do a lot more experiments on it. We can store a lot more of our data, keep it around, and do different things with it. Now we've got regulations again, and we've got to be more realistic about keeping that data secure and making sure we're doing the right things with it. So we're probably going to go through a period of centralization as we work out some of this tooling around tagging and ethical AI that both Peter and Lynn were talking about here, and maybe that gets us into the next world of decentralization again. But I think that ebb and flow is going to be natural in response to the problems of the other extreme.
>>Where are we in the market right now from a progress standpoint? Because data lakes don't want to be data swamps. You're seeing Lake Formation as a data architecture, as an example. Where are we with customers? What are they doing right now? Where would you put them on the progress bar of evolution toward the nirvana of having this data sovereignty and this data-as-code environment? Are they just now at the data lake stage: store everything, real-time and historical?
>>Well, I can jump in there. SQL on files is the driver. When Amazon got Athena, that really drove a lot of customers to realistically look at data lake technologies. But data warehouses are not going away, and the integration between the two is not seamless. We are partners with AWS, but we don't work for them, so we can tell you the truth here.
There's work to it, but for my customers it really upped the ante around data lakes, because Athena and technologies like it, the serverless SQL queries and the familiar query libraries, really drove a movement away from either OLTP or OLAP, the more expensive, more cumbersome structures.
>>But they still need that OLTP. If they have high latency issues and they want low latency, can they have the best of both worlds? That's the question.
>>I would say we're getting closer. The technology is always going to be moving forward, and then we'll just move the goalposts again in terms of what we're asking from it. But I think with the technology that's out there, you can do really well. I work in the DynamoDB world, so you can get really great low latency, single-digit millisecond OLTP response times on that. I think some of the analytics stuff has been a problem with that. There are different solutions out there: you can export DynamoDB to S3, and then you can do SQL on your files with Athena, like Lynn's talking about. Or now you see Rockset, a partner here, that'll just ingest your DynamoDB data and pick up all those changes. So if you're making a lot of changes to your data in DynamoDB, that's going to be reflected in Rockset, and then you can do analytics queries, complex filters, different things like that. So I think we continue to push the envelope, and then we move the goalposts again. But I think we're in a lot better place than we were a few years ago, for sure.
>>Where do you guys see this going relative to the next level, if data as code becomes that next agile, software-defined environment with open source?
All of these new tools with serverless, things happening with data lakes that are built in with nice architectures alongside data warehouses: where does it go next? What happens next? If this becomes an agile environment, what's the impact?
>>Well, I don't want to be so dominant, but I feel strongly, so I'm going to jump in here. For my most computationally intensive workloads, I'm using GPUs. I'm bursting to GPU for TensorFlow neural networks. And I've been doing quite a bit of exploration around Amazon Braket for QPUs. It's early, and it's a specialty. It's not for everybody, and the learning curve, again, is pretty daunting, but there are some use cases out there. I got ahold of a paper where some people did a QCNN, a quantum convolutional neural network, for lung cancer images from COVID patients, and the QPU algorithm pipeline performed more accurately and faster. So I think bursting to quantum is something to pay attention to.
>>Awesome. Peter, what's your take on what's next?
>>Well, that was absolutely fascinating from Lynn, but I think there's also some more low-hanging fruit available in the data stack. There are still a lot of challenges around the transformation layer, getting our data from raw landed data into business domains, and that speaks to a lot of what data mesh is all about. I think if we can somehow make that a little more frictionless, because that's really where the labor-intensive work is. That's what's dominating data engineering teams, and it's where we're trying to push that workload back onto software engineering teams.
>>Alex, we'll give you the final word. What's the impact? What's the next step? What does it look like in the future?
>>Yeah, for sure. I've never had the breaking-a-data-center problem that Lynn's had, or the bursting-to-quantum problem, for sure. But if you're in the pool I swim in, terabytes of data and below and things like that, I think it's a good time. Just like we were talking about with DevOps and allowing software engineers to handle more of the operations stuff, I think the same thing can happen with data, where software engineering teams can handle not just their code, not just deploying and operating it, but also thinking about the data around their code. That doesn't mean you won't have people to assist you within your organization, or that you won't have some specialists in there. But I think pushing more stuff onto the individual development teams, where they have ownership of it and they're thinking about it through its whole life cycle, I'm pretty bullish on that. And I think that's an exciting development.
>>With that shift left, what's left is security. What does that mean?
>>We've shifted so much stuff left that now the things that were at the end are back at the end again. But at least we can think about that stuff early in the process, which is good.
>>Great conversation, very provocative, very realistic, with great impact on the future. Data as code is real. The developers, I do believe, will have a great operational role, and the data stack concept and things like quantum are all kind of lining up nicely. It's a great opportunity to be in this field from a science and policy standpoint. Data engineering is legit. It's going to continue to grow, and thanks for unpacking that here on theCUBE. Appreciate it. Okay. Great panel, the AWS Heroes. They work with AWS and the ecosystem independently out there.
They're in the trenches on the front lines, cracking the code here with data as code. This is season two, episode two of the ongoing series of the AWS Startup Showcase. I'm John Furrier, your host. Thanks for watching.

Published Date : Apr 5 2022


ENTITIES

Entity	Category	Confidence
Lynn	PERSON	0.99+
Peter	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Europe	LOCATION	0.99+
New Zealand	LOCATION	0.99+
Peter Hanson	PERSON	0.99+
five hours	QUANTITY	0.99+
500 hours	QUANTITY	0.99+
five	QUANTITY	0.99+
Alex	PERSON	0.99+
two	QUANTITY	0.99+
Alice	PERSON	0.99+
each side	QUANTITY	0.99+
Lynn Peter	PERSON	0.99+
each	QUANTITY	0.99+
Athena Lakeland	ORGANIZATION	0.99+
five minutes	QUANTITY	0.99+
John	PERSON	0.99+
pandemic	EVENT	0.98+
FinTech	ORGANIZATION	0.98+
GDPR	TITLE	0.98+
first	QUANTITY	0.98+
both	QUANTITY	0.98+
both worlds	QUANTITY	0.97+
single machine	QUANTITY	0.96+
10 years	QUANTITY	0.96+
first time	QUANTITY	0.96+
10 X	QUANTITY	0.96+
CICB	ORGANIZATION	0.94+
single	QUANTITY	0.94+
John furry	PERSON	0.93+
Lynn blankets	PERSON	0.93+
80	QUANTITY	0.91+
Lindbergh Lega consulting	ORGANIZATION	0.9+
LLTP	ORGANIZATION	0.89+
one	QUANTITY	0.87+
two different teams	QUANTITY	0.87+
terabytes	QUANTITY	0.86+
S3	TITLE	0.81+
COVID	ORGANIZATION	0.79+
Alex	TITLE	0.78+
Lakehouse	ORGANIZATION	0.77+
few years ago	DATE	0.77+
a millisecond	QUANTITY	0.77+
single digit	QUANTITY	0.76+
D AWS	ORGANIZATION	0.76+
Startup Showcase S2 E2	EVENT	0.73+
a second	QUANTITY	0.73+
Kubernetes	TITLE	0.72+
Athena	ORGANIZATION	0.71+
season two	QUANTITY	0.7+
SQL	TITLE	0.69+
OLTB	ORGANIZATION	0.69+
Redshift	ORGANIZATION	0.69+
CNN	ORGANIZATION	0.68+
Cedar	ORGANIZATION	0.66+
Hugh	PERSON	0.66+
dynamo	ORGANIZATION	0.65+
episode	QUANTITY	0.63+
Q	ORGANIZATION	0.63+
episode two	OTHER	0.6+
Maine	LOCATION	0.6+

Venkat Venkataramani, Rockset & Jerry Chen, Greylock | CUBEConversation, November 2018

[Music] >>Welcome to this special CUBE Conversation. We're here with some breaking news: we've got some startup investment news here in theCUBE studios in Palo Alto. I'm John Furrier, your host, here with Jerry Chen, partner at Greylock, and the CEO of Rockset, Venkat Venkataramani. Welcome to theCUBE. You guys are announcing hot news today: seed and Series A funding, 21 million dollars for your company. Congratulations. >>Thank you. >>Rockset is a data company. Jerry, great, this is one of your investments, and you kept this secret forever. >>It was, John, it was really hard. You know, over the past two years, every time I sat in this seat I wanted to say, "and one more thing." I knew that part of the advantage was that Rockset was a special company, and we were waiting to announce it at the right time, so it's been about two and a half years in the making. >>I've got to give you credit, Jerry. I just want to say to everyone, I tried so hard to get the secrets out of you, and you are so strong at keeping a secret. I said, "You've got this hot startup?" This was two years ago. >>Yeah. >>I think I probed from every different angle. You can keep secrets. To all the entrepreneurs out there, Jerry Chen's your guy. All right, so congratulations. Let's talk about the startup. So you guys got 21 million dollars. How much was the seed round, and how much is the Series A? >>The seed was three million dollars, with both Greylock and Sequoia participating, and the Series A was 18.5. >>All right. So, other investors, Jerry, who else was in on this? >>Just the two firms, from the beginning. We teamed up with our friends from Sequoia in the seed round, and then over the course of a year and a half we were like, this is great, we're super excited about the team Venkat and Dhruba built, we love the opportunity. And so Mike Vernal from Sequoia and I said, let's do this round together, and we leaned in and we did the round. >>All right, so let's get into the other side. I'm going to read the About section of the press release: Rockset's vision is to build the data-driven future and provide
a serverless search and analytics engine that makes it easy to go from data to applications, essentially building a SQL layer on top of the cloud for massive data ingestion. I want to jump into it, but this is a hot area, and not a lot of people are doing this at the level you guys are. So where did your vision come from? What's your background? How did you get here? Did you wake up one Wednesday and say, "I'm going to build this awesome abstraction layer, build an operating system around data, and make this thing scalable"? How did it all start? >>I think it all started from just a realization that turning useful data into useful apps requires clearing lots of hurdles, right? You have to first figure out what format the data is in, you've got to prepare the data, you've got to find the right specialized database or data management system to load it into, and it often takes weeks to months before useful data becomes useful apps. After my tenure at Facebook, when I left, the first thing I did was just talk to a lot of people at real-world companies with real-world problems, and I started walking away from more and more of them thinking that this is way too complex. The format in which a lot of the data is coming in is not the format that traditional SQL-based databases are optimized for. They were built for transaction processing and analytical processing, not for real-time streams of data, whether that's JSON or Parquet or any of these other formats that are very, very popular. More and more data is getting produced by one set of applications and consumed by other applications. So what we asked was: how can we make it simpler? Why do we need all this complexity? What is the simplest and most powerful system we can build and put in the hands of as many people as possible? And so we very naturally relate to developers
and data scientists, people who use code on data. That's kind of like our past lives. And when we thought about it, we said, well, why don't we just index the data? Traditional databases were built when every byte mattered: every byte of memory, every byte on disk. Now, in the cloud, the economics are completely different, right? So when you rethink those things with a fresh perspective, what we said was: what if we just take all of this data and index it in a format where we can directly run very, very fast SQL on it? How simple would the world be? How much faster could people go from ideas to experiments, and from experiments to production applications? And how do we make it all faster in the cloud? So that's really the genesis of it. The real inspiration came from actually talking to a lot of people with real-world problems and then figuring out what is the simplest, most powerful thing we can build. >>Well, I want to get to the whole complexity conversation, because we were talking before we came on camera about how complexity can kill, and about piling more complexity on top of more complexity. I think there's a simplicity angle here that's interesting. But I want to get back to your background at Facebook, and I want to tell a story. You were there eight years, and you were there during a very interesting time in history. Facebook was, I think, the first generation; we've talked on theCUBE all the time about how they had to build their own infrastructure at scale while they were scaling. They were literally blitzscaling, as Reid Hoffman would say, and as you guys cover at Greylock. Unlike other companies at scale, eBay, Microsoft, which had old-school 1.0 technology and databases, Facebook had to kind of break glass and build the DevOps out from generation one, from scratch. >>Correct. It was a fantastic experience. When I started in 2007, Facebook had about 40 million monthly actives, and I had the privilege of working
with some of the best people on a lot of these problems. Very quickly, around 2008, when I went and said, hey, I want to do some infrastructure stuff, the mandate that was given to me and my team was: we've been very good at taking open source software and customizing it to our needs, but what would infrastructure built by Facebook, for Facebook, look like? We then went on this journey that ended up building the online data infrastructure at Facebook. By the time I left, collectively these systems were serving 5-plus billion requests per second across 25-plus geographical clusters and half a dozen data centers, I think, at that time; now there are more, and the system continues to chug along. So it was just a fantastic experience. All the traditional ways of problem solving just would not work at that scale, when the user base was doubling every four or five months in the early days. >>Yeah, and what's interesting is, you were young and on the front lines, but you were kind of the frog in boiling water. At that time you were building out the whole DevOps equation: automation, scale, growth, everything happening at once, and you guys were right there building it. Now fast forward to today. Every enterprise IT shop wants to get there, but they're not Facebook, and they don't have that engineering staff. They want to get scale, they see the cloud, and the value proposition has clear visibility, but there are the economics of who they can hire. So they have all this data, and they're getting an increasing amount of data. They want to be like Facebook but can't be like Facebook, so they have to build their own solutions, and I think this is where a lot of the other vendors have to rebuild. Jerry, I want to ask you, because you've been looking at a lot of investments. You've seen the old guard bring kind of recycled database solutions to the market, and you've seen some stuff in open source, but nothing unique. What was it about
Rockset, when you first talked to them, that made you see this was going to be vectoring into a trend, that it was going to be a perfect storm? >>Yeah, I think you nailed it, John. Historically, when we have these new problems, like how to use data, the first thing you try to do is solve it with the old technology: existing data warehouses, existing databases. OK, that doesn't work. And then the next thing... You know, through my investments in Docker and being on the board of Cloudera, I saw firsthand this rise of stateless apps, but not stateless databases, right? And then through Cloudera and a bunch of companies that I saw as an investor, every pitch I saw for two or three years trying to solve this data and state problem in the cloud just added more boxes. Here's a database in a box, or S3; let me solve it with, oh, another database, Elastic or Kafka or Mongo or Apache Arrow. And it just became a mess, because if I'm an enterprise IT shop, there's no way I have the skills and the developers to manage this, as Venkat likes to call it, Rube Goldberg machination of data pipelines. I first met Venkat three years ago, and one of the conversations was about complexity: you can't solve complexity with more complexity, you can only solve complexity with simplicity. Rockset, with the vision they had, was the first company that said, you know what, let's remove boxes. Their design principle was not adding another box to solve a problem, but how to remove boxes to solve the problem. He and I aligned on that vision, and I was excited from the beginning to lead the seed and the A. >>Sure. Let's go back to you guys. Now you've got the funding. You spent a couple of stealth years with three million, which is good; with a small team, that goes a long way. It's 21 total, 18 in fresh money, which is going to help you guys build out the team and crank, and we'll get to that later. But what did you guys do in those two years, and where are you now? SQL obviously
is the lingua franca, and SQL is cool, but all this data doesn't need to be schema'd up and built out first. So where are you guys now? >>Since raising the seed, I think we've done a lot of R&D. We fundamentally believe that porting traditional data management systems over to run on cloud VMs does not make them cloud databases. The cloud economics are fundamentally different, and I think we're just scratching the surface of what is possible. The cloud economics come down to a simple realization: whether you rent 100 CPUs for one minute or one CPU for 100 minutes, it costs you exactly the same. So then if you really ask, "Why is any of my queries slow?", I think it's because your software sucks, right? What I'm trying to say is, if you can actually parallelize that, and if you can really exploit the fluidity of the hardware... it's not easy, it's very, very difficult and very challenging, but it's not impossible. If you can build software from the ground up, natively in the cloud, that simplifies a lot of this stuff and understands that the economics are different now. System software, at the end of the day, is about how I get the best performance and efficiency for the price being paid. Building that is really what took a lot of time for us. We have built not only a ground-up indexing technique that can take raw data without knowing its shape and index it, storing it in more than one way for certain types of data, but also a distributed SQL engine that is cloud native, built from the ground up in the cloud in C++ and really high-performance technologies, and we can run distributed SQL on this raw data very, very fast. >>And this is why I brought up your background at Facebook. I think there's a parallel there in this ground-up kind of philosophy. If you
think of SQL as like a Google search keyword, it's the keyword for machines; in most of the database world, that is the standard. So you can just use that as your interface, right? And then you're using the cloud goodness to optimize the results with a crafty index. Is that right? >>Correct, yes. If you know how to use SQL, you know how to use Rockset. If you can frame the question you're asking in order to answer an API request, whether it's a microservice that you're building, or a recommendation engine, or recommendations that you're trying to personalize on top of real-time data, any of those kinds of applications where you're building a service or an application: if you can ask the question in SQL, we will make sure it's fast. >>All right, let's get into how you guys see the application development market, because the developers will be the winners here at the end of the day. When we were covering the Hadoop ecosystem, from the early Cloudera days to now the Hortonworks and Cloudera merger, which kind of consolidates that open source pool, the big complaint we used to hear from practitioners was that it's time-consuming and needs talent. We used to get down and dirty with the questions and ask people how they were using Hadoop, and we got two answers. One answer was, "We stood up Hadoop, we're running Hadoop in our company." The other answer was, "We're using Hadoop for blank," and there were not a lot of those responses. In other words, there has to be a reason why you're using it, not just standing it up. And then Hadoop had the problem that the world grew really fast. Who's going to run it? There's the management of it, and new things came in, so it became complex overnight. It kind of took on cat hair, as we would say. So how do you guys see your solution being used? How do you solve
that "we're running Rockset," "OK, that's great, for what?" problem? What do developers use Rockset for? >>So there are two big personas that we currently have as users: developers and data scientists, people who program on data. On one hand, developers want to build applications, or make an existing application better. It could be a microservice where I want to personalize recommendations that are generated offline but served online. Whether somebody is shopping for cars in San Francisco or shopping for cars in Colorado, we can't show the same recommendations, so how do we personalize them? Personalization, IoT, these kinds of applications: developers love that, because often what you need to do is combine real-time streams coming in semi-structured formats with structured data. You have NoSQL types of systems that are very good at semi-structured data, but they don't give you joins and they don't give you full SQL, and traditional SQL systems are a little bit cumbersome for this, if you think about it. >>So it's like a new Elasticsearch, but you can do joins and much more complex queries? >>Correct, exactly: built for the cloud, with full-featured SQL and joins. That's the best way to think about it, and that's how developers use it. On the other side, because it's SQL, all of a sudden data scientists also love it. They want to run a lot of experiments. They're sitting on a lot of data, and they want to play with it, run experiments, and test hypotheses before they say, "All right, I've got something here, I found a pattern I didn't know I had before." Which is why, when you go and try to stand up traditional database infrastructure, they don't know what indexes to build or how to optimize it so they can interrogate the data. >>So you take all that complexity away from those people, right, by basically
provisioning a sandbox, if you will, almost like a perpetual sandbox of data? >>Correct, except it's serverless, so you never think about how many SSDs do I need, how much RAM do I need, how many hosts do I need, what configuration. >>So it's programmable data. >>Yes, exactly. >>So DevOps for data. This is finally the interview I've been waiting for; I've been saying it for years: when is there going to be a data DevOps? So this is kind of what you're thinking, right? >>Exactly. You literally log in to Rockset and give us read permissions to your data sitting in any cloud, and we're adding support for more and more data sources every day. We will automatically cloudburst, we will automatically ingest it, we will schematize the data, and we will give you very, very fast SQL over REST. So if you know how to use a REST API and you know how to use SQL, you literally don't need to think about anything to do with hardware, or standing up servers, shards, reindexing, restarting, none of that. You just go from "here is a bunch of data, here are my questions, here is the app I want to build." You should be bottlenecked by your creativity and imagination, not by what your data infrastructure can give you. >>Take me through a use case real quick, and then I'll ask Jerry some more of the structural and architectural questions around the marketplace. I'm a developer. What's the low-hanging-fruit use case? How would I engage with you guys? Do I just point data at you and you ingest it? How do you see your market developing from the customer standpoint? >>Cool. I'll take one concrete example from a developer, somebody we're working with right now. Right now they have offline recommendations: every night they generate, if you're looking at this car, or this particular item in e-commerce, these are the other things that are related. But they show the same thing to everyone. If you're looking at, let's say, a car, these are
the five cars that are closely related to this car, and they show that no matter who's browsing. Well, you might have clicked on blue cars in 17 out of your last 18 clicks, so they should be showing you blue cars, right? You may be logging in from San Francisco and I may be logging in from Colorado; we may be looking for different kinds of cars, with four-wheel drive and other options and whatnot. There's so much information available that by personalizing, you're creating more value for your customer. We make it very easy: live-stream all the clickstream data to Rockset, and you can join that with all the assets that you have, whether it's product data, user data, or past transaction history. And if you can represent the joins, or whatever personalization you want to compute in real time, as a SQL statement, you can build that personalization engine on top of Rockset. This is one category. >>So you're putting SQL code into the workflow of the code, saying, OK, when someone gets down to these kinds of interactions, this is the SQL query, because it's a blue car, kind of thing? >>Right. So, tell me all the recent cars that this person liked, and what color they were. And then: OK, here's a set of candidate recommendations I have, how do I sort them? What are the top five I want to show? And then on the data science use case, there's somebody building a market intelligence application. They get a lot of third-party data sets, periodic dumps of huge blocks of JSON, and they want to combine that with data they have internally within the enterprise to see which customers are engaging with them, which ones are churning out, and what they're doing in the market, and bring it all together. How do you do that? How do you join a SQL table with a third-party JSON dump, especially when it's coming in, whether in
real time or periodically, every week or every month? This particular firm that we're working with is an investment firm trying to do market intelligence. It used to run ad hoc scripts to turn all of this data into a useful Excel report, and that used to take them three to four weeks, with two people working on it and one person working part time. They did the same thing in two days on Rockset. >>I want to get back to microservices in a minute, so hold that thought. I want to go to Jerry to get to the business model and landscape question, because microservices are where the world's going. So, competition and business model. Obviously you guys are funded, so you'd love to think about monetization, but let me stay on the core value proposition. In light of Red Hat being bought by IBM, I had a tweet out there that was kind of critical of the transaction. People talk about IBM betting the company on Red Hat. I want to get your reaction and tie it to what's visible here: it seems like they're going to macroservices, not microservices, while the stack is changing. When IBM sells its stack, you have old-school stack thinkers, and then you have new-school stack thinkers, where cloud completely changes the nature of the stack. This venture is kind of an indication that if you think differently, the stack is not just a full stack this way, it's this way and this way, as we've been saying on theCUBE for a couple of years. So you have the old guard trying to get a position in open source and all these things, but the stack's changing, and these guys have the cloud out there as a tailwind, which is a good thing. How do you see the business model evolving? Do you guys talk about it in terms of, hey, just find your groove, get customers, don't worry about monetization or how much you're charging? How do you guys talk about the business
model? Is it specific, and do you guys have clear visibility on that? What's the story? >>I mean, I always tell Venkat there are kind of three hurdles when you have something worthwhile. One: will someone listen to your pitch? People are busy. Like, hey John, you get pitched a hundred times a day by startups, right? Will you take 30 seconds to listen to it? That's hurdle one. Hurdle two is: will they spend time, hands on keyboards, playing around with the code? And step three is: will they write you a check? As an enterprise software investor and a former operator, I don't overly focus on the revenue model now. I think writing a check, the business model, just means you're creating value, and people write you checks when you're creating value. The feedback I always give Venkat and the founders we work with is: don't overthink pricing. For the first 10 customers, just create value: solve their problems, make them love the product, get them using it. The monetization, the actual specifics of the business model, we'll figure out down the line. I mean, it's a cloud service; it's a serverless service (technically too many "servers" in that sentence). But to your point, it's born in the cloud, where the unit economics are good, so if it works, it's going to be profitable. >>Yeah, it's born in the cloud, multi-cloud, right? >>Across whatever cloud I want to be in. It's the way application architecture is going, right? You don't care about VMs, you don't care about containers, you just care about: hey, here's my data, I just want to query it. In the past, as a developer, you had to make compromises. If I wanted joins and SQL queries, I had to use something like Postgres; if I wanted a document database, I'd use something like Mongo; if I wanted an index, I had to use something like Elastic. So either, one, I had to pick one, or two, I had to use all three, and neither world was great. And then all three of those products have different business models. With Rockset, you actually don't need to make choices
right? >>Yes. This is a classic Greylock investment, and Sequoia's the same way: go out, get a position in the market, don't overthink the revenue model. You're funded to grow the company, so scale a little bit and figure out that blitzscale moment. I believe that's probably the ethos that you guys have here. >>One thing I would add to the business model discussion is that we're not optimized to sell latte machines; we're selling coffee by the cup, right? That's really what I mean. We want to put it in the hands of as many people as possible and make sure we are useful to them. I think that is what we're obsessed about. >>Search is a good proxy, I mean, they did well that way. And Rockset is free to get started, right? Right now you can go to rockset.com, get started for free, and just play around with it. >>Yeah. I think you guys hit the nail on the head on this whole data addressability idea. I've been talking about it for years: making it part of the development process, programming data, whatever buzzword comes out of it. I think the trend looks a lot like that DevOps ethos of automation and scale: you get to value quickly, don't overthink the value proposition, and let it organically become part of the operation. >>Yeah. The internal KPIs we track are how many users and applications are using us on a daily and weekly basis. This is what we obsess about. We say, this is what excellence looks like, and we pursue that; the logos and the revenue would be a second-order effect. >>Yeah, and you build that core kernel. It's a classic build-up. So I'll ask about multi-cloud, since you mentioned that earlier. I want to get your thoughts on Kubernetes. Obviously there are a lot of great projects going on in the CNCF around Istio, and this new state problem that you're solving. With REST, stateless has been an easy solution for APIs, but API 2.0 is about state, right? So that's kind of happening
now. What's your view on Kubernetes? Why is it going to be impactful? If someone asked you at a party, "Hey, Venkat, what's all this Kubernetes?" >>What party are you going to? >>Yeah, I mean, all we do is talk about Kubernetes and operating systems. >>Yeah, while handing out candy last night. No, we're huge fans of Kubernetes and Docker; in fact, the entire Rockset back end is built on top of that. We run on AWS, but inside that we run our entire infrastructure in one Kubernetes cluster, and that is something that I think is here to stay. The programmability of it, the DevOps automation that comes with Kubernetes, all of that is what people are going to start taking for granted. >>Why is it important, in your mind? Is it the orchestration, or because of state? Why is it so important? A lot of people are jazzed about it. What's the key thing? >>I think it makes your entire infrastructure programmable. I'll take a concrete example. We wanted to build this infrastructure so that when somebody points us at, say, 10 terabytes of data, we can very quickly autoscale out and grow the cluster as quickly as possible. This is the fluidity of the hardware that I'm talking about, and it needs to happen at two levels. One is the microservice that is ingesting all the data, which needs to burst out, and at the second level we need to be able to add more nodes to this cluster. So consider the programmable nature of this: just imagine doing it without an abstraction like Kubernetes and Docker and containers and pods. You are building lots and lots of metrics and monitoring, and you're trying to build a state machine of what is my desired state in terms of server
utilization and what is the observed state, and everything is so ad hoc and very complicated. Kubernetes makes this whole thing programmable, so a lot of the automation that we do in terms of cloudbursting and whatnot takes advantage of that. With respect to stateful services, I think it's still early days. >>It's a lot harder. >>So our position on that is: continue to use Kubernetes, continue to make things as stateless as possible, and send your real-time streams to a service like Rockset. Not necessarily Rockset, but pick something like that: separate the state and keep it in a back end that is well suited to it. Your microservice, and the business logic that needs to live there, should continue to live there. But if you can take a very hard-to-scale stateful service and split it in two, with some kind of indexing system on one side (Rockset is one that we are proud of building) and your stateless application logic on the other, maybe use Kubernetes, or scale it in Lambdas, for all we care. You can take something that is very hard to manage and scale today and break it into the stateful part and the stateless part, and a serverless back end like Rockset will hopefully give you a huge boost in being able to go from an experiment, to "OK, I'm going to roll it out to a smaller audience," to "I want to go worldwide." You can do all of that without having to worry about it. >>And think about the alternative if you did it the old way. >>Yeah. >>The talent you'd need; it would be wired-up spaghetti everywhere. So Jerry, Kubernetes is really kind of a benefit of your investment in Docker. You must be proud that the industry has gone to a whole other level, because containers really enable all this. >>Correct. >>So this is an example where I
think cloud's going to go to a whole other level that no one's seen before, with these kinds of opportunities that you're investing in. So I've got to ask you directly, as a knowledgeable cloud guy as well as an investor: cloud changes things. How do cloud native and these kinds of new opportunities, built from the ground up, change a company's network, network security, and application performance? Because certainly this is a game changer. Those are the three areas where I see a lot of impact. Compute, check. Storage, check. Networking, early days. >>You know, it's funny, it seems so long ago and yet so brief. When we first talked five years ago, when I first met the folks at Docker, from the beginning people were like, OK, yes, stateless applications, but what about stateful containers? And then for the next three or four years we saw a bunch of companies asking, how do I handle state in a Docker-based application? Lots of startups have tried, and it's the wrong approach. The right approach is what these guys have cracked: just separate the state from the application. Keep your apps in stateless containers and store your state on an indexing layer like Rockset. That's hopefully one of the better ways to solve the problem. But as you uncover one problem and solve it with something like Rockset, to your point, there's also a networking issue, because all of a sudden service mesh, Istio and Consul are the kinds of technologies people talk about, because as these microservices come up and down, they're pretty dynamic. And partially, as a developer, I don't want to care about that. >>Yeah. >>Right, that's the value of something like Rockset as a service. But still, as the operator of the cloud, or the IT person on the other side of the proverbial curtain, I probably care. Security matters, because all of a sudden data is flowing from multiple locations to multiple destinations using all these APIs, and then you have compliance, like GDPR, making security
and privacy super important right now. So that's an area that we think a lot about as investors. >>So can I program that into Rockset? What about building that into my app natively, leveraging the Rockset abstraction? What's the key feature? Let's say I'm [inaudible] GDPR: hey, you know what, I've got a website and social network out in London and Europe, and I've got this GDPR nightmare. >>We don't have a great answer for GDPR. We're not a controller of the data, right? We're just a processor. So for GDPR, I think the controller still has to do a lot of work to be compliant. The way we look at it is, we never forget that this ultimately is going to be adding value to enterprises. So from day one, you can't store data in Rockset without encrypting it. It's just on by default, the only way. And all transit is over HTTPS and SSL. We never forget that we're building for enterprises, and so we've baked in, for enterprise customers, that they can bring in their own custom encryption key, and so everything will be encrypted. The key never leaves their AWS account if it's a KMS key. We support private VPC [inaudible]. We have a plethora of security features so that the control of the data is still with the data controller, which is our customer, but we will be the processor, and a lot of the time we can process it using their encryption keys. >>If I'm going to build a GDPR [inaudible] security solution, I would probably build on Rockset. And some of the early developers on Rockset are security companies that are trying to track where all the data is coming and going. So they're the processor. And one of the kinds of companies we hope to enable with Rockset is another generation of security and privacy companies that in the past had a hard time tracking all this data. >>So I can build on top of Rockset? >>Correct. >>Okay, so you can build a security, a
GDPR solution on top of Rockset, because Rockset gives you the power to process all the data, index all the data. >>And so one of the early developers, you know, still in stealth, is looking at the data flows coming and going, and using them, they'll apply the context, right? They'll say, "Oh, this is your credit card, this is your Social Security number, this is your birthday, et cetera, your favorite colors," and they'll apply that. But I think, to your point, it's game-changing. Not just Rockset, but all the stuff in cloud. As an investor, we see a whole generation of new companies either (a) to make things better or (b) to solve this new category of problems, like pricing the cloud. And I think the future is pretty bright for both great founders and investors, because there's just a bunch of great new companies. >>And it's building up from the ground up. This is the thing about this Red Hat and IBM thing: that's not the answer at the root level, I feel like, right now. >>I think it's fascinating, but it's almost like you're doubling down, to your comment, on the old stack, right? It's almost a double down on the old stack versus an aggressive bet on what a cloud native stack will look like. You know, both companies are great people. I wish them the best and hope they do well. I'd like them to do great with OpenStack. But again, they're a product company, as well as people that happen to contribute to open source. I think it was a great move for both companies, but it doesn't mean that we can't also have a new stack doing well. And I think you're going to see this world where we have, to your point, these old stacks, but then a category of new-stack companies that are being born in the cloud. They're just fun to watch. >>It's all big. All big investments that meet the blitzscaling criteria start out organically, on a wave, in a market that has problems >>Yeah. >>and that's growing. So I think cloud native, ground-up, kind of a clean sheet of paper, that's the new, you know, I say you've
just got to pick, you've got to pick the right wave. >>You've got to pick a big wave. A big wave is not a bad wave to be on right now. >>And it's the data wave that's part of the cloud, correct? >>And it's been growing bigger. It's arguably bigger than IBM, bigger than Red Hat, bigger than most of the companies out there, and I think that's the right wave to bet on. So you're going to pick the next wave, that's kind of cloud-native, born-in-the-cloud infrastructure. That is still early days, and companies that are riding that wave are going to do well, so I'm pretty excited. >>There's a lot of opportunities, certainly. This whole idea that this change is coming, societal change, what's going on with mission-based companies, whether it's the NGO or full scale, all the applications that the cloud can enable, from data privacy to your wearables or cars or health. We're seeing it every single day. I'm pretty sure if you took Amazon's revenue and then added [inaudible], you look at their cloud revenue, that's like a 20 billion run rate, which, you know, Microsoft bundles in a lot of their Office stuff as well. If you took Amazon's customers [inaudible] in the marketplace and took their revenue, they clearly would be number one, for sure, by a long shot. So they don't count that revenue, and that's a big factor. If you look at whoever can build these enabling markets right now, there's going to be a few big ones, I think, coming on, and they're going to do well. So I think this is a good opportunity. Congratulations. >>Thank you. >>Thank you. >>$21 million. Final question before we go: what are you going to spend it on? >>We're going to spend it on our go-to-market strategy and hiring amazing people, as many as we can get. >>Good, good answer. Didn't say launch party, that's what I'm saying. >>Right, yeah. >>Okay, we're here with Rockset's CEO and Jerry Chen, CUBE royalty, number two all-time on our CUBE alumni list, partner at Greylock. Guys, thanks for coming in. I'm
Jeffrey. Thanks for watching this special CUBE Conversation. [Music]
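
The split described in the conversation above (stateless application logic in front, state kept in a separate indexing backend) can be sketched roughly as follows. This is only an illustration under stated assumptions: `InMemoryIndex`, `handle_event`, and `handle_query` are hypothetical names invented for this sketch, and the in-memory dictionary merely stands in for an external service. None of it is Rockset's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryIndex:
    """Hypothetical stand-in for an external indexing backend (not a real API)."""

    _docs: dict = field(default_factory=dict)

    def write(self, key: str, doc: dict) -> None:
        # Store a copy so callers cannot mutate indexed state afterwards.
        self._docs[key] = dict(doc)

    def query(self, field_name: str, value) -> list:
        # Return the sorted keys of all documents whose field matches the value.
        return sorted(
            key for key, doc in self._docs.items() if doc.get(field_name) == value
        )


def handle_event(index: InMemoryIndex, event: dict) -> None:
    """Stateless handler: keeps nothing between calls, forwards state to the index."""
    index.write(event["id"], event)


def handle_query(index: InMemoryIndex, field_name: str, value) -> list:
    """Stateless read path: every answer comes from the backend, not local memory."""
    return index.query(field_name, value)
```

Because the handlers hold no state of their own, any number of replicas (containers, functions, and so on) could run them side by side, which is the scaling property the speakers attribute to this kind of split. In a real deployment the index would be a network service rather than a local object.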

Published Date : Nov 1 2018

ENTITIES

Entity	Category	Confidence
San Francisco	LOCATION	0.99+
Amazon	ORGANIZATION	0.99+
2007	DATE	0.99+
five cars	QUANTITY	0.99+
Jerry Chen	PERSON	0.99+
three million dollars	QUANTITY	0.99+
10 terabytes	QUANTITY	0.99+
30 seconds	QUANTITY	0.99+
Colorado	LOCATION	0.99+
Europe	LOCATION	0.99+
London	LOCATION	0.99+
one minute	QUANTITY	0.99+
two	QUANTITY	0.99+
21 million dollars	QUANTITY	0.99+
IBM	ORGANIZATION	0.99+
November 2018	DATE	0.99+
Facebook	ORGANIZATION	0.99+
Jerry	PERSON	0.99+
17	QUANTITY	0.99+
Microsoft	ORGANIZATION	0.99+
two people	QUANTITY	0.99+
2021	DATE	0.99+
AWS	ORGANIZATION	0.99+
second level	QUANTITY	0.99+
Excel	TITLE	0.99+
Mike	PERSON	0.99+
three million	QUANTITY	0.99+
eight years	QUANTITY	0.99+
Reid Hoffman	PERSON	0.99+
Rockset	ORGANIZATION	0.99+
five years ago	DATE	0.99+
Rube Goldberg	PERSON	0.99+
three years	QUANTITY	0.99+
two answers	QUANTITY	0.99+
two levels	QUANTITY	0.99+
three	QUANTITY	0.99+
both companies	QUANTITY	0.99+
Roxanna	ORGANIZATION	0.99+
Rock	PERSON	0.99+
C++	TITLE	0.99+
two big personas	QUANTITY	0.99+
18 clicks	QUANTITY	0.99+
Hadoop	TITLE	0.99+
one	QUANTITY	0.99+
Sequoia	ORGANIZATION	0.98+
Venkat Venkataramani	PERSON	0.98+
three years ago	DATE	0.98+
Jeffrey	PERSON	0.98+
John	PERSON	0.98+
two firms	QUANTITY	0.98+
eBay	ORGANIZATION	0.98+
one person	QUANTITY	0.98+
Venkat	ORGANIZATION	0.98+
100 CPUs	QUANTITY	0.98+
Andrew	PERSON	0.98+
25 plus geographical clusters	QUANTITY	0.98+
today	DATE	0.98+
half a dozen data centers	QUANTITY	0.98+
four weeks	QUANTITY	0.98+
both companies	QUANTITY	0.98+
one month	QUANTITY	0.97+
two years ago	DATE	0.97+
400 minutes	QUANTITY	0.97+
more than one way	QUANTITY	0.97+
one answer	QUANTITY	0.97+
two days	QUANTITY	0.96+
SIA	ORGANIZATION	0.96+
