Steven Hillion & Jeff Fletcher, Astronomer | AWS Startup Showcase S3E1


 

(upbeat music)
>> Welcome everyone to theCUBE's presentation of the AWS Startup Showcase: AI/ML Top Startups Building Foundation Model Infrastructure. This is season three, episode one of our ongoing series covering exciting startups from the AWS ecosystem, here to talk about data and analytics. I'm your host, Lisa Martin, and today we're excited to be joined by two guests from Astronomer. Steven Hillion joins us, its Chief Data Officer, and Jeff Fletcher, its Director of ML. They're here to talk about machine learning and data orchestration. Guys, thank you so much for joining us today.
>> Thank you.
>> It's great to be here.
>> Before we get into machine learning, let's give the audience an overview of Astronomer. Talk about what that is, Steven. Talk about what you mean by data orchestration.
>> Yeah, let's start with Astronomer. We're the Airflow company, basically: the commercial developer behind the open-source project, Apache Airflow. I don't know if you've heard of Airflow. It's sort of the de facto standard these days for orchestrating data pipelines, data engineering pipelines, and, as we'll talk about later, machine learning pipelines. It really is the de facto standard. I think we're up to about 12 million downloads a month, and that's as an open-source project. I think at this point it's more popular by some measures than Slack. Airflow was created by Airbnb some years ago to manage all of their data pipelines and all of their workflows, and now it powers the data ecosystem for organizations as diverse as Electronic Arts and Conde Nast, which is one of our big customers and a big user of Airflow. And, not to mention, the biggest banks on Wall Street use Airflow and Astronomer to power the flow of data throughout their organizations.
>> Talk about that a little bit more, Steven, in terms of the business impact. You mentioned some great customer names there. What is the business impact or outcomes that a data orchestration strategy enables businesses to achieve?
>> Yeah, at the heart of it, it's quite simply scheduling and managing data pipelines. And so if you have some enormous retailer who's managing the flow of information throughout their organization, they may literally have thousands or even tens of thousands of data pipelines that need to execute every day, to do things as simple as delivering metrics for the executives to consume at the end of the day, or producing, on a weekly basis, new machine learning models that can be used to drive product recommendations. One of our customers, for example, is a British food delivery service, and you get those recommendations in your application that say, "Well, maybe you want to have samosas with your curry." That sort of thing is powered by machine learning models that they train on a regular basis to reflect changing conditions in the market, and those are produced through Airflow and through the Astronomer platform, which is essentially a managed platform for running Airflow. So at its simplest, it really is just scheduling and managing those workflows. But that's easier said than done, of course. If you have tens of thousands of those things, then you need to make sure that they all run, and that they all have sufficient compute resources. If things fail, how do you track those down across 10,000 workflows? How easy is it for an average data scientist or data engineer to contribute their code, their Python notebooks or their SQL code, into a production environment?
And then you've got reproducibility, governance, auditing. Managing data flows across an organization, which we think of as orchestrating them, is much more than just scheduling. It becomes really complicated pretty quickly.
>> I imagine there's a fair amount of complexity there. Jeff, let's bring you into the conversation. Talk a little bit about Astronomer through your lens: data orchestration and how it applies to MLOps.
>> So I come from a machine learning background, and for me the interesting part is that machine learning requires the expansion into orchestration. A lot of the same things that you're using to develop and build pipelines in a standard data orchestration space apply equally well in a machine learning orchestration space. What you're doing is you're moving data between different locations, between different tools, and then tasking different types of tools to act on that data. So extending it made logical sense from an implementation perspective. And a lot of my focus at Astronomer is really to explain how Airflow can be used well in a machine learning context. It is being used well, and it is being used a lot, by the customers that we have and also by users of the open-source version. But it's really being able to explain to people why it's a natural extension and how well it fits. And a lot of it is also extending some of the infrastructure capabilities that Astronomer provides to those customers, for them to be able to run some of the more platform-specific requirements that come with doing machine learning pipelines.
>> Let's get into some of the things that make Astronomer unique. Jeff, sticking with you: when you're in customer conversations, what are some of the key differentiators that you articulate to customers?
>> So a lot of it is that we are not specific to one cloud provider. We have the ability to operate across all of the big cloud providers. I'm certain we have the best developers, who understand how best-practice implementations for data orchestration work. So we spend a lot of time talking not just to the business users of the product, but also, for the technical people, about how to help them better implement things that they may have come across in a Stack Overflow article, or that haven't necessarily grown with how the product has evolved. So it's the ability to run it wherever you need to run it, and also our ability to help you, the customer, better implement and understand those workflows, that I think are two of the primary differentiators that we have.
>> Lisa: Got it.
>> I'll add another one if you don't mind.
>> You can go ahead, Steven.
>> It's lineage, and dependencies between workflows. One thing we've done is to augment core Airflow with lineage services. Using the Open Lineage framework, another open-source framework for tracking datasets as they move from one workflow to another, one team to another, one data source to another, is a really key component of what we do, and we bundle that within the service, so that as a developer or as a production engineer you really don't have to worry about lineage; it just happens. Jeff may show us some of this later: as data flows from source, through to a data warehouse, out through a Python notebook to produce a predictive model or a dashboard, can you see how those data products relate to each other?
And when something goes wrong, can you figure out what upstream maybe caused the problem? Or, if you're about to change something, can you figure out what the impact is going to be on the rest of the organization? So lineage is a big deal for us.
>> Got it.
>> And just to add on to that, the other thing to think about is that traditional Airflow is actually a complicated implementation. It required quite a lot of time spent understanding what was almost a bespoke language that you needed to be able to develop in to write these DAGs, which are the fundamental pipelines. So part of what we are focusing on is tooling that makes it more accessible to, say, a data analyst or a data scientist who doesn't have, and doesn't really need to gain, the necessary background in how the semantics of Airflow DAGs work, to still be able to get the benefit of what Airflow can do. So there are new features and capabilities built into the Astronomer cloud platform that effectively abstract away the need to understand some of the deep work that goes on. You can still do it, you still have that capability, but we are expanding it to make orchestrated and repeatable processes accessible to more teams within the business.
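For readers who haven't seen one, here is a minimal sketch of the kind of DAG Jeff is describing; the "bespoke language" is really just Python plus Airflow's scheduling semantics. The DAG name, tasks, and schedule below are hypothetical, not taken from the interview:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Pull raw data from a source system (stubbed out here).
    print("extracting...")


def transform():
    # Clean and reshape the extracted data.
    print("transforming...")


def load():
    # Write the result to a warehouse or dashboard table.
    print("loading...")


# A DAG is just a Python object: tasks, dependencies, and a schedule.
with DAG(
    dag_id="example_daily_metrics",   # hypothetical name
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",       # run once a day
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load", python_callable=load)

    t1 >> t2 >> t3  # extract, then transform, then load
```

The `>>` operator is Airflow's dependency notation: the scheduler runs `extract`, then `transform`, then `load`, retrying and tracking each task independently.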
>> In terms of accessibility to more teams in the business: you talked about data scientists, data analysts, developers. Steven, I want to ask you, as the chief data officer, are you having more and more conversations with that role, and how is it emerging and evolving within your customer base?
>> Hmm. That's a good question, and it is evolving, because if you look historically at the way that Airflow has been used, it's often from the ground up. You have individual data engineers, or maybe single data engineering teams, who adopt Airflow 'cause it's very popular; lots of people know how to use it, and they bring it into an organization and say, "Hey, let's use this to run our data pipelines." But then increasingly, as you turn from pure workflow management and job scheduling to the larger topic of orchestration, you realize it gets pretty complicated. You want coordination across teams, and you want standardization for the way that you manage your data pipelines. And so a managed service for Airflow that exists in the cloud is easy to spin up as you expand usage across the organization. And thinking long term about that in the context of orchestration, that's where I think the chief data officer or the head of analytics tends to get involved, because they really want to think of this as a strategic investment that they're making: not just per-team individual Airflow deployments, but a network of data orchestrators.
>> That network is key. Every company these days has to be a data company. We talk about companies being data-driven. It's a common phrase, but it's true, whether it's a grocer or a bank or a hospital: they've got to be data companies. So talk to me a little bit about Astronomer's business model. How is this available? How do customers get their hands on it?
>> Jeff, go ahead.
>> Yeah, yeah. So we have a managed cloud service, and we have two modes of operation. One, you can bring your own cloud infrastructure: you can say, here is an account in, say, AWS or Azure, and we can deploy the necessary infrastructure into that. Or, alternatively, we can host everything for you, so it becomes a full SaaS offering. We then provide a platform that connects at the backend to your internal IDP process, so however you are authenticating users, we make sure that the correct people are accessing the services that they need, with role-based access control. From there we deploy, through Kubernetes, the different services and capabilities into either your cloud account or into an account that we host. And from there Airflow does what Airflow does, which is its ability to reach out to different data systems and data platforms and run the orchestration. We make sure we do it securely; we have all the necessary compliance certifications required for GDPR in Europe and HIPAA in the US, and a whole host of others. So it is a secure platform that can run in the place you need it to run, but it is a managed Airflow that includes a lot of the extra capabilities, like the cloud developer environment and the Open Lineage services, to enhance the overall Airflow experience.
>> Enhance the overall experience. So Steven, going back to you: if I'm a Conde Nast or another organization, what are some of the key business outcomes that I can expect? One of the things I think we've learned during the pandemic is that access to real-time data is no longer a nice-to-have for organizations. It's really an imperative. It's that demanding consumer who wants personalized, customized, instant access to a product or a service. So if I'm a Conde Nast, or I'm one of your customers, what can I expect my business to be able to achieve as a result of data orchestration?
>> Yeah, I think in a nutshell it's about providing a reliable, scalable, and easy-to-use service for developing and running data workflows. And talking of demanding customers, I'm actually a customer myself; as you mentioned, I'm the head of data for Astronomer. You won't be surprised to hear that we actually use Astronomer and Airflow to run all of our data pipelines, so I can talk about my own experience. When I started I was of course familiar with Airflow, but it always seemed a little bit unapproachable to me if I was introducing it to a new team of data scientists. They don't necessarily want to have to think about learning something new. But because of the layers that Astronomer has provided with our Astro service around Airflow, it was pretty easy for me to get up and running. Of course I've got an incentive for doing that, I work for the Airflow company, but we went from, at the beginning of last year, about 500 data tasks that we were running on a daily basis to about 15,000 every day. We run something like a million data operations every month within my team. And so, as one outcome, there's the ability to spin up new production workflows essentially in a single day: you go from an idea in the morning to a new dashboard or a new model in the afternoon. That's really the business outcome: removing the friction to operationalizing your machine learning and data workflows.
>> And I imagine too... oh, go ahead, Jeff.
>> Yeah, to add to that, one of the things that becomes part of the business cycle is repeatable capabilities for things like reporting and new machine learning models. And the impediment that has existed is that it's difficult to take that from the analyst team or the data science team who produce it to the data engineering team, who have to work the workflow all the way through.
What we're trying to unlock is the ability for those teams to directly get access to scheduling and orchestrating capabilities, so that a business analyst can have a new report for C-suite execs that needs to be done once a week, and the time to repeatability for that report is much shorter. It's immediately in the hands of the person who needs to see it; it doesn't go onto a long list of to-dos for a data engineering team that's already overworked, who eventually get to it in a month's time. So that's also part of it: orchestration, I think, is fairly well understood, and a lot of people get the benefit of being able to orchestrate things within a business, but having more people able to do it, and shortening the time to that repeatability, is one of the main benefits of good managed orchestration.
>> So, a lot of workforce productivity improvements in what you're doing to simplify things, giving more people access to data to be able to make those faster decisions, which ultimately helps the end user on the other end get the product or the service that they're expecting. Jeff, I understand you have a demo that you can share, so we can dig into this.
>> Yeah, let me take you through a quick look at how the whole thing works. So our starting point is our cloud infrastructure. This is the login; you go to the portal. You can see there's a bunch of workspaces that are available; workspaces are like individual places for people to operate in. I'm not going to delve into all the deep technical details here, but the starting point for a lot of our data science customers is what we call our Cloud IDE, a web-based development environment for writing and building out DAGs without actually having to know how the underpinnings of Airflow work. This is an internal one, something that we use. You have a notebook-like interface that lets you write Python code and SQL code, and a bunch of specific bespoke types of blocks if you want. They all get pulled together to create a workflow. So this is a workflow, which gets compiled to something that looks like a complicated set of Python code: the DAG. I then have a CI/CD pipeline where I commit this through to my GitHub repo. So this comes to a repo here, which is where the DAGs that I created in the previous step exist. I can then go and say, all right, I want to see how those particular DAGs have been running. Then we get to the actual Airflow part. So this is the managed Airflow component. We add the ability for teams to fairly easily bring up an Airflow instance and write code inside our notebook-like environment to get it into that instance. So you can see it's been running: the same graph that we built ends up here, but you don't need to know the fundamentals of Airflow in order to get this going. Then we can run one of these; it runs in the background, and we can manage how it goes. And every time this runs, it's emitting to a process underneath, which is the Open Lineage service, the lineage integration that allows me to come in here, have a look, and see the same graph that we built, but now it's the historic version. So I know where things started, where things are going, and how it ran. And then I can also do a comparison: if I want to see how this particular run worked compared to one historically, I can grab one from a previous date and it will show me the comparison between the two.
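Under the hood, the lineage integration Jeff mentions works by emitting OpenLineage run events. As a rough sketch of the event shape, with the caveat that every value here is invented for illustration and the collector endpoint is an assumption rather than Astronomer's actual configuration, a START event for one task might look like this:

```python
import json
import uuid
from datetime import datetime, timezone

import requests

# Sketch of an OpenLineage RunEvent (see openlineage.io for the full spec).
# The namespace, job, dataset names, and endpoint below are hypothetical.
event = {
    "eventType": "START",
    "eventTime": datetime.now(timezone.utc).isoformat(),
    "run": {"runId": str(uuid.uuid4())},
    "job": {"namespace": "example-namespace", "name": "example_daily_metrics.transform"},
    "inputs": [{"namespace": "snowflake://acme", "name": "raw.events"}],
    "outputs": [{"namespace": "snowflake://acme", "name": "analytics.daily_metrics"}],
    "producer": "https://example.com/my-producer",
}

# A lineage backend collects these events and reconstructs the graph of
# datasets and jobs that Jeff walks through in the demo.
response = requests.post(
    "http://localhost:5000/api/v1/lineage",  # hypothetical collector endpoint
    data=json.dumps(event),
    headers={"Content-Type": "application/json"},
)
response.raise_for_status()
```

Because each task run emits its inputs and outputs, the backend can answer exactly the questions Steven raised earlier: what upstream job produced this dataset, and what downstream products break if it changes.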
So that combination of managed Airflow, getting Airflow up and running very quickly, plus the Cloud IDE that lets you write code without needing to know how to get something into a repeatable format, get it into Airflow, and have it attached to the lineage process, adds up to a complete end-to-end orchestration process for any business looking to get the benefit from orchestration.
>> Outstanding. Thank you so much, Jeff, for digging into that. So one of my last questions, Steven, is for you. This is exciting. There's a lot that you guys are enabling organizations to achieve here to really become data-driven companies. So where can folks go to get their hands on this?
>> Yeah, just go to astronomer.io, and we have plenty of resources. If you're new to Airflow, you can read our documentation and our guides to getting started. We have a CLI that you can download that is, I think, the easiest way to get started with Airflow. But you can also sign up for a trial, including a guided trial, where our team of experts, really the world experts on getting Airflow up and running, will take you through it and allow you to kick the tires and see how this works with your data. And I think you'll see pretty quickly that it's very easy to get started with Airflow, whether you're doing that from the command line or in our cloud service. And all of that is available on our website, astronomer.io.
>> Jeff, last question for you. What are you excited about? There's so much going on here. What are some of the things, maybe a sneak peek coming down the road, that prospects and existing customers should be excited about?
>> I think a lot of the development around the data-awareness components. One of the things that's traditionally been complicated with orchestration is that you leave your data in the place that you're operating on, and we're starting to have more data processing capability built into Airflow. And from an Astronomer perspective, we are adding more capabilities around working with larger datasets and doing bigger data manipulation inside the Airflow process itself. That lends itself to better machine learning implementations. So as we grow and get better in the data-awareness context, it unlocks a lot more capability to implement proper machine learning pipelines.
>> Awesome, guys. Exciting stuff. Thank you so much for talking to me about Astronomer, machine learning, data orchestration, and really the value in it for your customers. Steve and Jeff, we appreciate your time.
>> Thank you.
>> My pleasure, thanks.
>> And we thank you for watching. This is season three, episode one of our ongoing series covering exciting startups from the AWS ecosystem. I'm your host, Lisa Martin. You're watching theCUBE, the leader in live tech coverage.
(upbeat music)

Published Date : Mar 9 2023


Paola Peraza Calderon & Viraj Parekh, Astronomer | Cube Conversation


 

(soft electronic music)
>> Hey everyone, welcome to this CUBE Conversation as part of the AWS Startup Showcase, season three, episode one, featuring Astronomer. I'm your host, Lisa Martin, in theCUBE's Palo Alto Studios, and today I'm excited to be joined by a couple of guests, a couple of co-founders from Astronomer. Viraj Parekh is with us, as is Paola Peraza-Calderon. Thanks, guys, so much for joining us. Excited to dig into Astronomer.
>> Thank you so much for having us.
>> Yeah, thanks for having us.
>> Yeah, and we're going to be talking about the role of data orchestration. Paola, let's go ahead and start with you. Give the audience that understanding, that context, about Astronomer and what it is that you guys do.
>> Mm-hmm. Yeah, absolutely. So, Astronomer is a technology and software company for modern data orchestration, as you said, and we're the driving force behind Apache Airflow, the open-source workflow management tool that's since been adopted by thousands and thousands of users, and we'll dig into this a little bit more. But by data orchestration we mean data pipelines: generally speaking, getting data from one place to another, transforming it, running it on a schedule, and overall building a central system that tangibly connects your entire ecosystem of data services, right? So that's Redshift, Snowflake, dbt, et cetera. And so, tangibly, we at Astronomer build products powered by Apache Airflow for data teams and data practitioners, so that they don't have to. We sell to data engineers, data scientists, data admins, and we really spend our time doing three things. The first is that we build Astro, our flagship cloud service that we'll talk more about. Here we're really building experiences that make it easier for data practitioners to author, run, and scale their data pipeline footprint on the cloud. Then we also contribute to Apache Airflow as an open-source project and community: we cultivate the community of humans, and we put out open-source developer tools that make it easier for individual data practitioners to be productive in their day-to-day jobs, whether or not they actually use our product and pay us money. And then, of course, we also have professional services and education and all of these things around our commercial products that enable folks to use our products and use Airflow as effectively as possible. So yeah, super, super happy with everything we've done, and hopefully that gives you an idea of where we're starting.
>> Awesome. So when you're talking with those data engineers and data scientists, Paola, how do you define data orchestration, and what does it mean to them?
>> Yeah, yeah, it's a good question. So, if you Google data orchestration you're going to get something about an automated process for organizing siloed data and making it accessible for processing and analysis. But, to your question, what does that actually mean? If you look at it from a customer's perspective, we can share a little bit about how we at Astronomer actually do data orchestration ourselves and the problems it solves for us. As many other companies out in the world do, we at Astronomer need to monitor how our own customers use our products, right?
And so we have a weekly meeting, for example, that goes through a dashboard in a dashboarding tool called Sigma, where we see the number of monthly customers and how they're engaging with our product. But to actually do that, we have to use data from our application database, for example, which has behavioral data on what they're actually doing in our product. We also have data from third-party API tools, like Salesforce and HubSpot, and other ways in which we engage with our customers and track their behavior. And so our data team internally at Astronomer uses a bunch of tools to transform and use that data. We use Fivetran, for example, to ingest. We use Snowflake as our data warehouse. We use other tools for data transformations. And even if we at Astronomer don't do this, you can imagine a data team also using tools like Monte Carlo for data quality, or Hightouch for reverse ETL, or things like that. And I think the point here is that data teams building data-driven organizations have a plethora of tooling, both to ingest the right data and to come up with the right interfaces to transform and interact with that data. And that movement and synchronization of data across your ecosystem is exactly what data orchestration is responsible for. Historically, and Raj will talk more about this, schedulers like cron and Oozie or Control-M have taken a role here, but we think that Apache Airflow has risen over the past few years as the de facto industry standard for writing data pipelines: tasks, data jobs, that interact with that ecosystem of tools in your organization. And beyond that data pipeline unit, I think where we see it is that data orchestration is not only writing the data pipelines that move your data, but also all the things around them, right? So, CI/CD tooling, secrets management, et cetera. A long-winded answer here, but I think that's how we talk about it here at Astronomer and how we're building our products.
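As a concrete illustration of the internal pipeline Paola describes, here is a hedged sketch using Airflow's TaskFlow API: ingest, land in the warehouse, transform, refresh the dashboard. The function bodies are stubs, and the tool hand-offs (Fivetran, Snowflake, a Sigma dashboard) are stand-ins for whatever a given stack actually calls:

```python
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule_interval="@weekly", start_date=datetime(2023, 1, 1), catchup=False)
def customer_usage_metrics():
    """Hypothetical weekly pipeline: ingest -> warehouse -> transform -> publish."""

    @task
    def ingest_sources():
        # Stand-in for triggering an ingest tool such as Fivetran
        # to pull Salesforce/HubSpot/app-database data.
        return ["salesforce", "hubspot", "app_db"]

    @task
    def load_to_warehouse(sources):
        # Stand-in for landing the ingested data in a warehouse (e.g. Snowflake).
        print(f"loaded {sources} into warehouse")

    @task
    def transform_models():
        # Stand-in for running transformation models (e.g. via dbt).
        print("transformations complete")

    @task
    def refresh_dashboard():
        # Stand-in for refreshing the dashboard the weekly meeting reviews.
        print("dashboard refreshed")

    t1 = ingest_sources()
    t2 = load_to_warehouse(t1)
    t3 = transform_models()
    t4 = refresh_dashboard()
    t2 >> t3 >> t4


customer_usage_metrics()
```

Each tool in the stack does its own job; the orchestrator's contribution is the ordering, scheduling, and retry behavior that ties them into one dependable weekly run.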
>> Excellent. Great context, Paola. Thank you. Viraj, let's bring you into the conversation. Every company these days has to be a data company, right? They've got to be a software company, whether it's my bank or my grocery store. So how are companies actually doing data orchestration today, Viraj?
>> Yeah, it's a great question. So, one thing to think about is that, on one hand, data orchestration is kind of a new category that we're helping define, but on the other hand, it's something that companies have been doing forever, right? You need to get data moving to use it: you've got to put it all in place, aggregate it, clean it, et cetera. So when you look at what companies out there are doing: if you're more of a born-in-the-cloud company, as we say, you'll adopt all the cloud-native tooling your cloud provider gives you. If you're a bank or another institution like that, you're probably juggling an even wider variety of tools. You're thinking about a cloud migration. You might have things like cron running in one place, Oozie running somewhere else, Informatica running somewhere else, while you're also trying to move all your workloads to the cloud. So there's quite a large spectrum of what the current state is for companies. And then, kind of like Paola was saying, Apache Airflow started in 2014. It was actually started by Airbnb, and they put out this blog post that was like, "Hey, here's how we use Apache Airflow to orchestrate our data across all our sources." And really since then, and it's almost been a decade, Airflow has emerged as the open-source standard, and there are companies of all sorts using it. It's really used to tie all these tools together, especially as the number of tools increases and companies move to hybrid-cloud and multi-cloud strategies, and so on and so forth. But what we've found is that if you go to any company, especially a larger one, and you say, "Hey, how are you doing data orchestration?" they'll probably say something like, "Well, I have five data teams, so I have eight different ways I do data orchestration." This idea of data orchestration has been there, but the right way to do it — all the abstractions you need, the way your teams need to work together, and so on — hasn't really emerged just yet. It's such a quick-moving space that companies have to combine what they were doing before with what their new business initiatives are today. So what we really believe here at Astronomer is that Airflow is the core of how you solve data orchestration for any sort of use case, but it's not everything; it needs a little more. And that's really where our commercial product, Astro, comes in. We've built not only the most tried-and-tested Airflow experience out there — we employ a majority of the Airflow core committers, so we're really deep in the project — we've also built the right things around developer tooling, observability, and reliability, for customers to rely on Astro as the heart of the way they do data orchestration, and to think of it as the foundational layer that helps tie together all the different tools, practices, and teams large companies have today.
>> That foundational layer is absolutely critical. You've both mentioned open-source software. Paola, I want to go back to you and give the audience an understanding of how open source plays into Astronomer's mission as a company, and into technologies like Astro.
>> Mm-hmm. Yeah, absolutely. We at Astronomer started using Airflow and building our products because Airflow is open source, and we were our own customers at the beginning of our company journey. And I think the open-source community is at the core of everything we do. Without that open-source community and culture, we have less of a business, and so we're super invested in continuing to cultivate and grow it. And there are a couple of concrete ways in which we do this that personally make me really excited to do my own job. For one, we do things like organizing meetups, and we sponsor the Airflow Summit, and there are these baseline community efforts that I think are really important and that remind you: these are just humans trying to do their jobs, learn, use both our technology and things that are out there, and contribute to it. Making it easier to contribute to Airflow, for example, is another one of our efforts. As Viraj mentioned, we also employ engineers internally on our team whose full-time job is to make the open-source project better — again, regardless of whether or not you're a customer of ours. We want to make sure that we continue to cultivate the Airflow project in and of itself. And we're also building developer tooling that might not be part of the Apache open-source project, but is still open source. So we have repositories in our own GitHub organization, for example, with tools that individual data practitioners, customers or not, can use to be more productive in their day-to-day jobs with Airflow, writing DAGs for the most common use cases out there. The last thing I'll say is how important we've found it to build educational resources, documentation, and best practices. Airflow can be complex. It's been around for a long time. There's a really, really rich feature set. So how do we enable folks to actually use it? That comes in things like webinars, best practices, and courses and curriculum that are free, accessible, and open to the community — just some of the ways in which we're continuing to invest in that open-source community over the next year and beyond.
>> That's awesome. It sounds like open source is really core, not only to the mission, but to the heart of the organization. Viraj, I want to go back to you and really try to understand how Astronomer fits into the wider modern data stack and ecosystem. What does that look like for customers?
>> Yeah, yeah. So, both in the open source and with our commercial customers, folks everywhere are trying to tie together a huge variety of tools in order to start making sense of their data. And I kind of think of it almost like a pyramid. At the base level, you need things like data reliability, data freshness, data availability, and so on. You just need your data to be there, and you need to make it predictable when it's going to be there. You need to make sure it's correct, with some quality checks, and so on and so forth. And oftentimes that takes the shape of ELT or ETL use cases: taking data from somewhere and moving it somewhere else, usually into some sort of analytics destination. That's really what businesses can do to power the core parts of getting insights into how their business is going: how much revenue did I have, what's in my pipeline in Salesforce, and so on. Once that base foundation is there, and people can get the data they need how they need it, it really opens up a lot for what customers can do. I think one of the trendier things out there right now is MLOps, and how companies actually put machine learning into production. Well, when you think about it — you kind of have to squint at it — machine learning pipelines are really just any other data pipeline; they just have a certain set of needs that might not be applicable to ELT pipelines. And when you have a common layer to tie together all the ways data can move through your organization, that's really what we're trying to make it possible for companies to do. And that happens in financial services, where we have some customers who take app data coming from their mobile apps and run it through their fraud detection services to make sure the activity isn't fraudulent.
We have customers that will run sports betting models on our platform, where they'll take data from a bunch of public APIs around different sporting events, transform all of that in a way their data scientists can build models with, and then actually bet on sports based on that output. One of my favorite use cases from the open source is a company whose business was to deliver blood transfusions via drone into remote parts of the world. It was really cool, because they took data from all sorts of places, orchestrated all the aggregation and cleaning and analysis that had to happen via Airflow, and the end product was a drone being sent out into a really remote part of the world to give somebody blood who needed it there. Because it turns out, for certain parts of the world, the easiest way to deliver blood is via drone. So, all the things people do with the modern data stack are absolutely incredible. Like you were saying, every company's trying to be a data-driven company. What really energizes me is knowing that, for all those super great tools out there that power a business, we get to be the connective tissue — almost like the electricity — that ropes them all together and makes it so people can actually do what they need to do.
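Viraj's point that "machine learning pipelines are really just any other data pipeline" is easy to see in code. This sketch is hypothetical — the feature source, model, and quality bar are invented — but it shows an ML workflow expressed with the exact same task-and-dependency primitives as an ELT job:

```python
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule_interval="@weekly", start_date=datetime(2023, 1, 1), catchup=False)
def retrain_recommendation_model():
    """Hypothetical weekly retraining job: same primitives as any ELT pipeline."""

    @task
    def build_features():
        # In practice: query the warehouse for fresh training data.
        return {"rows": 100_000}

    @task
    def train(features):
        # In practice: fit a model and persist it to a model store.
        print(f"trained on {features['rows']} rows")
        return {"accuracy": 0.91}

    @task
    def evaluate_and_publish(metrics):
        # Only promote the new model if it beats a quality bar.
        if metrics["accuracy"] >= 0.90:
            print("publishing new model")
        else:
            print("keeping previous model")

    evaluate_and_publish(train(build_features()))


retrain_recommendation_model()
```

The "certain set of needs" he mentions — GPUs, model registries, evaluation gates — attach to individual tasks; the orchestration layer underneath is unchanged.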
>> Right. Phenomenal use cases that you just described, Raj. The variety alone of what you guys are able to do and impact is so cool. So Paola, when you're in those conversations with data engineers and data scientists, what's your pitch? Why use Astro?
>> Mm-hmm. Yeah, it's a good question. And honestly, to piggyback off of Viraj, there are so many reasons. I think what keeps me so energized is how mission-critical both our product and data orchestration are, and those use cases really are incredible; we work with customers of all shapes and sizes. But to answer your question: why use Astro? Why use our commercial products when so many people are using open source? So, the baseline for our business is that Airflow has grown exponentially over the last five years and, like we said, has become an industry standard, so we're confident there's a huge opportunity for us as a company and as a team. But we also strongly believe that being great at running Airflow doesn't make you a successful company at what you do. What makes you successful is building great products and solving the problems and pain points of your own customers. And that differentiating value isn't being amazing at running Airflow — that should be our job. So we want to free those customers from needing to do things like manage the Kubernetes infrastructure required to run Airflow, and then hiring someone full-time to go do that, which can be hard and, again, doesn't add differentiating value to your team, your product, or your customers. So: getting folks away from managing that infrastructure is the base layer. Then there are the differentiating features that make a team more productive and let them spend less time tweaking Airflow configurations and more time working with the data that they're getting from their business. There's help staying up to date with Airflow releases: we've actually been pretty quick to come out with new Airflow features and releases, and keeping up with that feature set, and working strategically with a partner to help you make the most of it, is a key part of it. And really, especially if you're an organization that's committed to using Airflow, you likely have a lot of Airflow environments across your organization. Being able to see those environments in a single place, enabling your data practitioners to create Airflow environments with the click of a button, using, for example, our command line to develop your Airflow DAGs locally and push them up to our product, and using all of the testing and monitoring and observability that we have on top of our product — it sounds so simple, especially if you use Airflow, but those are baseline value props that we have for the customers who continue to be excited to work with us. And I think we can go beyond that: we have ambitions to add a whole bunch of features and expand into different types of personas.
>> Right.
>> But really, our main value prop is for companies who are committed to Airflow and want to abstract themselves from the infrastructure and make use of some of the differentiating features that we now have at Astronomer.
>> Got it. Awesome.
>> Thank you. One thing I'll add to that, and Paola, I think you did a good job of saying it: because every company's trying to be a data company, companies are at different parts of their journey along that, right? And we want to meet customers where they are, and take them through to where they want to go. So, on one end you have folks who are like, "Hey, we're just building a data team here. We have a new initiative. We heard about Airflow. How do you help us out?" On the farther end, we have some customers that have been using Airflow for five-plus years, and they're like, "Hey, this is awesome. We have 10 more teams we want to bring on. How can you help with this? How can we do more stuff in the open source with you? How can we tell our story together?" And it's all about taking this vast community of data users everywhere, seeing where they're at, and saying, "Hey, Astro and Airflow can take you to the next place that you want to go."
>> Which is incredibly—
>> Mm-hmm.
>> ...and you bring up a great point, Viraj, that every company is somewhere in a different place on that journey. And it's complex. But it sounds to me like a lot of what you're doing is stripping away a lot of the complexity, really enabling folks to use their data as quickly as possible, so that it's relevant and they can serve up the right products and services to whoever wants them. Really incredibly important. We're almost out of time, but I'd love to get both of your perspectives on what's next for Astronomer. You've given us a great overview of what the company's doing and the value in it for customers. Paola, from your lens as one of the co-founders, what's next?
>> Yeah, I think we'll continue to cultivate that open-source community. I think we'll continue to build products that are open-sourced as part of our ecosystem. I also think that we'll continue to build products that make Airflow, and getting started with Airflow, more accessible.
So, lowering that barrier to entry to our products, whether that's price-wise or infrastructure-requirement-wise. Making it easier for folks to get started and get their hands on our product is super important for us this year. And really, for us, it's about focused execution this year, all of the core principles that we've been talking about, and continuing to invest in all of the things around our product that enable teams to use Airflow more effectively and efficiently.
>> And that efficiency piece is something everybody needs. Last question, Viraj, for you. What do you see in terms of the next year for Astronomer and for your role?
>> Yeah, I think Paola did a really good job of laying it out, so it's really hard to disagree with her on anything. I think executing is definitely the most important thing. My own personal bias is that, more than ever, it's important to really galvanize the community around Airflow, so we're going to be focusing on that a lot. We want to make it easier for our users to get our product into their hands, be they open-source users or commercial users. And last, but certainly not least, we're also really excited about data lineage, and this other open-source project under our umbrella called Open Lineage, to make it so there's a standard way for users to get lineage out of the different systems they use. When we think about what's in store for data lineage, and needing to audit the way automated decisions are being made, I think that's just such an important thing that companies are really just starting on, and I don't think a solution has emerged that ties it all together. So we think that as we grow the role of Airflow, we can also help customers solve their lineage problems, all in Astro, which is kind of the best of both worlds for us.
>> Awesome. I can definitely feel and hear the enthusiasm and the passion that you both bring to Astronomer, to your customers, and to your team. I love it. We could keep talking more and more, so you're going to have to come back. (laughing) Viraj, Paola, thank you so much for joining me today on this showcase conversation. We really appreciate your insights and all the context that you provided about Astronomer.
>> Thank you so much for having us.
>> My pleasure. For my guests, I'm Lisa Martin. You're watching this CUBE Conversation.
(soft electronic music)

Published Date : Feb 21 2023


The Future Is Built On InFluxDB


 

>> Time series data is any data that's stamped in time in some way. That could be every second, every minute, every five minutes, every hour, every nanosecond, whatever it might be. And typically that data comes from sources in the physical world, like devices or sensors: temperature gauges, batteries, any device really. Or things in the virtual world: it could be software in the cloud, or data in containers or microservices or virtual machines. So all of these items, whether in the physical or virtual world, are generating a lot of time series data. Now, time series data has been around for a long time, and there are many examples in our everyday lives. All you've got to do is punch up any stock ticker and look at its price over time in graphical form; that's a simple use case anyone can relate to. And you can build timestamps into a traditional relational database: you just add a column to capture time. As well, there are examples of log data being dumped into a data store that can be searched, captured, ingested, and visualized. Now, the problem with the latter example is that you've got to hunt and peck and search and extract what you're looking for. And the problem with the former is that traditional general-purpose databases are designed as a sort of Swiss Army knife for any workload, and there are a lot of functions that get in the way and make them inefficient for time series analysis, especially at scale. Like when you think about OT and edge scale, where things are happening super fast, ingestion is coming from many different sources, and analysis often needs to be done in real time or near real time. And that's where time series databases come in. They're purpose-built and can much more efficiently support ingesting metrics at scale and then comparing data points over time. Time series databases can write and read at significantly higher speeds and deal with far more data than traditional database methods, and they're more cost-effective: instead of throwing processing power at the problem, the underlying architecture and algorithms of time series databases can optimize queries, and they can reclaim wasted storage space and reuse it. At scale, time series databases are simply a better fit for the job. Welcome to Moving the World with InfluxDB, made possible by InfluxData. My name is Dave Vellante and I'll be your host today. InfluxData is the company behind InfluxDB, the open-source time series database designed specifically to handle time series data, as I just explained. We have an exciting program for you today, and we're going to showcase some really interesting use cases. First, we'll kick it off in our Palo Alto studios, where my colleague John Furrier will interview Evan Kaplan, the CEO of InfluxData. After John and Evan set the table, John's going to sit down with Brian Gilmore, the director of IoT and emerging tech at InfluxData, and they're going to dig into where InfluxData is gaining traction, why adoption is occurring and why it's so robust, with tons of examples and a double-click into the technology. And then we bring it back here to our East Coast studios, where I get to talk to two practitioners doing amazing things in space with satellites and modern telescopes. These use cases will blow your mind. You don't want to miss it. So thanks for being here today. And with that, let's get started. Take it away, Palo Alto.
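To make the contrast concrete, here is a small sketch of what writing and querying time-stamped points looks like with InfluxDB's Python client (the `influxdb-client` package for InfluxDB 2.x). The URL, token, org, and bucket values are placeholders you'd replace with your own:

```python
from datetime import datetime, timezone

from influxdb_client import InfluxDBClient, Point, WritePrecision
from influxdb_client.client.write_api import SYNCHRONOUS

# Placeholder connection details -- substitute your own instance and credentials.
client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")

# Each point is a measurement with tags (indexed metadata), fields (values),
# and a timestamp -- the natural shape of sensor data.
point = (
    Point("temperature")
    .tag("sensor", "device-42")
    .field("celsius", 23.1)
    .time(datetime.now(timezone.utc), WritePrecision.NS)
)
client.write_api(write_options=SYNCHRONOUS).write(bucket="telemetry", record=point)

# Query the last hour of readings with Flux, InfluxDB 2.x's query language.
tables = client.query_api().query(
    'from(bucket: "telemetry") |> range(start: -1h) '
    '|> filter(fn: (r) => r._measurement == "temperature")'
)
for table in tables:
    for record in table.records:
        print(record.get_time(), record.get_value())
```

The time-bounded `range()` in the query is the point of the episode: the database indexes and stores data by time, so "the last hour of readings" is a primitive operation rather than a full-table scan with a WHERE clause.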
>> Okay. Today we welcome Evan Kaplan, CEO of InfluxData, the company behind InfluxDB. Welcome, Evan. Thanks for coming on.
>> Hey John, thanks for having me.
>> Great segment here on the InfluxDB story. What is the story? Take us through the history. Why time series? What's the story?
>> <laugh> So the history is actually pretty interesting. Paul Dix, my partner in this and our founder, is super passionate about developers and developer experience. And he had worked on Wall Street, building a number of time series kind of platforms — trading platforms — for trading stocks. And from his point of view, it was always what he would call a yak shave, which means you had to do a ton of work just to start doing work: you had to write a bunch of extrinsic routines, a bunch of application handling on existing relational databases, in order to come up with something that was optimized for a trading platform or a time series platform. And he developed this real clear point of view: this is not how developers should work. So in 2013 he went through Y Combinator, and he made his first commit to open-source InfluxDB at the end of 2013. And, from my point of view, he basically invented modern time series, which is: you start with a purpose-built time series platform to do these kinds of workloads, and you get all the benefits of having something right out of the box, so a developer can be totally productive right away.
>> And how many people are in the company? What's the history of employees and stuff?
>> Yeah, I always forget the number, but it's something like 230 or 240 people now. I joined the company in 2016, and I loved Paul's vision. I just had a strong conviction about the relationship between time series and IoT. Because if you think about it, what sensors do is speak time series — pressure, temperature, volume, humidity, light — they're measuring, they're instrumenting something over time. And so I thought that would be super relevant over the long term, and I've not regretted it.
>> Oh no. And it's interesting at that time, go back in the history: the role of databases — well, the relational database was the one database to rule the world. And then as clouds started coming in, you started to see more databases proliferate, types of databases, and time series in particular is interesting. 'Cause real time has become super valuable from an application standpoint, and OT, which speaks time series, means something. It's like time matters.
>> Time. Yeah. And sometimes data's not worth it after its time, sometimes it is. And then you get the data lake. So you have this whole new evolution.
>> Is this the momentum? What's the momentum? I guess the question is, what's the momentum behind—
>> You mean what's causing us to grow?
>> Yeah, the time series, why is time series—
>> And the—
>> —category momentum? What's the bottom line?
>> Well, think about it from a broad frame: what everybody's trying to do is build increasingly intelligent systems, whether it's a self-driving car, or a robotic system that does what you want it to do, or a self-healing software system. Everybody wants to build increasingly intelligent systems. And in order to build these increasingly intelligent systems, you have to instrument the system well, and you have to instrument it over time, better and better.
And so you need a tool, a fundamental tool, to drive that instrumentation, and it's become clear to everybody that that instrumentation is all based on time: what happened, what's happening, what's going to happen. And so you get to these applications like predictive maintenance, or smarter systems, and increasingly you want to do that stuff not just intelligently but fast, in real time — millisecond response — so that when you're driving a self-driving car and the system realizes that you're about to do something, it can act in something that looks like real time. All systems want to be more intelligent, and they want to be more real time. And so we just happened to show up at the right time in the evolution of the market.
>> It's interesting. Near real time isn't good enough when you need real time.
>> <laugh> Yeah, it's not. And ironically, everybody wants it even when they don't need it. It's like having that feature when you buy a new television: you want that one feature even though you're not gonna use it. You decide that real time is a buying criterion.
>> So what you're saying then is near real time is getting as close to real time as possible, as fast as possible. Right. Okay. So talk about the aspect of data, 'cause we're hearing a lot of conversations on theCUBE in particular around how people are implementing and actually getting better. So, iterating on data — but you have to know when it happened to know how to fix it. This is a big part of what we're seeing, with people saying, "Hey, I wanna make my machine learning algorithms better after the fact; I wanna learn from the data." How do you see that evolving? Is that one of the use cases of sensors, as people bring data in off the network — getting better with the data, knowing when it happened?
>> Well, for sure. What you're saying is, none of this is non-linear; it's all incremental. If you take something — just as an easy example, a self-driving car — what you're doing is instrumenting that car to understand how it can perform in the real world in real time. And if you run the loop — I instrument it, I watch what happens, oh, that's wrong, I have to correct for that, I correct for it in the software — and you do that a billion times, you get a self-driving car. Every system moves along that evolution. So you get the dynamic of constantly instrumenting, watching the system behave, and correcting. A self-driving car is one thing, but even in the human genome, if you look at some of our customers — people doing solar arrays, people doing Powerwalls — all of these systems are getting smarter.
>> Well, let's get into that. What are the top applications? What are you seeing with InfluxDB, the time series? What's the sweet spot for the application use cases, and some customers? Give some examples.
>> Yeah. So it's pretty easy to understand on one side of the equation — the physical side. Sensors are getting cheap, obviously we know that, and the whole physical world is getting instrumented: your home, your car, the factory floor, your wristwatch, your healthcare, you name it. It's getting instrumented in the physical world.
We're watching the physical world in real time. And so there are three or four sweet spots for us, but they're all on that side, they're all about IoT. So think about consumer IoT projects like Google's Nest or Tado, Particle sensors, even delivery engines like Rappi, who are the Instacart of South America, anywhere there's a physical location. That's on the consumer side. Another exciting space is the industrial side: factories are changing dramatically over time, increasingly moving away from proprietary equipment to developer-driven systems that run operations, because when you're building a factory, the systems all have to get smarter. And then lastly, a lot in renewables and sustainability: Tesla, Lucid Motors, Nikola Motors, lots to do with electric cars, solar arrays, windmills, anything that's gonna get instrumented, where that instrumentation becomes part of the purpose.
>>It's interesting. The convergence of physical and digital is happening with the data, with IoT. You think of IoT, look at the use cases there: it was proprietary OT systems, now becoming more IP-enabled, internet protocol, and now edge compute getting smaller, faster, cheaper, AI going to the edge. Now you have all kinds of new capabilities that bring that real time and time series opportunity. Are you seeing IoT going to a new level? Where are the IoT dots connecting to? Because as these two cultures merge, operations, basically industrial, factory, car, they've gotta get smarter. Intelligent edge is a buzzword, but it has to be more intelligent. Where's the action in all this?
>>The action really is at the core, at the developer, because it's very hard to get an off-the-shelf system to do these kinds of physical and software interactions. So the action really happens at the developer. And what you're seeing is a movement in the world that maybe you and I grew up in, with IT or OT, moving increasingly to that developer-driven capability. All of these IoT systems are bespoke; they don't come out of the box. And so the developer, the architect, the CTO, they define: what's my business, what am I trying to do? Am I trying to sequence a human genome and figure out when these genes express themselves, or am I trying to figure out when the next heart rate monitor is gonna show up on my Apple Watch? What's the system I need to build? It starts with the developers; that's where all of the good stuff happens here, which is different than it used to be. It used to be you'd buy an application or a service or a SaaS thing, but with this dynamic, with this integration of systems, it's all about bespoke. It's all about building something.
>>So let's get to the developer real quick. The real highlight point here is the data. I could see a developer saying, okay, I need to have an application for the edge, IoT edge or car. Tesla's got applications on the car, right there. So there's the modern application lifecycle now. Take us through how this impacts the developer. Does it impact their CI/CD pipeline? Is it cloud native? Where does this all go?
>>Well, first of all, there was an internal journey that we had to go through as a company, which I think is fascinating for anybody who's interested. We went from primarily monolithic software that was open sourced to building a cloud native platform, which means we had to move from an agile development environment to a CI/CD environment. To the degree that your service is cloud, whether it's Tesla monitoring your car and updating your Powerwalls, or a solar company updating its arrays, you increasingly move from agile development to a CI/CD environment where you're shipping code to production every day. And so it's not just the developers; it's all the infrastructure to support the developers to run that service. I think that's also gonna happen in a big way.
>>With the customer base that you have now, and as you see it evolving with InfluxDB, are they gonna be writing more of the application or relying more on others? There's an open source component here. The old way was: I've got a proprietary platform running all this OT stuff, and I've gotta write an application that's general purpose. I have some flexibility, but it's somewhat brittle, maybe not a lot of robustness to it, but it does its job.
>>A good way to think about this is: what's the role of the developer slash architect slash CTO within a large enterprise or a company? I started my career in the aerospace industry <laugh>, and when you look at what Boeing does to assemble a plane, they build very, very few of the parts. Instead, what they do is assemble: they buy the engines, they build the wings, because there's a lot of tech in the wings, and they end up being smart assemblers of what ends up being a flying airplane, which is a pretty big deal even now. And what happens with software people is they have the ability to pull from the best of the open source world. So they would pull a time series capability from us, then they'd assemble that with potentially some ETL logic from somebody else, or with a Kafka interface to be able to stream the data in. And so they become very good integrators and assemblers, and they become masters of that bespoke application. I think that's where it goes, because you're not writing native code for everything.
>>So they're more flexible. They have faster time to market because they're assembling way faster, and they still get to maintain their core competency. Their wings, in this case.
>>They become increasingly not just coders, but designers and developers. They become broadly builders, is what we like to think of it: people who start and build stuff. By the way, this is no different than what the people just up the road at Google have been doing for years, or the tier ones like Amazon, building all their own.
>>Well, I think one of the things that's interesting is this idea of developing a system architecture. Systems have consequences when you make changes.
So when you have cloud, data center, on-premise, and edge working together, how does that work across the system? You can't have a wing that doesn't work with the other wing, kind of thing.
>>Exactly, and that's where that Boeing, that airplane-building analogy comes in for us. We've really been thoughtful about that, because for IoT it's critical. So our open source edge has the same API as our cloud native stuff, which has the same API as enterprise and on-prem edge. Our multiple products have the same API, and they have a relationship with each other; they can talk to each other. So the builder builds it once. And this is where, when you start thinking about the components that people have to use to build these services, you wanna make sure that at least that base layer, that database layer, those components talk to each other.
>>So I'll have to ask you, I'll put my customer hat on.
>>That means you have a PO? <laugh>
>>A big check, a blank check, if you can answer this question. I've got all this important operational stuff: my factory, my self-driving cars. This isn't trivial stuff; this is my business. How should I be thinking about time series? Because now I have to make these architectural decisions, as you mentioned, and it's gonna impact my application development. So it's a huge decision point for your customers. What should I care about the most? What's in it for me? Why is time series important?
>>Yeah, that's a great question. Chances are, if you've got a business that's 20 or 25 years old, you were already thinking about time series; you probably just didn't call it that. You built something on Oracle, or you built something on IBM's Db2, and you made it work within your system. So it's already out there: there are probably hundreds of millions of time series applications out there today. But as you start to think about this increasing need for real time, about increasing intelligence, about optimizing those systems over time, I hate the word, but digital transformation, then you start with time series. It's a foundational base layer for any system that you're gonna build. There's no system I can think of where time series shouldn't be the foundational base layer. If you just wanna store your data, leave it there, and maybe look it up every five years, that's fine, but that's not time series. Time series is when you're building a smarter, more intelligent, more real-time system. And the developers now know that, and the more they play a role in building these systems, the more obvious it becomes.
>>And since I have a PO for you and a big check: what's the value to me when I implement this? What's the end state? What does it look like when it's up and running? What's the value proposition for me?
>>So when it's up and running, you're able to handle the queries, the writing of the data, the downsampling of the data, and the transformation of it in near real time, so that the systems that depend on it, for adjusting a solar array, trading energy off a Powerwall, or some sort of human genome work, all work better. So time series is foundational.
It's not like it's doing every action above it, but it's foundational to building a really compelling, intelligent system. I think that's what developers and architects are seeing now.
>>Bottom line, final word. What's in it for the customer? What's your statement to the customer? What would you say to someone looking to do something in time series on the edge?
>>So it's pretty clear to us that if you view yourself as being in the business of building systems, and you want them to be increasingly intelligent, self-healing, autonomous, and operating in real time, you start from time series. But I also wanna say what's in it for us, Influx: what's in it for us is that people are doing some amazing stuff. I highlighted some of the energy work, some of the human genome, some of the healthcare. It's hard not to be proud, or feel like, wow, somehow I've been lucky; I've arrived at the right time, in the right place, with the right people, to be able to deliver on that. That's also exciting on our side of the equation.
>>Yeah. It's critical infrastructure, critical operations.
>>Yeah.
>>Great stuff, Evan. Thanks for coming on. I appreciate this segment. All right, in a moment, Brian Gilmore, director of IoT and emerging technology at InfluxData, will join me. You're watching theCUBE, the leader in tech coverage. Thanks for watching.
>>Time series data from sensors, systems, and applications is a key source in driving automation and prediction in technologies around the world. But managing the massive amount of timestamped data generated these days is overwhelming, especially at scale. That's why InfluxData developed InfluxDB, a time series data platform that collects, stores, and analyzes data. InfluxDB empowers developers to extract valuable insights and turn them into action by building transformative IoT, analytics, and cloud native applications, purpose-built and optimized to handle the scale and velocity of timestamped data. InfluxDB puts the power in your hands, with developer tools that make it easy to get started quickly with less code. InfluxDB is more than a database: it's a robust developer platform with integrated tooling that's written in the languages you love, so you can innovate faster. Run InfluxDB anywhere you want by choosing the provider and region that best fit your needs across AWS, Microsoft Azure, and Google Cloud. InfluxDB is fast and automatically scalable, so you can spend time delivering value to customers, not managing clusters. Take control of your time series data, so you can focus on the features and functionalities that give your applications a competitive edge. Get started for free with InfluxDB: visit influxdata.com/cloud to learn more.
>>Okay, now we're joined by Brian Gilmore, director of IoT and emerging technologies at InfluxData. Welcome to the show.
>>Thank you, John. Great to be here.
>>We just spent some time with Evan going through the company and the value proposition with InfluxDB. What's the momentum? Where do you see this coming from? What's the value coming out of this?
>>Well, I think we're hitting a point where the adoption of the technology is becoming mainstream.
We're seeing it in all sorts of organizations, everybody from the most well-funded, advanced, big technology companies to the smaller academics and the startups. And the data that emits from all that technology is time series, so being able to give them a platform, a tool that's super easy to use, easy to start with, and that will of course grow with them, has been key for us. We're riding along with them as they're successful.
>>Evan was mentioning that time series has been on everyone's radar, and in the OT business, for years. You go back to 2013, '14, even five years ago, there was that convergence of physical and digital coming together, the IP-enabled edge. Edge has always been kind of hyped up, but why now? Why is the edge so hot right now from an adoption standpoint? Is it just evolution, the tech getting better?
>>I think it's twofold. Everybody was so focused on cloud over the last ten years or so that they forgot about the compute that was available at the edge. And those, especially in OT and on the factory floor, who weren't able to take full advantage of cloud through their applications still needed to be able to leverage that compute at the edge. The big thing we're seeing now, which is interesting, is that there's a hybrid nature to all of these applications: there's definitely some data that's generated at the edge and some data that's generated in the cloud, and it's the ability for a developer to tie those two systems together and work with that data in a very unified, uniform way that's giving them the opportunity to build solutions that really deliver value to whatever it is they're trying to do, whether it's the outer reaches of outer space or optimizing the factory floor.
>>You also mentioned the genome; big data is coming to the real world. And IoT has been kind of this thing for OT and some use cases, but now, with the cloud, all companies have an edge strategy. So what's the secret sauce? Because now this is a hot product for the whole world, not just industrial, but all businesses. What's the secret sauce?
>>Well, part of it is just that the technology is becoming more capable, especially on the hardware side: compute is getting smaller and smaller. And we find that by supporting all the way down to the edge, even to the microcontroller layer, with our client libraries, and by working hard to make our applications, especially the database, as small as possible, InfluxDB can be located as close to the point of origin of that data at the edge as possible. You can run it locally, you can do your local decision making, and you can use InfluxDB as an input to the automation, control, and autonomy that people are trying to drive at the edge. But when you link it up with everything that's in the cloud, that's when you get all of the cloud-scale capabilities of parallelized AI and machine learning and all of that.
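(To make that edge pattern concrete, here is a minimal sketch of writing a sensor reading to a local edge instance of InfluxDB, using the open source influxdb-client library for Python. The URL, token, org, bucket, and field names are hypothetical placeholders, not details of any deployment discussed here.)

```python
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

# Connect to a local edge instance of InfluxDB (hypothetical URL/token/org).
client = InfluxDBClient(url="http://localhost:8086", token="EDGE_TOKEN", org="factory")
write_api = client.write_api(write_options=SYNCHRONOUS)

# One point per sample: tags identify the source, fields carry the measured
# values, and the timestamp defaults to the write time.
point = (
    Point("machine_telemetry")
    .tag("line", "assembly-3")
    .tag("sensor", "spindle-temp")
    .field("temperature_c", 61.4)
)
write_api.write(bucket="edge_raw", record=point)
client.close()
```

The same client code runs unchanged against a cloud instance; only the URL and credentials differ, which is part of what makes the edge-to-cloud story straightforward for a developer.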
>>What's interesting is the open source success, something we've talked about a lot on theCUBE: how people are leveraging it. You have users in the enterprise, users in the IoT market, but you've got developers now too. How do you see that emerging? How do developers engage? What are some of the things you're seeing developers really getting into with InfluxDB?
>>Well, there are the developers who are building companies, right? These are the startups and the folks we love to work with, who are building new services, new products, things like that; especially on the consumer side of IoT, there's a lot of that. But I think you've gotta pay attention to the enterprise developers as well. There are tons of people with the title of engineer in regular enterprise organizations. They're there for systems integration, for looking at what they would build versus what they would buy. And a lot of them come from a strong open source background: they know the communities, they know the top platforms in those spaces, and they're excited to be able to adopt and use them to optimize inside the business, as compared to just building a brand new one.
>>It's interesting too, when Evan and I were talking about open source versus closed OT systems: how do you support backward compatibility with older systems while staying open? There are dozens of data formats out there, a bunch of standards and protocols, and new things are emerging. Everyone wants a control plane; everyone wants to leverage the value of data. How do you keep track of it all? What do you support?
>>Well, either through direct connection. We have a product called Telegraf; it's unbelievable. It's open source, it's an edge agent, and you can run it as close to the edge as you'd like. It speaks dozens of different protocols in its own right, a couple of which, MQTT and OPC UA, are very applicable to these IoT use cases. But also, because we are not only open source but open in terms of our ability to collect data, we have a lot of partners who have built really great integrations from their own middleware into InfluxDB. These are companies like Kepware and HighByte, who are real experts in those downstream industrial protocols. That's a business not everybody wants to be in; it requires some very specialized, very hard work and a lot of support. By making those connections and building those ecosystems, we get the best of both worlds: the customers can use the platforms they need, up to the point where they'd be putting data into our database.
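(For a sense of what an agent like Telegraf automates, here is a hedged, hand-rolled equivalent of its MQTT input: a small Python bridge using the paho-mqtt library that forwards broker messages into InfluxDB. In practice, Telegraf's built-in MQTT consumer does this with a few lines of configuration; the broker address, topic tree, and JSON payload shape below are assumptions for illustration only.)

```python
import json

import paho.mqtt.client as mqtt
from influxdb_client import InfluxDBClient, Point

influx = InfluxDBClient(url="http://localhost:8086", token="EDGE_TOKEN", org="factory")
write_api = influx.write_api()  # defaults to batching writes in the background

def on_message(client, userdata, msg):
    # Assume each message carries a JSON payload like {"temperature_c": 61.4}.
    payload = json.loads(msg.payload)
    point = (
        Point("machine_telemetry")
        .tag("topic", msg.topic)
        .field("temperature_c", float(payload["temperature_c"]))
    )
    write_api.write(bucket="edge_raw", record=point)

# paho-mqtt 1.x style client setup; hypothetical local broker and topic tree.
client = mqtt.Client()
client.on_message = on_message
client.connect("localhost", 1883)
client.subscribe("factory/line3/#")
client.loop_forever()
```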
>>What are some of the customer testimonials they share with you? Can you share some anecdotes, like, wow, that's the best thing I've ever used, this really changed my business, or this is great tech that's helped me in these other areas? What are some of the soundbites you hear from customers when they're successful?
>>Yeah, it ranges. You've got customers who are just finally being able to do the monitoring of assets at the edge, in the field. We have a customer who has these tunnel boring machines that go deep into the earth to drill tunnels for cars and trains and things like that. They're just excited to be able to stick a database onto those tunnel boring machines, send them into the depths of the earth, and know that when they come out, all of that telemetry, at a very high frequency, has been safely stored, and it can then very quickly and instantly connect up to their centralized database. Just having that visibility is brand new to them, and that's super important. On the other hand, we have customers who are way beyond the monitoring use case, who are actually using the historical records in the time series database to, as I think Evan mentioned, forecast things. So for predictive maintenance: being able to pull in the telemetry from the machines, but then also all of that external enrichment data, the metadata, the temperatures, the pressures, who is operating the machine, those types of things, and being able to easily integrate with platforms like Jupyter notebooks and all of those scientific computing and machine learning libraries to build the models, train the models, and then send that information back down to InfluxDB to apply it and detect those anomalies.
>>I think that's gonna be a hot area, because if you look at AI right now, it's all about training the machine learning algorithms after the fact. So time series becomes hugely important: the data matters post time, first time, and then it gets updated at the new time. It's constant data cleansing, data iteration, data programming. We're starting to see this new use case emerge in the data field.
>>Yeah, I agree. It's the ability to handle those pipelines of data smartly, intelligently, and to do all of the things you need to do with that data in stream, before it hits your central repository. And we make that really easy with Telegraf: not only does it have the inputs to connect up to all of those protocols and the ability to capture and connect up to the partner data, but it also has a whole bunch of capabilities for processing that data: enriching it, reformatting it, routing it, whatever you need. So at that point you're shaping your data in exactly the way you wanna do it, routing it to different destinations, and that's not something that has really been in the realm of possibility until this point.
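(One common form of that in-stream shaping is downsampling. As a hedged sketch, the Flux query below, run here through the Python client but equally usable as a scheduled task, rolls raw telemetry up into one-minute means and writes the result to a second bucket. The bucket and measurement names are hypothetical.)

```python
from influxdb_client import InfluxDBClient

# Roll high-rate raw telemetry up into 1-minute means and store it separately.
downsample = '''
from(bucket: "edge_raw")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "machine_telemetry")
  |> aggregateWindow(every: 1m, fn: mean)
  |> to(bucket: "edge_rollup")
'''

with InfluxDBClient(url="http://localhost:8086", token="EDGE_TOKEN", org="factory") as client:
    client.query_api().query(downsample)
```

Registered as a recurring task, the same Flux runs continuously, so the rollup bucket stays current without any application code in the loop.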
>>When Evan was on, it was great: he was the CEO, so he sees the big picture with customers. He kind of put the package together that said, hey, we've got a system, we've got customers, people want to leverage our product. He's selling too, as well. So you have that whole CEO perspective, but he brought up this notion that there are multiple personas involved in the InfluxDB system: architects, developers, and users. Can you talk about that reality as customers start to commercialize and operationalize this? You've got a relationship to the cloud, and the edge is getting super important, but cloud brings a lot of scale to the table. So what is the relationship to the cloud? Can you share your thoughts on edge and its relationship to the cloud?
>>Yeah. You can think of the edge really as the local information: it's generally compartmentalized to a single asset or a single factory line, whatever. What people want is to make the decisions there at the edge, locally and quickly, minus the latency of taking that large volume of data, shipping it to the cloud, and doing something with it there. So we allow them to do exactly that. Then what they can do is downsample that data, or detect the really important metrics or the anomalies, and ship those to a central database in the cloud, where they can do all sorts of really interesting things: you get that centralized view of all of your global assets, you can start to compare asset to asset, and then you can do the things we talked about, like predictive analytics or larger-scale anomaly detection.
>>So in this model you have a lot of commercial operations, industrial equipment, the physical plant, the physical business, with virtual data and cloud all coming together. What's the future for InfluxDB from a tech standpoint? Because you've got open source, there's an ecosystem there, and you have customers who want operational reliability.
>>Yeah. I mean, we got iPhones when everybody was waiting for flying cars, so I don't know that we can absolutely perfectly predict what's coming, but I think there are some givens, and those givens are that the world is only gonna become more hybrid. We're going to have much more widely distributed situations where you have data being generated in the cloud, data being generated at the edge, and data generated at all points in between, physical locations as well as things that are very virtual. And we're building some technology right now that's going to allow the concept of a database to be much more fluid and flexible, more aligned with what a file would be like. So being able to move data to the compute for analysis, or move the compute to the data for analysis, those are the types of solutions we'll be bringing to customers over the next little bit. But I also think we have to start thinking about what happens when the edge is actually off the planet. We've got customers, and you're gonna talk to two of them in the panel, who are actually working with data that comes from outside the earth, either in low earth orbit or all the way on the other side of the universe. To be able to process data like that, and to do it well, we've gotta build the fundamentals right now, on the factory floor and in the mines and in the tunnels, so that we'll be ready for that one.
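(A hedged sketch of that edge-to-cloud movement: query recent rollups from a local edge instance and forward them to a central cloud bucket. Both endpoints, the tokens, the added site tag, and the bucket names are hypothetical placeholders, not a prescribed architecture.)

```python
from influxdb_client import InfluxDBClient

edge = InfluxDBClient(url="http://edge-gateway:8086", token="EDGE_TOKEN", org="factory")
cloud = InfluxDBClient(url="https://cloud.example.com", token="CLOUD_TOKEN", org="hq")

# Pull the last five minutes of one-minute means from the edge rollup bucket...
tables = edge.query_api().query('''
from(bucket: "edge_rollup")
  |> range(start: -5m)
  |> filter(fn: (r) => r._measurement == "machine_telemetry")
''')

# ...and forward each record to the central cloud bucket for fleet-wide views.
write_api = cloud.write_api()
for table in tables:
    for rec in table.records:
        write_api.write(bucket="fleet_rollup", record={
            "measurement": rec.get_measurement(),
            "tags": {"site": "plant-3"},  # hypothetical site tag added in transit
            "fields": {rec.get_field(): rec.get_value()},
            "time": rec.get_time(),
        })

edge.close()
cloud.close()
```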
>>I think you bring up a good point there, because one of the things that's common in the industry right now, this kind of new thinking, is that hyperscalers have always been built up by full-stack developers. Even in the old OT world, as Evan was pointing out, they built everything. And the world's going to more assembly, with core competency and intellectual property being the core of their apple, so faster assembly and building, but also integration. You've got all this new stuff happening, and that's to separate out the data complexity from the app. So space, the genome, driving cars, they all throw off massive data.
>>It does.
>>So is Tesla, is the car the same as the data layer?
>>It's certainly a point of origin. The thing we wanna do is let the developers work on the world-changing problems, the things they're trying to solve, whether it's energy or health or whatever other challenges these teams are building against, and we'll worry about the time series data and the underlying data platform so that they don't have to. As you talked about, for them to be able to adopt the platform quickly and integrate it with their data sources and the other pieces of their applications, it's going to give them much faster time to market on these products. It's gonna allow them to be more iterative, to do more testing, and ultimately it will accelerate the adoption and the creation of technology.
>>You mentioned earlier in our talk the unification of data. How about APIs? Developers love APIs in the cloud, unifying APIs. How do you view that?
>>Yeah, we are APIs; that's the product itself. People like to think of it as having this nice front end, but the front end is built on our public APIs. They allow the developer to build all of those hooks for not only data creation, but then data processing, data analytics, and then data extraction, to bring it to other platforms or other applications, microservices, whatever it might be. So it is a world of APIs right now, and we bring a very useful set of them for managing the time series data these guys are all challenged with.
>>It's interesting. You and I were talking before we came on camera about how data feels like it's gonna have the kind of SRE role that DevOps had: site reliability engineers managing a bunch of servers. There's so much data out there now.
>>Yeah, it's like wrangling data, for sure. And I think one of the best jobs on the planet is gonna be that data wrangler: someone who understands what the data sources are, what the data formats are, how to efficiently move that data from point A to point B, and how to process it correctly, so that the end users of that data aren't doing any of that hard upfront preparation, collection, and storage work.
>>Yeah, that's data as code. Data engineering is becoming a new discipline, for sure, and the democratization is the benefit, to everyone. Data science gets easier. But they wanna make it easy, right? <laugh> They wanna do the analysis.
>>Right. Yeah.
I think it's a really good point. We try to give our users as many ways as possible to get data in and get data out; we think about it as meeting them where they are. So we build the client libraries that let them write to us directly from the applications and the languages they're writing in, but then they can also pull it out. And at that point, nobody's gonna know the users, the end consumers of that data, better than the people who are building those applications, so they're building these user interfaces that make all of that data accessible for their end users inside their organization.
>>Well, Brian, great segment, great insight. Thanks for sharing all the complexities in IoT that you guys help take away with the APIs and assembly and all the system architectures that are changing. Edge is real, cloud is real, mainstream enterprises, and you've got developer traction too, so congratulations.
>>Yeah, it's great.
>>Any last word you wanna share?
>>No, just: please, if you're gonna check out InfluxDB, download it, try out the open source, and contribute if you can. That's a huge thing; it's part of being in the open source community. But definitely just use it. I think once people use it and try it out, they'll understand very, very quickly.
>>So open source, with developers, enterprise, and edge coming together, all together. You're gonna hear more about that in the next segment, too. Thanks for coming on.
>>Okay. Thanks.
>>When we return, Dave Vellante will lead a panel on edge and data with InfluxDB. You're watching theCUBE, the leader in high tech enterprise coverage.
>>We're a startup; we move really fast, and we find that InfluxDB can move as fast as us. It's just a great group, very collaborative, very interested in manufacturing, and we see a bright future in working with Influx. My name is Aaron Seley. I'm the CTO at HighByte. HighByte's one of the first companies to focus on manufacturing data and apply the concepts of DataOps: treat that data as an asset, deliver it to the IT system, and enable applications like overall equipment effectiveness that can help the factory produce better, smarter, faster. Time series data in manufacturing is really important. If you take a piece of equipment, you have the temperature and pressure at that moment that you can look at to see the state of what's going on. Without that context and understanding, you can't do what manufacturers ultimately want to do, which is predict the future.
>>InfluxDB represents kind of a new way to store time series data, with more advanced and, more importantly, more open technologies. The other thing that Influx does really well is that once the data's in Influx, it's very easy to get out: they have a modern REST API and other ways to access the data, which would be much more difficult to do with integrations against classic historians. HighByte can serve to model data and aggregate data on the shop floor from a multitude of sources, whether that be OPC UA servers, manufacturing execution systems, ERP, et cetera, and then push that seamlessly into Influx to then be able to run calculations. Manufacturing is changing; this is Industry 4.0, and what we're seeing is Influx being part of that equation.
It's being used to store data off the unified namespace. We recommend InfluxDB all the time to customers who are exploring a new way to share manufacturing data, called the unified namespace, and who have open questions around: how do I share this new data coming through my UNS or my MQTT broker? How do I store it and query it over time? We often point to Influx as a solution for that. It's a great brand, a great group of people, and a great technology.
>>Okay, we're now going to go into the customer panel, and we'd like to welcome Angelo Fausti, who's a software engineer at the Vera C. Rubin Observatory, and Caleb McLaughlin, who's a senior spacecraft operations software engineer at Loft Orbital. Guys, thanks for joining us. Folks, you don't wanna miss this interview. Caleb, let's start with you. You work for an extremely cool company; you're launching satellites into space. Of course, doing that is highly complex and not a cheap endeavor. Tell us about Loft Orbital and what you guys do to attack that problem.
>>Yeah, absolutely, and thanks for having me here, by the way. So Loft Orbital is a company, a series B startup now, whose mission basically is to provide rapid access to space for all kinds of customers. Historically, if you want to fly something in space, do something in space, it's extremely expensive: you need to book a launch, build a bus, hire a team to operate it, have big software teams, and then eventually worry about a lot of very specialized engineering. What we're trying to do is change that from a super specialized problem with an extremely high barrier to access into an infrastructure problem, so that getting your programs, your mission, deployed on orbit, with access to different sensors, cameras, radios, stuff like that, is almost as simple as deploying a VM in AWS or GCP. So that's kind of our mission. And just to give a really brief example of the kind of customer we can serve: there's a really cool company called Totum Labs, who is working on building an IoT constellation, for the Internet of Things, basically being able to get telemetry from all over the world. They're the first company to demonstrate indoor IoT, which means you have this little modem inside a container that you can track from anywhere in the world as it's going across the ocean. So it's really little, and they've been able to stay a small startup focused on their product, which is that super crazy complicated, cool radio, while we handle the whole space segment for them, which before Loft was really impossible. So that's our mission: providing space infrastructure as a service. We're kind of groundbreaking in this area, and we're serving a huge variety of customers with all kinds of different missions, and obviously generating a ton of data in space that we've gotta handle.
>>Amazing, Caleb, what you guys do. Now, I know you were lured to the skies very early in your career, but how did you land on this business?
>>Yeah, you know, some people don't necessarily know what they wanna do early in their life. For me, I was five years old and I knew I wanted to be in the space industry.
So I started in the Air Force, but I've stayed in the space industry my whole career, and this is actually the fifth space startup that I've been a part of. I started out in satellites, spent some time working in the launch industry on rockets, and now I'm back in satellites. And honestly, this is the most exciting of the space startups I've been a part of.
>>Super interesting. Okay, Angelo, let's talk about the Rubin Observatory. Vera C. Rubin: famous woman scientist, galaxy guru. Now, you guys at the observatory are way up high, and you're gonna get a good look at the southern sky. I know COVID slowed you down a bit, but no doubt you continued to code away on the software. I know you're getting close; you've gotta be super excited. Give us the update on the observatory and your role.
>>All right. So yeah, Rubin is a state-of-the-art observatory under construction on a remote mountain in Chile. With Rubin, we conduct the Legacy Survey of Space and Time: we are going to observe the sky with an eight-meter optical telescope and take a thousand pictures every night with a 3.2-gigapixel camera, and we are going to do that for 10 years, which is the duration of the survey.
>>Yeah, amazing project. Now, you're a doctor of philosophy, so you probably spent some time thinking about what's out there, and then you went on to earn a PhD in astronomy, in astrophysics. So this is something you've been working on for the better part of your career, isn't it?
>>Yeah, that's right, about 15 years. I studied physics in college, then I got a PhD in astronomy, and I worked for about five years on another project, the Dark Energy Survey, before joining Rubin in 2015.
>>Impressive. So it seems like you both, your organizations, are looking at space from two different angles. One thing you both have in common, of course, is software, and you both use InfluxDB as part of your data infrastructure. How did you discover InfluxDB and get into it? How do you use the platform? Maybe Caleb, you could start.
>>Yeah, absolutely. The first company where I extensively used InfluxDB was a launch startup called Astra. We were in the process of designing our first generation rocket there and testing the engines, pumps, everything that goes into a rocket. When I joined the company, our data story was not very mature: we were collecting a bunch of data in LabVIEW, and engineers were taking that over to MATLAB to process it. At first, that's the way a lot of engineers and scientists are used to working, and people weren't entirely sure that it needed to change. But the nice thing about InfluxDB is that it's so easy to deploy, so our software engineering team was able to get it deployed and up and running very quickly, and then also quickly backport all of the data we had collected thus far into Influx. And what happened next was amazing to see: the super cool moment came when we hooked Influx up to Grafana, the visualization platform we used with Influx, because it works really well with it.
There was this aha moment for our engineers, who were used to that post-process method of dealing with their data: they could almost instantly and easily discover data that they hadn't been able to see before, take the manual processes that they would run after a test and throw them all into Influx, and have live data as tests were running. I saw them implementing crazy rocket-equation-type stuff in Influx, and it was totally game changing for how we tested.
>>So Angelo, as I was explaining in my open, you could add a column in a traditional RDBMS and do time series, but with the volume of data that you're talking about, and the example Caleb just gave, you have to have a purpose-built time series database. Where did you first learn about InfluxDB?
>>Yeah, correct. So I work with the data management team, and my first project was to record metrics that measured the performance of our software, the software we use to process the data. I started implementing that in a relational database, but then I realized that I was in fact dealing with time series data, and I should really use a solution built for that. So I started looking at time series databases, and I found InfluxDB. That was back in 2018. Another use of InfluxDB that I'm also interested in is the visits database. If you think about the observations, we are moving the telescope all the time, pointing it in specific directions in the sky and taking pictures every 30 seconds. That itself is a time series, and every point in that time series we call a visit. We want to record the metadata about those visits in InfluxDB. That time series is going to be 10 years long, with about 1,000 points every night. It's actually not too much data compared to other problems; it's really just a different time scale.
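(As a hedged illustration of what recording that per-visit metadata could look like with the Python client: one point per visit, timestamped at the exposure. The measurement, tag, and field names below are invented for illustration, not the observatory's actual schema.)

```python
from datetime import datetime, timezone

from influxdb_client import InfluxDBClient, Point, WritePrecision
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url="http://localhost:8086", token="TOKEN", org="observatory")
write_api = client.write_api(write_options=SYNCHRONOUS)

# One point per visit: the timestamp is the exposure time, while tags and
# fields capture where the telescope was pointing and what it recorded.
visit = (
    Point("visit")
    .tag("filter_band", "r")              # hypothetical filter name
    .field("visit_id", 20421)             # hypothetical visit counter
    .field("ra_deg", 150.117)             # pointing: right ascension
    .field("dec_deg", -2.205)             # pointing: declination
    .field("exposure_s", 30.0)            # exposure length
    .time(datetime.now(timezone.utc), WritePrecision.NS)
)
write_api.write(bucket="visits", record=visit)
client.close()
```

At roughly 1,000 visits a night for ten years, that is only a few million points, which, as Angelo says, is small for a time series database; the challenge is the time scale, not the volume.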
>>The telescope at the Rubin Observatory is, pun intended, the star of the show. I believe I read that it's gonna be the first of the next-gen telescopes to come online. It's got this massive field of view, like three orders of magnitude beyond Hubble's widest camera view, which is amazing: that's like 40 moons in an image. It's amazingly fast as well. What else can you tell us about the telescope?
>>This telescope has to move really fast, and it also has to carry the primary mirror, which is an eight-meter piece of glass, very heavy, and a camera about the size of a small car. The whole structure weighs about 300 tons, and for that to work, the telescope needs to be very compact and stiff. One thing that's amazing about its design is that this 300-ton structure sits on a tiny film of oil with the thickness of a human hair, which makes an almost zero-friction interface; in fact, a few people can move this enormous structure with only their hands. As you said, another aspect that makes this telescope unique is the optical design. It's a wide-field telescope: each image has, in diameter, the size of about seven full moons, and with that we can map the entire sky in only three days. And of course, in operations everything is controlled by software and is automatic. There's a very complex piece of software called the scheduler, which is responsible for moving the telescope, and the camera, which is recording 15 terabytes of data every night.
>>Hmm. And Angelo, all this data lands in InfluxDB?
>>Actually, no. We are using InfluxDB to record engineering data and metadata about the observations, like telemetry, events, and commands from the telescope. That's a much smaller data set compared to the images, but it is still challenging, because you have some high-frequency data that the system needs to keep up with, and we need to store this data and have it around for the lifetime of the project.
>>Got it. Thank you. Okay, Caleb, let's bring you back in. You've got these dishwasher-size satellites, and you're kind of using a multi-tenant model, which I think is genius. Tell us about the satellites themselves.
>>Yeah, absolutely. So we have some satellites in space already that, as you said, are dishwasher, mini-fridge kind of size, and we're working on a bunch more in a variety of sizes, from shoebox to a few times larger than what we have today. And we do shoot to have effectively something like a multi-tenant model, where we buy a bus off the shelf. The bus is what you can think of as the core piece of the satellite, almost like a motherboard: it's providing the power, it has the solar panels, it has some radios attached to it, and it handles the attitude control, basically steering the spacecraft in orbit. And then we also build, in house, what we call our payload hub, which has all the customer payloads attached and our own kind of edge processing capabilities built into it. We integrate that, we launch it, and because those things are in low orbit, they're orbiting the earth every 90 minutes. That's seven kilometers per second, several times faster than a speeding bullet. One of the unique challenges of operating spacecraft in low orbit is that generally you can't talk to them all the time, so we're managing these things through very brief windows of time, where we get to talk to them through our ground sites, either in Antarctica or up in the north pole region.
>>Talk more about how you use InfluxDB to make sense of this data, through all this tech that you're launching into space.
>>Previously, when I joined the company, we started off storing all of that, as Angelo did, in a regular relational database, and we found that it was so slow, and the size of our data would balloon over the course of a couple of days to the point where we weren't able to even store all of the data we were getting. So we migrated to InfluxDB to store our time series telemetry from the spacecraft: things like power levels, voltages, currents, counts, whatever metadata we need to monitor about the spacecraft, we now store in InfluxDB. And now we can actually easily store the entire volume of data for the mission life so far, without having to worry about the size bloating to an unmanageable amount. We can also seamlessly query large chunks of data.
For example, as an operator I might wanna see how my battery state of charge is evolving over the course of the year. I can have a plot in Influx that loads a year's worth of data in a fraction of a second, because it can intelligently group the data by a sliding time interval. So it's been extremely powerful for us to access the data, and as time has gone on, we've gradually migrated more and more of our operating data into Influx.
>>Let's talk a little bit about this term we throw around a lot: data driven. A lot of companies say, oh yes, we're data driven, but you guys really are. You've got data at the core. Caleb, what does that mean to you?
>>Yeah, the clearest example of where I saw this be totally game changing is what I mentioned before at Astra, where our engineers' feedback loop went from a lot of slow research, digging into the data, to almost instantaneously seeing the data and making decisions based on it immediately, rather than having to wait for some processing. And that's something I've also seen echoed in my current role. To give another practical example: as I said, we have a huge amount of data that comes down every orbit, and we need to be able to ingest all of that data almost instantaneously and provide it to the operator in near real time. About a second's worth of latency is all that's acceptable for us to react to what's coming down from the spacecraft, and building that pipeline is challenging from a software engineering standpoint. Our primary language is Python, which isn't necessarily that fast. So, in the goal of being data driven, what we've done is publish metrics on how individual pieces of our data processing pipeline are performing into Influx as well, and we do that in production as well as in dev, so we have a kind of production monitoring flow. What that has done is allow us to make intelligent decisions on our software development roadmap: where it makes the most sense to focus our development efforts in terms of improving our software efficiency, just because we have that visibility into where the real problems are. Before we started doing this, we sometimes found ourselves chasing rabbits that weren't necessarily the real root cause of the issues we were seeing. Now that we're being more data driven, we are much more effective in where we spend our resources and our time, which is especially critical to us as we scale from supporting a couple of satellites to supporting many, many satellites at once.
>>So you reduced those dead ends. Maybe Angelo, you could talk about what data driven means to you and your teams?
>>I would say that having real-time visibility into the telemetry data and metrics is crucial for us. We need to make sure that the images we collect with the telescope have good quality and are within the specifications to meet our science goals. If they are not, we want to know that as soon as possible and start fixing problems.
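(The kind of year-long, windowed query Caleb describes might look something like the hedged Flux sketch below; grouping by a window with aggregateWindow is what keeps a year of telemetry fast to load and plot. The bucket, measurement, and field names are hypothetical.)

```python
from influxdb_client import InfluxDBClient

# A year of battery state of charge, averaged into daily points, so the
# plot stays light even though the raw telemetry is far denser.
flux = '''
from(bucket: "spacecraft_telemetry")
  |> range(start: -1y)
  |> filter(fn: (r) => r._measurement == "eps" and r._field == "battery_soc")
  |> aggregateWindow(every: 1d, fn: mean)
'''

with InfluxDBClient(url="http://localhost:8086", token="TOKEN", org="ops") as client:
    for table in client.query_api().query(flux):
        for rec in table.records:
            print(rec.get_time(), rec.get_value())
```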
>>Caleb, what are your event intervals like?
>>I would say that, as of today on the spacecraft, the level of timing we deal with probably tops out at about 20 hertz, 20 measurements per second, on things like our gyroscopes. But the core point here, the ability to have high-precision data, is extremely important for these kinds of scientific applications, and I'll give an example from when I worked on the rocket at Astra. There, our baseline data rate for ingesting data during a test was 500 hertz, so 500 samples per second, and in some cases we would actually need to ingest much higher-rate data, even up to 1.5 kilohertz. So extremely high-precision data, where timing really matters a lot. One of the really powerful things about Influx is the fact that it can handle this; that's one of the reasons we chose it. There are times when you're looking at the results of a firing and you're zooming in. I talked earlier about how in my current job we often zoom out to look at a year's worth of data; here you're zooming in to where your screen is occupied by a tiny fraction of a second. And you need to see, same as Angelo just said, not just the actual telemetry, which is coming in at a high rate, but the events coming out of our controllers. That can be something like: hey, I opened this valve at exactly this time. And we wanna have that at microsecond or even nanosecond precision, so that we know: okay, we saw a spike in chamber pressure at this exact moment; was that before or after this valve opened? That kind of visibility is critical in these scientific applications, and it's absolutely game changing to be able to see it in near real time, with a really easy way for engineers to visualize the data themselves without having to wait for software engineers to go build it for them.
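(Sustaining ingest at hundreds of samples per second from Python usually means batching writes rather than posting one point per request. Here is a hedged sketch using the influxdb-client library's WriteOptions, with nanosecond timestamps; the batch settings and names are illustrative assumptions, not Astra's or Loft's actual configuration.)

```python
import time

from influxdb_client import InfluxDBClient, Point, WritePrecision
from influxdb_client.client.write_api import WriteOptions

client = InfluxDBClient(url="http://localhost:8086", token="TOKEN", org="ops")

# Buffer high-rate points in memory and flush them in the background,
# instead of issuing one HTTP request per sample.
write_api = client.write_api(
    write_options=WriteOptions(batch_size=5_000, flush_interval=1_000)  # flush every second
)

for sample in range(500):  # stand-in for one second of a 500 Hz sensor loop
    point = (
        Point("engine_test")
        .tag("sensor", "chamber_pressure")
        .field("value", 101.3 + sample * 0.01)
        .time(time.time_ns(), WritePrecision.NS)  # nanosecond-precision timestamp
    )
    write_api.write(bucket="test_stand", record=point)

write_api.close()  # flush anything still buffered
client.close()
```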
>>Can the scientists do self-serve, or do you have to design and build all the analytics and queries for your scientists?
>>Well, from my perspective, that's absolutely one of the best things about Influx, and what I've seen be game changing is that generally anyone can learn to use it. Honestly, most of our users might not even know they're using Influx, because the interface we expose to them is Grafana, a generic open source graphing library that is very similar to Influx's own Chronograf. It provides an almost, well, a very intuitive UI for building your queries: you choose a measurement, and it shows a dropdown of available measurements; then you choose the particular field you wanna look at, and again, that's a dropdown. So it's really easy for our users to discover, and there are point-and-click options for doing math and aggregations; you can even do predictions, all within the Grafana user interface, which is really just a wrapper around the APIs and functionality that Influx provides.
>>Putting data in the hands of those who have the context, the domain experts, is key. Angelo, is it the same situation for you? Is it self-serve?
>>Yeah, correct. As I mentioned before, we have the astronomers making their own dashboards, because they know exactly what they need to visualize. It's all about using the right tool for the job, I think. For us, when I joined the company, we weren't using InfluxDB, and we were dealing with serious issues of the database growing to an incredible size extremely quickly; even querying short periods of data was taking on the order of seconds, which is just not possible for operations.
>>Guys, this has been really informative. It's pretty exciting to see how the edge is mountaintops, low orbits; space is the ultimate edge, isn't it? I wonder if you could answer two questions to wrap here: what comes next for you guys, and is there something you're really excited about that you're working on? Caleb, maybe you could go first, and Angelo, you can bring us home.
>>Basically, what's next for Loft Orbital is more satellites and a greater push towards infrastructure, really making our mission happen, which is to make space simple for our customers and for everyone. We're scaling the company like crazy now to make that happen, and it's an extremely exciting time to be in this company and in this industry as a whole, because there are so many interesting applications out there, so many cool ways of leveraging space that people are taking advantage of. With companies like SpaceX and the now rapidly lowering cost of launch, it's just a really exciting place to be. We're launching more satellites, we're scaling up for some constellations, and our ground system has to be improved to match, so there are a lot of improvements we're working on to really scale up our control software to be best in class and make it capable of handling such a large workload.
>>You guys hiring?
>><laugh> We are absolutely hiring. We have positions all over the company: we need software engineers, we need people who do more aerospace-specific stuff. So absolutely, I'd encourage anyone to check out the Loft Orbital website if this is at all interesting.
>>All right. Angelo, bring us home.
>>Yeah. So what's next for us is really getting this telescope working and collecting data. When that happens, it's going to be a deluge of data coming out of this camera, and handling all that data is going to be really challenging. I wanna be here for that. <laugh> Looking forward, next year we have an important milestone, which is our commissioning camera, a simplified version of the full camera, going on sky, and most of the system has to be working by then.
>>Nice. All right, guys, with that we're gonna end it. Thank you so much; really fascinating. And thanks to InfluxDB for making this possible: really groundbreaking stuff, enabling value creation at the edge, in the cloud, and of course beyond, in space. Really transformational work that you guys are doing, so congratulations, and I really appreciate the broader community. I can't wait to see what comes next from this entire ecosystem. Now, in a moment, I'll be back to wrap up. This is Dave Vellante, and you're watching theCUBE, the leader in high tech enterprise coverage.
>> Telegraf is a popular open-source data collection agent. Telegraf collects data from hundreds of systems, like IoT sensors, cloud deployments, and enterprise applications. It's used by everyone from individual developers and hobbyists to large corporate teams. The Telegraf project has a very welcoming and active open-source community. Learn how to get involved by visiting the Telegraf GitHub page. Whether you want to contribute code, improve documentation, participate in testing, or just show what you're doing with Telegraf, we'd love to hear what you're building. >> Thanks for watching "Moving the World with InfluxDB," made possible by InfluxData. I hope you learned some things and are inspired to look deeper into where time series databases might fit into your environment. If you're dealing with large and/or fast data volumes, you want to scale cost-effectively with the highest performance, and you're analyzing metrics and data over time, time series databases just might be a great fit for you. Try InfluxDB out; you can start with a free cloud account by clicking on the link in the resources below. Remember, all these recordings are going to be available on demand at thecube.net and influxdata.com, so check those out, and poke around InfluxData: they're the folks behind InfluxDB and one of the leaders in the space. We hope you enjoyed the program. This is Dave Vellante for theCUBE. We'll see you soon.
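One common way to feed Telegraf from your own code, without pulling in a client library at all, is to emit InfluxDB line protocol at a socket it listens on. The sketch below assumes a Telegraf instance configured with a socket_listener input on UDP port 8094; that configuration, and the metric names, are assumptions for illustration.

import socket
import time

# Assumes telegraf.conf contains a socket_listener input, e.g.
#   [[inputs.socket_listener]]
#   service_address = "udp://:8094"
# (hypothetical local setup). The line below follows InfluxDB line protocol:
# measurement,tag=value field=value timestamp
line = f"app_metrics,host=dev01 requests=42i,latency_ms=3.7 {time.time_ns()}"

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(line.encode("utf-8"), ("127.0.0.1", 8094))
sock.close()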

Published Date : May 12 2022


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Brian Gilmore | PERSON | 0.99+
John | PERSON | 0.99+
Angela | PERSON | 0.99+
Evan | PERSON | 0.99+
2015 | DATE | 0.99+
SpaceX | ORGANIZATION | 0.99+
2016 | DATE | 0.99+
Dave Valante | PERSON | 0.99+
Antarctica | LOCATION | 0.99+
Boeing | ORGANIZATION | 0.99+
Caleb | PERSON | 0.99+
10 years | QUANTITY | 0.99+
Chile | LOCATION | 0.99+
Brian | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
Evan Kaplan | PERSON | 0.99+
Aaron Seley | PERSON | 0.99+
Angelo Fasi | PERSON | 0.99+
2013 | DATE | 0.99+
Paul | PERSON | 0.99+
Tesla | ORGANIZATION | 0.99+
2018 | DATE | 0.99+
IBM | ORGANIZATION | 0.99+
Google | ORGANIZATION | 0.99+
two questions | QUANTITY | 0.99+
Caleb McLaughlin | PERSON | 0.99+
40 moons | QUANTITY | 0.99+
two systems | QUANTITY | 0.99+
two | QUANTITY | 0.99+
Angelo | PERSON | 0.99+
230 | QUANTITY | 0.99+
300 tons | QUANTITY | 0.99+
three | QUANTITY | 0.99+
500 Hertz | QUANTITY | 0.99+
3.2 gig | QUANTITY | 0.99+
15 terabytes | QUANTITY | 0.99+
eight meter | QUANTITY | 0.99+
two practitioners | QUANTITY | 0.99+
20 Hertz | QUANTITY | 0.99+
25 years | QUANTITY | 0.99+
Today | DATE | 0.99+
Palo Alto | LOCATION | 0.99+
Python | TITLE | 0.99+
Oracle | ORGANIZATION | 0.99+
Paul dicks | PERSON | 0.99+
First | QUANTITY | 0.99+
iPhones | COMMERCIAL_ITEM | 0.99+
first | QUANTITY | 0.99+
earth | LOCATION | 0.99+
240 people | QUANTITY | 0.99+
three days | QUANTITY | 0.99+
apple | ORGANIZATION | 0.99+
AWS | ORGANIZATION | 0.99+
HBI | ORGANIZATION | 0.99+
Dave LAN | PERSON | 0.99+
today | DATE | 0.99+
each image | QUANTITY | 0.99+
next year | DATE | 0.99+
cube.net | OTHER | 0.99+
InfluxDB | TITLE | 0.99+
one | QUANTITY | 0.98+
1000 points | QUANTITY | 0.98+

Amir Kaltak, Lexit | Polycon 2018


 

(bubbly electronic music) >> Narrator: Live from Nassau in the Bahamas, it's theCUBE, covering Polycon 18. Brought to you by Polymath. >> Okay, welcome back everyone. We're live here in the Bahamas. This is theCUBE's exclusive coverage of the token economics world: cryptocurrency, blockchain, the new innovation that's changing the world. And of course we're on the ground floor, day two of coverage. Our next guest is Amir Kaltak, CEO and founder of Lexit, as in legalized exit, a legit exit. He's automating the M&A process in a decentralized way. This is exactly the kind of value we see with cloud computing: when you see automation and efficiencies, that's disruptive. Amir, congratulations on your awesome venture. Love your model. Let's get into the details, because I think... >> Thank you. >> You're demonstrating, in my opinion, where value is being created and then ultimately captured faster and more efficiently, because you're automating the M&A process for people to get an exit in a highly volatile value-creation, value-capture world. Take a minute to explain your company. This is fantastic. >> Thank you for this kind intro. Hello, world. Lexit, in a nutshell, is M&A on blockchain, and I hope you guys will love it. What we do is give you access to the world of M&A, which is currently a big boys' club, and we want all of you to be participating in it: from the small entrepreneur who just started out, who crashed his startup but created great tech (he can sell it on Lexit, and the whole world is going to see it), to the seasoned entrepreneur, to the big entity, the big enterprise; you can sell on there too. You will be seen by all acquirers in the world, and at a fraction of the cost and at multitudes of the speed, you will be able to liquidize an asset or your business. >> So Dave and I predict that there's going to be a lot of liquidity going on at many levels, obviously. Token economics drives that, but in the startup world, you either make it, sell it, or go out of business. In this world, as people start developing technology, the difference between a company and a feature might not be the same. So I might build the best app for social entrepreneurship, for solving world hunger or tracking the water supply through blockchain, and someone says, "Damn, I love that. I'm going to buy that." Now I've got to go to a banker, I've got to pay legal fees. My choices are... >> Yes, limited. >> Limited, hassle, costs cash. >> As a guy who just crashed a startup, let's take this example, because the majority do fail: over 90%, as a matter of fact 96%. Now, you just failed, but you have this great technology you created, right? What are you going to do? You check your address book: who might buy it? But that's limited too, because you just started out; you don't know anybody. So what you do is go to a consultancy, an M&A consultancy, the lawyers who are connected to this sphere. >> John: The gatekeepers. >> Yes, the gatekeepers. Big boys. >> John: And they take a big cut of it. >> They will tell you it's an amazing technology: "We'll help you sell it, but make a down payment of $10,000 right now and we'll look into it." But maybe you don't have $10,000 right now. And even if you pay them, the likelihood of them getting back to you is not that high. So what you do instead is go to Lexit, open up your account, get KYC'd (we're very strict on that; it's a fully legit platform), and you list it.
You get assessed by our professional M&A network, which assigns a value. It's not necessarily the value it will sell for, but it's the value the professional assessor thinks you're worth, and he's going to give the reasons why. Then the buyers see it and make their bids, and there you are. Access. >> So you guys automate that entire end-to-end process. >> Absolutely. Lexit runs from the listing down to the final due diligence and the drafting of agreements. Everything is in it, including the final signature and the transfer of ownership. It's a full solution. >> Yes. The future of work, obviously, is about automation. I mention cloud computing because we look at that market heavily. On the tech side, automation drives it, but automating processes away is threatening to a lot of people. You're potentially putting people out of business. >> Yes, I keep doing that. >> If you're successful, a cadre of ecosystem partners, the service providers, traditionally go out of business, so I like that. >> Potentially. >> Well, they're going to have to adapt or change. I see this with the global service integrators, like Accenture. Those guys are getting eaten up by machine learning and automated coding, because machines can do it faster and better. >> You know my business better than I do! (both laugh) >> What we do at theCUBE, we know our stuff. So this is disruptive, and at the end of the day the other thing I want to get your reaction to is open source. In the ethos of the mission-based open-source world, a lot of people say: I wrote my code as open source. If my company fails and my VCs made it proprietary, it's an owned asset in bankruptcy; you can't put it back. But with open-source code, there's always going to be value there at some level. It might not be great. So I might say, "Hey, you had a failed venture, I'll buy your code." >> Exactly. >> Transfer your GitHub over. Done. >> That's how it works. >> So this is kind of the dynamic. Do you see that? >> This is the direction we're heading. We want to connect the dots, because we started Lexit out of a community approach. We figured out two years ago, when we started (so we're two years into this right now), that this is direly needed. We didn't have access, and if I have this problem as a founder, so do many others. Out of this thinking, we claim to be the startup for startups, and we empower the community. I believe that in terms of leadership, you're only a good leader if you empower everyone around you to become a good leader, based on respect and mutual purpose. >> So I've got to ask you a question. >> Please. >> Because everyone's going to ask this question of all startups; you've got to know where you are. Are you a startup? Are you a growing company? Where's the product? How far along are you? When is it going to be released? Talk about the momentum of the offering that you have. Is it available in beta? What's the status of the product itself? Because I'm sure it'll be used a lot. >> As I said, we started two years ago. The first year we didn't even write a single line of code. It was just: how do you put this huge M&A process into a usable yet powerful and simple platform? How do you make it scale from the smallest asset you want to sell, a line of code, an algorithm, up to a large enterprise? The first year was about figuring out the process. What is necessary? How do we cover all aspects across different jurisdictions, and all of that?
How do we make it work on the legal side, too? We figured it out, and then we started building. And right now I can tell you we are scheduling the launch of Lexit this year, in June. >> So the product will be ready for production, a shipping product. >> Absolutely. Available worldwide, ready to operate: ready for you to make your deals, post your listings, make your bids, and get the best technology out there. And not just technology. This is M&A, so it means any kind of business: from a pizza chain, to a high-tech company, to a biotech company, to a food supplier, you can sell it. >> Usually when I do legal documents, you see exhibits in there: exhibit A is all the IP, or whatever the seller's selling and the buyer's buying. When you deal with decentralized asset creation and capture, with blockchain involved, how much is the tech involved in your process? The legal stuff I can see automating away; that's check one. But when you start dealing with assets that are either code or something durable, like property, that's maybe stored on a blockchain, how do you look at that? Is that part of the automation? Is that a factor? Where does it impact? Is it an exhibit? Do I just say, "Here's my key"? How do you deal with that? >> All right, let's put it this way. We do want to connect the existing M&A space to Lexit. The existing players are huge structures. We do want to disrupt them, that's true, but to do that you can't just create everything entirely new. You have to find a way for the big boys' old club, the big banks and all those folks, to participate, to give them a familiar way to work. What we did is design the token economics so that people get rewarded and people pay for services inside the platform, and everything is triggered with smart contracts: did the down payment happen, did the signature happen. The smart contracts automate the whole thing down to the final transaction. When the final transaction happens, our commission is paid out from the escrow we hold, the Lexit crypto escrow. Everything is transferred securely, and everybody involved in a deal knows exactly what's happening. >> John: And they have a shared incentive too. >> Absolutely. >> They're tokenizing the process, so there's a reward element, right? >> Yes. >> Am I getting this right? >> Yes. There are three parties in Lexit: the buyer, the seller and, obviously, the assessors, the professional M&A guys. They get rewarded in tokens, and handsomely, pretty much on the order of what they would bill at the Big Four, PwC and so on. There's a high incentive to do these assessments, and they get rewarded from the community pool, which is fed by all the listing fees, feature unlocks, and everything else happening within Lexit. The kicker is that we at Lexit believe in our token so much that the commission you pay us is between 8% and 2%: 2% at about 35 million dollars in volume, rising a bit as you go down to the smaller deals. And we take this commission only in our own token. I don't want dollars, not even Bitcoin. >> So you have your own token? >> Yes. >> Utility token or security token? >> It's strictly a utility token, and it's called LXT. >> LXT, great. And is it available now, or are you going to launch it in June? >> Right now we are in the private pre-sale, and it looks like we'll complete it within the pre-sale.
The pace at which LXT is selling out to the private backers right now is so high that we think it will be sold out two weeks from now; we're speaking in mid-March. >> What are the numbers? Hard cap, soft cap? Do you have the numbers? >> I told my team, "Listen, everybody tells me, 'You're doing M&A on blockchain, you can raise hundreds of millions,' and everybody would say that's okay. But we don't need that money." I just want to raise what we need to finalize the last mile of development and launch it this year. The hard cap is 10,000 ETH. 10K ETH only, roughly $9 million right now, and that's it. >> And you're going to reserve the other tokens for the community, to do the work and be part of this new future-of-work equation? >> 50% of the total supply, which is 18 million, goes to the sale, to the market. Just 10% to us founders. You don't need more. >> So you're not greedy? >> No. >> You guys are playing it right to create... >> I want the community to be empowered. >> You need the community. You need the community. >> I need the community. >> So that is the dynamic that everyone is agreeing on in the community, in the ecosystem here: if you have people bogarting or hoarding coins, or taking down big allocations, you miss the dynamic of the human capital, which is what the future of work is about. You are an example... >> Free promotion, you know what I mean? >> You're engaging. The future of work requires human capital. So if one institutional buyer buys the token out, there are no people. >> I interrupted you. You said, "We are an example for what"? >> The future of work. >> I love that. >> You are executing potentially disruptive M&A, but you're not going after the banks directly. They can play, too. >> Right. >> So you guys are a service. You're like an Amazon.com website cloud service for... >> You could say that. >> For M&A. Well, not just like it, but automated, and automating things away is the way to go. Do you see other examples like you, other use cases emerging? You're taking a known process, M&A, automating it away, and tokenizing it. What else out there is ripe for disruption? >> Setting what we do at Lexit aside for a moment, I think the supply chains of the world are ripe for disruption. I think they're inefficient. Even food production, down to the basic needs of a human being, is ripe for disruption. >> When I got my MBA back in the 90s, after I got my Computer Science degree in the 80s, the phrase that stuck in my head from the textbooks is the "value chain." >> Value chain. >> The value chain is a concept for any kind of value creation, and this notion of chaining, blockchain, shows up in anything that has a value-creation process. >> Let's take food production for a moment. Rice, okay? There's a farmer somewhere in Asia, or elsewhere, who produces and sells to somebody who picks it up and sells it to the next distributor, who sells it to an international distributor. The farmer sold it for maybe 20 cents a pound, tops? Probably just five. I don't know the exact prices. What happens if we could chain that supply, with a decentralized system where all these people feed directly into it and jump over those middlemen entirely? That's what I'm talking about.
It's going to disrupt everything. Somebody's going to figure that one out. >> So you guys have a good formula, just to recap. You're automating the M&A process, you're making a huge supply of tokens available to the community, which will help you change the game on M&A, which is also part of your value chain, now tokenized, and you're taking a small, tiered commission on the M&A transaction. >> It's like six times cheaper. >> Higher for the lower numbers, and as the deals get bigger, which is what you want more of, you take a smaller cut, so it's not greedy; you're not taking a grotesque... >> Nope. And we go even beyond that. Around Lexit we created a partner program. This program funnels deals directly onto Lexit, and we give the partners 50% of the commission. People tell me, "You're crazy." No, I'm not. You need to incentivize. If you get thousands of these partners one day, think of that; 50% is still a lot. I believe in sharing everything except my girlfriend. Everything else is fine, so we share. You can have my beer, that's fine. (both laugh) Speaking of which, I believe in this. >> Well, the network effect, too. Sharing is an ethos of distribution, so distribution is sharing. Sharing is also a social thing, but social gamification really is about distribution. You're essentially creating a network effect, and that is the fundamental pattern in token economics: the networks. >> Totally true. >> You see that? >> That's how it happens. >> What's your situation now? You've got a deal going on. Are you with Polymath? >> That's amazing... >> Talk about that. You're announcing it on stage in about an hour. >> It's true, yeah. The stage moment is about to come. Trevor and his team do a great job of onboarding startups with their tokens to become security tokens, and I believe there's a huge business there for them in the future. Now we want to work together with them, so we've partnered up, and one of the models is that we will help their clients liquidize those tokens over Lexit. We're still figuring out a few things, but we're very excited. >> John: And it's all API-based, I'm assuming, right? >> All API-based, and highly automated, of course. It's all about automation. You have to lower the cost to make it efficient and cheap for everybody involved, so you have to automate everything you can, and smart contracts are, per se, an automation tool. >> Well, Amir, good luck with your venture, Lexit. I love the idea, I love what you're doing. This is what we look for on theCUBE, this kind of innovation. We think it's awesome. Good luck to your team. The product is almost at launch. >> And the pre-sale is almost through, so if you want in, note that it closes in two weeks, and we distribute the token within four to six weeks. Everything is happening right now, and soon after that the exchanges are waiting, and you'll be surprised: they're going to be the good ones. >> This is the innovation theCUBE is covering: blockchain and cryptocurrency. We're at Polycon 18. Polymath are the folks putting on the event with Grit Capital, a Canadian contingency, but they know their cryptography; if you know Canada, you know what the deal is there. It's theCUBE covering it live. We'll be back with more live coverage after this short break. >> Thank you.
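The commission model Amir describes (a tiered rate that starts at 8% for small deals and falls to 2% at roughly $35 million in volume, with partners who source a deal receiving half the commission) can be sketched as a simple schedule. Only the endpoints come from the conversation; the intermediate tier boundaries below are invented for illustration.

# Hypothetical tier boundaries: the transcript only fixes the endpoints
# (8% at the low end, about 2% at roughly $35 million in volume), so the
# middle steps here are assumptions.
TIERS = [
    (1_000_000, 0.08),
    (5_000_000, 0.06),
    (15_000_000, 0.04),
    (35_000_000, 0.03),
    (float("inf"), 0.02),
]

def lexit_commission(deal_usd: float, via_partner: bool = False) -> float:
    """Commission on a deal, in USD terms (payable in LXT per the model)."""
    rate = next(r for ceiling, r in TIERS if deal_usd <= ceiling)
    commission = deal_usd * rate
    if via_partner:
        # One reading of the partner program: the sourcing partner receives
        # half the commission, so this returns the retained half.
        return commission * 0.5
    return commission

print(lexit_commission(40_000_000))   # 2% tier: 800000.0
print(lexit_commission(500_000))      # 8% tier: 40000.0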
(reverb-heavy electronic music) (moody ambient electronic music) >> Hi, I'm John Furrier, the co-founder of SiliconANGLE Media and co-host of theCUBE. I've been in the tech business since I was 19, first programming on minicomputers in a large enterprise, and then working at IBM and Hewlett Packard, a total of nine years in the enterprise, in various jobs from programming, training, and consulting to, ultimately, executive sales. I started my first company in 1997 and moved to Silicon Valley in 1999. I've been here ever since. I've always loved technology, and I love covering emerging technology. I was trained as a software developer, and I loved business, the impact of software technology on business. To me, creating technology that starts a company and creates value and jobs is probably one of the most rewarding things I've ever been involved in. And I bring that energy to theCUBE, because theCUBE is where all the ideas are and where the experts are, where the people are, and I think what's most exciting about theCUBE is that we get to talk to people who are making things happen: entrepreneurs, CEOs of companies, venture capitalists, people who are really, on a day-in and day-out basis, building great companies. In the technology business, there's just not a lot of real-time live TV coverage, and theCUBE is a non-linear TV operation. We do everything that the TV guys on cable don't do. We do longer interviews. We ask tougher questions. We sometimes ask lighter questions. We talk about the person and what they care about. It's not prompted and scripted; it's a conversation, and it's authentic. And for shows that have theCUBE coverage, it makes the show buzz, it creates excitement, and more importantly, it creates great content, great digital assets that can be shared instantaneously with the world. Over 31 million people have viewed theCUBE, and that is the result of great content and great conversations. I'm so proud to be part of theCUBE, a great team. Hi, I'm John Furrier. Thanks for watching theCUBE. (emotive electronic music) >> Narrator: Robert Herjavec!

Published Date : Mar 3 2018


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Amir Kaltak | PERSON | 0.99+
Trevor | PERSON | 0.99+
Dave | PERSON | 0.99+
John | PERSON | 0.99+
Asia | LOCATION | 0.99+
Amir | PERSON | 0.99+
John Furrier | PERSON | 0.99+
1997 | DATE | 0.99+
June | DATE | 0.99+
18 million | QUANTITY | 0.99+
$10,000 | QUANTITY | 0.99+
four | QUANTITY | 0.99+
1999 | DATE | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
nine years | QUANTITY | 0.99+
Silicon Valley | LOCATION | 0.99+
Bahamas | LOCATION | 0.99+
Hewlett Packard | ORGANIZATION | 0.99+
M&A | ORGANIZATION | 0.99+
10% | QUANTITY | 0.99+
Lexit | ORGANIZATION | 0.99+
six times | QUANTITY | 0.99+
IBM | ORGANIZATION | 0.99+
50% | QUANTITY | 0.99+
Robert Herjavec | PERSON | 0.99+
90s | DATE | 0.99+
five | QUANTITY | 0.99+
hundreds of millions | QUANTITY | 0.99+
six weeks | QUANTITY | 0.99+
80s | DATE | 0.99+
Lexit | PERSON | 0.99+
Grit Capital | ORGANIZATION | 0.99+
Amazon.com | ORGANIZATION | 0.99+
One | QUANTITY | 0.99+
this year | DATE | 0.99+
first company | QUANTITY | 0.99+
one | QUANTITY | 0.99+
mid-March | DATE | 0.99+
over 90% | QUANTITY | 0.99+
Accenture | ORGANIZATION | 0.99+
two weeks | QUANTITY | 0.99+
both | QUANTITY | 0.99+
8 | QUANTITY | 0.98+
Canada | LOCATION | 0.98+
zero | QUANTITY | 0.98+
first | QUANTITY | 0.98+
$9 million | QUANTITY | 0.97+
Polymath | ORGANIZATION | 0.97+
two years ago | DATE | 0.97+
two years | QUANTITY | 0.97+
about 35 million dollars | QUANTITY | 0.97+
three parties | QUANTITY | 0.97+
10,000 ETH | QUANTITY | 0.96+
GitHub | ORGANIZATION | 0.96+
Over 31 million people | QUANTITY | 0.96+
Astro | ORGANIZATION | 0.95+
PWC | ORGANIZATION | 0.95+
2018 | DATE | 0.95+
10K ETH | QUANTITY | 0.95+
M&A | TITLE | 0.95+
single line | QUANTITY | 0.94+
19 | QUANTITY | 0.92+
first year | QUANTITY | 0.92+
theCUBE | ORGANIZATION | 0.91+
Big Four | EVENT | 0.9+
96% | QUANTITY | 0.89+
KYC | ORGANIZATION | 0.89+
Polycon 18 | TITLE | 0.88+
one day | QUANTITY | 0.86+

Bruno Aziza & Josh Klahr, AtScale - Big Data SV 17 - #BigDataSV - #theCUBE


 

>> Announcer: Live from San Jose, California, it's theCUBE, covering Big Data Silicon Valley 2017. (electronic music) >> Okay, welcome back everyone. We're live in Silicon Valley for theCUBE's big coverage. I'm John Furrier, with Wikibon analyst George Gilbert, and with Bruno Aziza, CMO of AtScale and a Cube alumnus, and Josh Klahr, VP of Product at AtScale. Welcome to theCUBE. >> Welcome back. >> Thank you. >> Thanks, Brian. >> Bruno, great to see you. You look great, you're smiling as always. Business is good? >> Business is great. >> Give us the update on AtScale. What's up since we last saw you in New York? >> Well, thanks for having us, first of all. And yeah, business is great. I think last time I was here on theCUBE we talked about the Hadoop Maturity Survey, and at the time we'd just launched the company. Now you look about a year out and we've grown about 10x. We have large enterprises across just about any vertical you can think of: financial services, with American Express; healthcare, think about Aetna, Cigna, GSK; retail, with Home Depot, Macy's, and so forth. We've also done a lot of work with our partner ecosystem, where partners OEM AtScale technology, which is a great way for us to get AtScale across the US but also internationally. And our customers are getting recognized for the work they're doing with AtScale. Last year, for instance, Yellow Pages got recognized by Cloudera with a leadership award, and Macy's got a leadership award as well. So things are going in the right trajectory, and I think we're also benefiting from the fact that the industry is changing: it's maturing on the big data side, but there's also a redefinition of what business intelligence means, this idea that you can have analytics on large-scale data without having to change your visualization tools, and make that work with the existing stack you have in place. I think that's been helping us grow. >> How did you guys do it? I mean, we've talked many times; there's some secret sauce there. But at the time you guys were first starting, it was kind of a crowded field, right? >> Bruno: Yeah. >> And all these BI tools were out there; you had front-end BI tools... >> Bruno: Yep. >> But everyone was still separate from the whole batch back end. So what did you guys do to break out? >> So, there are two key differentiators with AtScale. The first one is that we are the only platform that does not have a visualization tool. People think that's a bug; it's actually a feature. Most enterprises already have investments in traditional BI tools, so our ability to talk to MDX and SQL types of BI tools without any changes is a big differentiator. The other piece of our technology is the idea that you can get speed, scale, and security on large datasets without having to move the data. That's a big differentiator for enterprises trying to get value out of the data they already have in Hadoop, as well as the non-Hadoop systems we also cover. >> Josh, you're the VP of Product; you have the roadmaps. Give us a peek into what's happening with the current product. Where are the work areas? Where are you guys going? What's the to-do list, what's the checklist, and what's the innovation coming around the corner? >> Yeah, to follow up on what Bruno said about how we hit the sweet spot.
I think we made a strategic choice, which is that we don't want to be in the business of trying to be Tableau or Excel, or be a better front end. And there's so much diversity on the back end: if you look at the ecosystem right now, whether it's Spark SQL, or Hive, or Presto, or even new cloud-based systems, the sweet spot is really how you fit into those ecosystems and support the right level of BI on top of those platforms. So what we're looking at from a roadmap perspective is how we expand and support the back-end data platforms customers are asking about. I think we saw a big white space in BI on Hadoop in particular, and I'd say we've nailed that over the past year and a half. But we now see customers asking us about Google BigQuery and about Athena. These serverless data platforms are really compelling. They're going to take a while to get adoption, so that's a big investment area for us. And then, in terms of supporting BI front ends, we're doubling down on making sure our Tableau integration is great, and Power BI is, I think, getting really big traction. >> Well, two great products; you've got Microsoft and Tableau, the leaders in that area. >> The self-service BI revolution, I would say, has won, and the business user wants their tool of choice. Where we come in is with the folks responsible for the data platforms on the back end: they want some level of control and consistency, so they're trying to figure out, where do you draw the line? Where do you provide standards? Where do you provide governance, and where do you let the business run loose? >> All right, Bruno and Josh, I want you both to answer this question; it'll be a good quiz. Define next-generation BI platforms from a functional standpoint, and then under the hood. >> Yeah, there are a few things you can look at. If you were at the Gartner BI conference last week, you saw there were 24 vendors in the Magic Quadrant, and I think in general people are now realizing that this space is extremely crowded, and it's also sitting on technology that was built 20 years ago. Now, when you talk to enterprises like the ones we work with, the ones I named earlier, you realize they all have multiple BI tools. So the visualization war, if you will, has largely been fought and won by Microsoft and Tableau at this point, and the average enterprise has 15 different BI tools. Clearly, if you're trying to innovate on the visualization side, you're going to have a very hard time. So you're dealing with that level of complexity, and from the back-end standpoint, you're now having to deal with databases from the past (the Teradatas of this world), data sources from today (Hadoop), and data sources from the future, like Google BigQuery. So I think the CIO's answer to "what is the next-gen BI platform I want" is something that enables them to simplify this very complex world: I have lots of BI tools and lots of data; how can I standardize in the middle in order to provide security, scale, and speed to my business users? That's really going to change the space radically, I think. If you're trying to sell a full stack that's integrated from the bottom all the way up to visualization, I don't think that's what enterprises want anymore. >> Josh, under the hood: what's the next-generation key leverage for the tech, the enabler?
>> Yeah. For me, the end state for the next-generation BI platform is that a user can log in, point to their data wherever that data is (on-prem, in the cloud, in a relational database, or a flat file), and design their business model. We spend a lot of time making sure we can support the creation of business models: what are the key metrics, what are the hierarchies, what are the measures. It may sound like I'm talking about OLAP; that's what our history is steeped in. >> Well, faster data is coming; streaming and data are coming together. >> So I should be able to just point at those datasets and turn around and analyze them immediately. On the back end, that means we need pretty robust modeling capabilities, so you can define complex metrics and functionally do traditional business analytics: period-over-period comparisons, rolling averages, navigating up and down business hierarchies. The optimizations should be built in. It shouldn't be the designer's responsibility to figure out whether they need to create indexes, aggregates, or summarizations; that should all be handled automatically. And you shouldn't have to think about data movement. That's really what we've built from an AtScale perspective on the back end: point at data, we're smart about creating optimal data structures so you get fast performance, and then you connect whatever BI tool you want. You can connect Excel (we can speak the MDX query language), we can speak SQL, we can speak DAX, whatever language you want to talk. >> So, take the syntax out of the hands of the user. >> Yeah. >> Yeah. >> And the getting-in-the-weeds stuff. Make it easier for them. >> Exactly. >> And the key word, I think, for the future of BI is "open," right? We've been buying tools over the last... >> What do you mean by that? Explain. >> Open means that you can choose whatever BI tool you want and whatever data you want, and as a business user there's no real compromise. But just because you're getting an open platform doesn't mean you have to trade off on complexity. The kind of analysis Josh was talking about, period analysis, the type of multidimensional analysis that you need, calendar analysis, historical data, that's still going to be needed. But you're going to need to provide it in a world where the business user and the IT organization expect the tools they buy to be open to the rest of the ecosystem, and that's new, I think. >> George, you want to get a question in, edgewise? Come on. (group laughs) >> You know, I've been sort of a single-issue candidate, I guess, this week, on machine learning and how it's touching all the different sectors. I'm wondering, how do you see yourselves as part of a broader pipeline of different users adding different types of value to data? >> On the machine learning topic, there are a few different ways to look at it. The first is that we use machine learning in our own product. I talked about this concept of auto-optimization: one of the things AtScale does is look at end-user query patterns, and we analyze those patterns to figure out how we can be smart about anticipating the next thing users are going to ask, so we can pre-index or pre-materialize that data. So there's machine learning in the context of making AtScale a better product.
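The auto-optimization idea Josh mentions, watching end-user query patterns to decide what to pre-materialize, can be illustrated with a toy sketch: count how often each query grain shows up, and flag the frequent ones as candidates for aggregate tables. This is not AtScale's actual algorithm, just a minimal illustration of the idea, and all names are invented.

from collections import Counter

# Toy query log: each entry is the (dimensions, measure) grain a BI front
# end requested.
query_log = [
    (("date", "region"), "sales_total"),
    (("date", "region"), "sales_total"),
    (("date", "product"), "units"),
    (("date", "region"), "sales_total"),
]

def recommend_aggregates(log, min_hits=2):
    """Return grains seen at least min_hits times: candidates worth
    pre-materializing before the next query arrives."""
    return [grain for grain, n in Counter(log).items() if n >= min_hits]

print(recommend_aggregates(query_log))
# [(('date', 'region'), 'sales_total')] -> build a summary table at this grain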
>> Reusing things that are already done; that's been the whole machine learning... >> Demos. We saw Google Next with the video editing and the video recognition stuff; that's been... >> Exactly. >> A huge part of it. >> You've got users giving you signals; take that information and be smart with it. Then, in terms of the customer workflow: Comcast, for example, a customer of ours, is in a data discovery phase. There's a data science group that looks at all of their set-top box data, and they're trying to discover programming patterns: who watches the Yankees' network, for example. Where they use AtScale is in what I would call a descriptive role, figuring out the key measures and trends and which attributes contribute to them. Then they go in and use machine learning tools on top of that same dataset to come up with predictive algorithms. >> So, just to be clear there: they're hypothesizing about, say, either the pattern of users that might have an affinity for a certain channel or channels, or they're looking for pathways. >> Yes. And I'd say our role in that right now is descriptive; we're supporting the descriptive element of that analytics lifecycle. I think over time our customers will push us to build in more of our own capabilities: okay, I discovered something descriptive; can you come up with a model that helps me predict it the next time around? Honestly, right now people want BI, very traditional BI, on the next-generation data platform. >> Continuing on that theme, and leaving machine learning aside: as I understand it, the old-school vendors like Teradata, when they wanted to support data scientists, grafted some machine learning onto the core Teradata engine as a parallel capability. They also bought Aster Data, which was for a different audience. So my question is: will we see from you, ultimately, a separate product line to support a new class of users, or are you thinking about new functionality that gets integrated into the core product? >> I think it's more of the latter. The way we view it, and this is really looking at what people are asking for today, which is kind of basic, traditional BI, is that what we're building is essentially a business model. When someone uses AtScale, they're designing and asserting: these are the things I'm interested in measuring, and these are the attributes I think might contribute to them. That puts us in a pretty good position to start using our knowledge of that business model, whether it's with Spark on the back end or built-in machine learning algorithms on the Hadoop cluster, to help make predictions on behalf of the customer. >> A follow-up, and this leaves out the machine learning part: in terms of big data, we first used it for archiving, supporting more data retention than we could afford with the data warehouse. Then we did the ETL offload; now we're doing more and more of the visualization, the ad hoc stuff. >> That's exactly right. >> So, in a couple of years' time, what remains in the classic data warehouse, and what's in the Hadoop category?
>> Well, I think what you're describing is the pure evolution of any technology, where you start with the infrastructure. We've been in this for over ten years now; you've got Cloudera, they're going IPO and then moving into the data science workbench. >> That's not official yet. >> I think we read about this, or at least they filed. But I think the direction shows that people are relying on the platform, the Hadoop platform, to build applications on top of it. And, just as Josh is saying, the mainstream application on top of the database, and I think this is true for non-Hadoop systems as well, is always going to be analytics. Of course, data science provides a lot of value, but it typically provides a lot of value to a few people, who then scale it out to the rest of their organization. If you project out to what this means for the CIO and their environment, I don't think any of these platforms, Teradata or Hadoop, or Google, or Amazon, or any of the others, is a 100% replacement. And that's where it becomes interesting, because you're now dealing with a heterogeneous environment, where the business user up top is using Excel, using their standard BI application, maybe using the results of machine learning models, while also dealing with heterogeneity at the data level: Hadoop on-prem, Hadoop in the cloud, non-Hadoop in the cloud, and non-Hadoop on-prem. And of course that's a market that's very interesting for us, as a simplification platform for that world. >> I think you guys are really thinking about it in a new way, and I think that's a great, modern approach: let the freedom... and by the way, a quick question on the Microsoft tool and Tableau: what percentage share do you think they have of the market? Fifty? Because you mentioned those are the top two. >> Are they? >> Yeah, I mentioned them, because if you look at the Magic Quadrant, clearly Microsoft Power BI and Tableau have really shot up all the way to the right. >> Because they're easy to use, and it's easy to work with data. >> I think so. Look, from a functionality standpoint, Tableau has done a very good job on the visualization side. From a business standpoint and business-model execution, and I can speak from my days at Microsoft, it's a great distribution model to get thousands and thousands of users onto Power BI. Now, the players who weren't on the last Magic Quadrant, like Google Data Studio or Amazon QuickSight, I think will change the ecosystem as well, which, again, is great news for AtScale. >> More muscle coming in. >> That's right. >> For you guys, more rising tide floating all boats. >> That's right. >> So you guys are powering it. >> That's right. >> Modern BI, would that be safe to say? >> That's the idea. The idea is that the visualization layer is basically commoditized at this point. What business users and enterprise leaders want is the ability to give their business users freedom and openness without ever compromising on security, speed, or the complexity of the models, which is the business we're in. >> Get people working, get people productive faster. >> In whatever tool they want. >> All right, Bruno, thanks so much. Thanks for coming on. AtScale, modern BI, here on theCUBE, breaking it down.
This is theCUBE, covering Big Data SV and Strata Hadoop. Back with more coverage after this short break. (electronic music)
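Josh's description of the end state (define measures and hierarchies once in a business model, then let any front end, whether Excel over MDX, Tableau over SQL, or Power BI over DAX, query against it) can be sketched as a small data structure. This is a toy illustration of a shared semantic layer, not AtScale's actual design; all names and expressions are invented.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Measure:
    name: str
    expression: str            # e.g. "SUM(sales_amount)"

@dataclass
class Hierarchy:
    name: str
    levels: List[str]          # e.g. ["year", "quarter", "month"]

@dataclass
class BusinessModel:
    name: str
    measures: List[Measure] = field(default_factory=list)
    hierarchies: List[Hierarchy] = field(default_factory=list)

model = BusinessModel(
    name="retail_sales",
    measures=[
        Measure("total_sales", "SUM(sales_amount)"),
        Measure("total_sales_prior_year", "SUM(sales_amount) shifted one year"),
    ],
    hierarchies=[
        Hierarchy("calendar", ["year", "quarter", "month", "day"]),
        Hierarchy("geography", ["country", "region", "city"]),
    ],
)

# Any front end would query against these shared definitions rather than
# re-deriving metrics per tool.
print([m.name for m in model.measures])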

Published Date : Mar 15 2017


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
George Gilbert | PERSON | 0.99+
Bruno | PERSON | 0.99+
Bruno Aziza | PERSON | 0.99+
George | PERSON | 0.99+
Comcast | ORGANIZATION | 0.99+
ETNA | ORGANIZATION | 0.99+
Brian | PERSON | 0.99+
John Furrier | PERSON | 0.99+
New York | LOCATION | 0.99+
Josh Klahr | PERSON | 0.99+
SIGNA | ORGANIZATION | 0.99+
GSK | ORGANIZATION | 0.99+
Josh | PERSON | 0.99+
Home Depot | ORGANIZATION | 0.99+
24 vendors | QUANTITY | 0.99+
Microsoft | ORGANIZATION | 0.99+
Yankees' | ORGANIZATION | 0.99+
thousands | QUANTITY | 0.99+
US | LOCATION | 0.99+
Excel | TITLE | 0.99+
last year | DATE | 0.99+
Amazon | ORGANIZATION | 0.99+
100% | QUANTITY | 0.99+
San Jose, California | LOCATION | 0.99+
last week | DATE | 0.99+
Silicon Valley | LOCATION | 0.99+
AtScale | ORGANIZATION | 0.99+
American Express | ORGANIZATION | 0.99+
first one | QUANTITY | 0.99+
first | QUANTITY | 0.99+
20 years ago | DATE | 0.99+
50 | QUANTITY | 0.98+
2017 | DATE | 0.98+
Tableau | TITLE | 0.98+
Macy's | ORGANIZATION | 0.98+
One | QUANTITY | 0.98+
Mork | ORGANIZATION | 0.98+
power BI | TITLE | 0.98+
Ecosystem | ORGANIZATION | 0.98+
Sequel | PERSON | 0.97+
Google | ORGANIZATION | 0.97+
this week | DATE | 0.97+
Power BI | TITLE | 0.97+
Cloudera | ORGANIZATION | 0.96+
15 different BI tools | QUANTITY | 0.95+
past year and a half | DATE | 0.95+
over ten years | QUANTITY | 0.95+
today | DATE | 0.95+
Tableu | TITLE | 0.94+
Tableau | ORGANIZATION | 0.94+
SQL | TITLE | 0.93+
Astro Data | ORGANIZATION | 0.93+
Cube | ORGANIZATION | 0.92+
Wikibon | ORGANIZATION | 0.92+
two key differentiators | QUANTITY | 0.92+
AtScale | TITLE | 0.91+
Care Data | ORGANIZATION | 0.9+
about 10x | QUANTITY | 0.9+
Spark Sequel | TITLE | 0.89+
two top ones | QUANTITY | 0.89+
Hadoop | TITLE | 0.88+
Athena | ORGANIZATION | 0.87+
two great products | QUANTITY | 0.87+
Big Query | TITLE | 0.86+
The Cube | ORGANIZATION | 0.85+
Big Data | ORGANIZATION | 0.85+