AI Meets the Supercloud | Supercloud2


 

(upbeat music) >> Okay, welcome back, everyone, to the Supercloud 2 event, live here in Palo Alto, theCUBE Studios live stage performance, virtually syndicating it all over the world. I'm John Furrier with Dave Vellante here as CUBE alumni, and special influencer guest, Howie Xu, VP of Machine Learning at Zscaler, also part-time as a CUBE analyst 'cause he is that good. Comes on all the time. You're basically a CUBE analyst as well. Thanks for coming on. >> Thanks for inviting me. >> John: Technically, you're not really a CUBE analyst, but you're kind of like a CUBE analyst. >> Happy New Year to everyone. >> Dave: Great to see you. >> Great to see you, Dave and John. >> John: We've been talking about ChatGPT online. You wrote a great post about it being more like Amazon, not like Google. >> Howie: More than just Google Search. >> More than Google Search. Oh, it's going to compete with Google Search, which it kind of does a little bit, but more its infrastructure. So a clever point, good segue into this conversation, because this is kind of the beginning of these kinds of next gen things we're going to see. Things where it's like an obvious next gen, it's getting real. Kind of like seeing the browser for the first time, the Mosaic browser. Whoa, this internet thing's real. I think this is that moment, and Supercloud-like enablement is coming. So this has been a big part of the Supercloud kind of theme. >> Yeah, you talk about Supercloud, you talk about, you know, AI, ChatGPT. I really think ChatGPT is really another Netscape moment, the browser moment. Because if you think about internet technology, right? It was brewing for 20 years before the early 90s. Not until you had a, you know, browser did people realize, "Wow, look how wonderful this technology could be." Right? You know, all the wonderful things. Then you have Yahoo and Amazon. I think we have been brewing, you know, the AI technology for, you know, quite some time. Even then, you know, neural networks, deep learning. But not until ChatGPT came along did people realize, "Wow, you know, the user interface, user experience could be that great," right? So I really think, you know, if you look at the last 30 years, there is a browser moment, there is an iPhone moment. I think the ChatGPT moment is as big as those. >> Dave: What do you see as the intersection of things like ChatGPT and the Supercloud? Of course, the media's going to focus, journalists are going to focus on all the negatives and the privacy. Okay. You know we're going to get by that, right? Always do. Where do you see the Supercloud and sort of the distributed data fitting in with ChatGPT? Does it use that as a data source? What's the link? >> Howie: I think there are a number of use cases. One of the use cases, we talked about why we even have Supercloud because of the complexity, because of the, you know, heterogeneous nature of different clouds. In order for me as a developer, in order for me to create applications, I have so many things to worry about, right? It's a complexity. But with ChatGPT, with the AI, I don't have to worry about it, right? Those kinds of details will be taken care of by, you know, the underlying layer. So we have been talking about on this show, you know, over the last, what, year or so about the Supercloud, hey, defining that, you know, API layer spanning across, you know, multiple clouds. I think that will be happening. However, for a lot of the things, that will be more hidden, right? A lot of that will be automated by the bots. 
You know, we were just talking about it right before the show. One of the profound statement I heard from Adrian Cockcroft about 10 years ago was, "Hey Howie, you know, at Netflix, right? You know, IT is just one API call away." That's a profound statement I heard about a decade ago. I think next decade, right? You know, the IT is just one English language away, right? So when it's one English language away, it's no longer as important, API this, API that. You still need API just like hardware, right? You still need all of those things. That's going to be more hidden. The high level thing will be more, you know, English language or the language, right? Any language for that matter. >> Dave: And so through language, you'll tap services that live across the Supercloud, is what you're saying? >> Howie: You just tell what you want, what you desire, right? You know, the bots will help you to figure out where the complexity is, right? You know, like you said, a lot of criticism about, "Hey, ChatGPT doesn't do this, doesn't do that." But if you think about how to break things down, right? For instance, right, you know, ChatGPT doesn't have Microsoft stock price today, obviously, right? However, you can ask ChatGPT to write a program for you, retrieve the Microsoft stock price, (laughs) and then just run it, right? >> Dave: Yeah. >> So the thing to think about- >> John: It's only going to get better. It's only going to get better. >> The thing people kind of unfairly criticize ChatGPT is it doesn't do this. But can you not break down humans' task into smaller things and get complex things to be done by the ChatGPT? I think we are there already, you know- >> John: That to me is the real game changer. That's the assembly of atomic elements at the top of the stack, whether the interface is voice or some programmatic gesture based thing, you know, wave your hand or- >> Howie: One of the analogy I used in my blog was, you know, each person, each professional now is a quarterback. And we suddenly have, you know, a lot more linebacks or you know, any backs to work for you, right? For free even, right? You know, and then that's sort of, you should think about it. You are the quarterback of your day-to-day job, right? Your job is not to do everything manually yourself. >> Dave: You call the play- >> Yes. >> Dave: And they execute. Do your job. >> Yes, exactly. >> Yeah, all the players are there. All the elves are in the North Pole making the toys, Dave, as we say. But this is the thing, I want to get your point. This change is going to require a new kind of infrastructure software relationship, a new kind of operating runtime, a new kind of assembler, a new kind of loader link things. This very operating systems kind of concepts. >> Data intensive, right? How to process the data, how to, you know, process so gigantic data in parallel, right? That's actually a tough job, right? So if you think about ChatGPT, why OpenAI is ahead of the game, right? You know, Google may not want to acknowledge it, right? It's not necessarily they do, you know, not have enough data scientist, but the software engineering pieces, you know, behind it, right? To train the model, to actually do all those things in parallel, to do all those things in a cost effective way. So I think, you know, a lot of those still- >> Let me ask you a question. Let me ask you a question because we've had this conversation privately, but I want to do it while we're on stage here. 
Where are all the alpha geeks and developers and creators and entrepreneurs going to gravitate to? You know, in every wave, you see it in crypto, all the alphas went into crypto. Now I think with ChatGPT, you're going to start to see, like, "Wow, it's that moment." A lot of people are going to, you know, scrum and do startups. CTOs will invent stuff. There's a lot of invention, a lot of computer science and customer requirements to figure out. That's new. Where are the alpha entrepreneurs going to go to? What do you think they're going to gravitate to? If you could point to the next layer to enable this super environment, super app environment, Supercloud. 'Cause there's a lot to do to enable what you just said. >> Howie: Right. You know, if you think about using internet as the analogy, right? You know, in the early 90s, internet came along, browser came along. You had two kind of companies, right? One is Amazon, the other one is walmart.com. And then there were company, like maybe GE or whatnot, right? Really didn't take advantage of internet that much. I think, you know, for entrepreneurs, suddenly created the Yahoo, Amazon of the ChatGPT native era. That's what we should be all excited about. But for most of the Fortune 500 companies, your job is to surviving sort of the big revolution. So you at least need to do your walmart.com sooner than later, right? (laughs) So not be like GE, right? You know, hand waving, hey, I do a lot of the internet, but you know, when you look back last 20, 30 years, what did they do much with leveraging the- >> So you think they're going to jump in, they're going to build service companies or SaaS tech companies or Supercloud companies? >> Howie: Okay, so there are two type of opportunities from that perspective. One is, you know, the OpenAI ish kind of the companies, I think the OpenAI, the game is still open, right? You know, it's really Close AI today. (laughs) >> John: There's room for competition, you mean? >> There's room for competition, right. You know, you can still spend you know, 50, $100 million to build something interesting. You know, there are company like Cohere and so on and so on. There are a bunch of companies, I think there is that. And then there are companies who's going to leverage those sort of the new AI primitives. I think, you know, we have been talking about AI forever, but finally, finally, it's no longer just good, but also super useful. I think, you know, the time is now. >> John: And if you have the cloud behind you, what do you make the Amazon do differently? 'Cause Amazon Web Services is only going to grow with this. It's not going to get smaller. There's more horsepower to handle, there's more needs. >> Howie: Well, Microsoft already showed what's the future, right? You know, you know, yes, there is a kind of the container, you know, the serverless that will continue to grow. But the future is really not about- >> John: Microsoft's shown the future? >> Well, showing that, you know, working with OpenAI, right? >> Oh okay. >> They already said that, you know, we are going to have ChatGPT service. >> $10 billion, I think they're putting it. >> $10 billion putting, and also open up the Open API services, right? You know, I actually made a prediction that Microsoft future hinges on OpenAI. I think, you know- >> John: They believe that $10 billion bet. >> Dave: Yeah. $10 billion bet. So I want to ask you a question. It's somewhat academic, but it's relevant. 
For a number of years, it looked like having first mover advantage wasn't an advantage. PCs, spreadsheets, the browser, right? Social media, Friendster, right? Mobile. Apple wasn't first to mobile. But that's somewhat changed. The cloud, AWS was first. You could debate whether or not, but AWS okay, they have first mover advantage. Crypto, Bitcoin, first mover advantage. Do you think OpenAI will have first mover advantage? >> It certainly has its advantage today. I think it's year two. I mean, I think the game is still out there, right? You know, we're still in the first inning, early inning of the game. So I don't think that the game is over for the rest of the players, whether the big players or the OpenAI kind of the, sort of competitors. So one of the VCs actually asked me the other day, right? "Hey, how much money do I need to spend, invest, to get, you know, another shot to the OpenAI sort of the level?" You know, I did a- (laughs) >> Line up. >> That's classic VC. "How much does it cost me to replicate?" >> I'm pretty sure he asked the question to a bunch of guys, right? >> Good luck with that. (laughs) >> So we kind of did some napkin- >> What'd you come up with? (laughs) >> $100 million is the order of magnitude that I came up with, right? You know, not a billion, not 10 million, right? So 100 million. >> John: Hundreds of millions. >> Yeah, yeah, yeah. 100 million order of magnitude is what I came up with. You know, we can get into details, you know, in other sort of the time, but- >> Dave: That's actually not that much if you think about it. >> Howie: Exactly. So when he heard me articulating why is that, you know, he's thinking, right? You know, he actually, you know, asked me, "Hey, you know, there's this company. Do you happen to know this company? Can I reach out?" You know, those things. So I truly believe it's not a billion or 10 billion issue, it's more like 100. >> John: And also, your other point about referencing the internet revolution as a good comparable. The other thing there is online user population was a big driver of the growth of that. So what's the equivalent here for online user population for AI? Is it more apps, more users? I mean, we're still early on, it's first inning. >> Yeah. We're kind of the, you know- >> What's the key metric for success of this sector? Do you have a read on that? >> I think the, you know, the number of users is a good metrics, but I think it's going to be a lot of people are going to use AI services without even knowing they're using it, right? You know, I think a lot of the applications are being already built on top of OpenAI, and then they are kind of, you know, help people to do marketing, legal documents, you know, so they're already inherently OpenAI kind of the users already. So I think yeah. >> Well, Howie, we've got to wrap, but I really appreciate you coming on. I want to give you a last minute to wrap up here. In your experience, and you've seen many waves of innovation. You've even had your hands in a lot of the big waves past three inflection points. And obviously, machine learning you're doing now, you're deep end. Why is this Supercloud movement, this wave of Supercloud and the discussion of this next inflection point, why is it so important? For the folks watching, why should they be paying attention to this particular moment in time? Could you share your super clip on Supercloud? >> Howie: Right. So this is simple from my point of view. So why do you even have cloud to begin with, right? 
IT is too complex, too complex to operate, or too expensive. So there's a newer model. There is a better model, right? Let someone else operate it, there is elasticity out of it, right? That's great. Until you have multiple vendors, right? Many vendors even, you know, we're talking about kind of how to make multiple vendors look like the same, but frankly speaking, even one vendor has, you know, a thousand services. Now it's kind of getting, what Kit was talking about, what, cloud chaos, right? It's the evolution. You know, the history repeats itself, right? You know, you have, you know, the next great things and then too many great things, and then people need to sort of abstract this out. So it's almost that you must do this. But I think how to abstract this out is something that, at this time, AI is going to help a lot, right? You know, like I mentioned, right? A lot of the abstraction, you don't have to think about API anymore. I bet 10 years from now, you know, IT is one language away, not API away. So think about that world, right? So Supercloud, in my opinion, sure, you kind of abstract things out. You have, you know, consistent layers. But who's going to do that? Is that like we all agreed upon the model, agreed upon those APIs? Not necessarily. There are certain, you know, truths in that, but there are other truths, let bots take care of, right? Whether, you know, I want some X to happen, whether it's going to be done by Azure, by AWS, by GCP, bots will figure out at a given time, with certain context, with your security requirement, posture requirement. They'll think that out. >> John: That's awesome. And you know, Dave, you and I have been talking about this. We think scale is the new ratification. If you have first mover advantage, you'll see the benefit, but scale is a huge thing. OpenAI, AWS. >> Howie: Yeah. Every day, we are using OpenAI. Today, we are labeling data for them. So you know, that's a little bit of the- (laughs) >> John: Yeah. >> First mover advantage that other people don't have, right? So it's kind of scary. So I'm very sure that Google is a little bit- (laughs) >> When we do our super AI event, you're definitely going to be keynoting. (laughs) >> Howie: I think, you know, we're talking about Supercloud, you know, before long, we are going to talk about super intelligent cloud. (laughs) >> I'm super excited, Howie, about this. Thanks for coming on. Great to see you, Howie Xu. Always a great analyst for us contributing to the community. VP of Machine Learning at Zscaler, industry legend and friend of theCUBE. Thanks for coming on and sharing really, really great advice and insight into what this next wave means. This Supercloud is the next wave. "If you're not on it, you're driftwood," says Pat Gelsinger. So you're going to see a lot more discussion. We'll be back with more here live in Palo Alto after this short break. >> Thank you. (upbeat music)
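(Editor's aside on Howie's stock-price example earlier in this segment: a small generated helper of the kind he describes, one that ChatGPT could write and you could simply run, might look roughly like the sketch below. It assumes the third-party yfinance package; the ticker symbol and output format are illustrative only, not anything shown on the program.)

```python
# Hypothetical sketch of the helper Howie describes asking ChatGPT to write:
# fetch the latest Microsoft closing price and print it.
# Assumes the third-party yfinance package (pip install yfinance).
import yfinance as yf

def latest_close(symbol: str = "MSFT") -> float:
    # history() returns a pandas DataFrame of recent bars; take the last close.
    bars = yf.Ticker(symbol).history(period="1d")
    return float(bars["Close"].iloc[-1])

if __name__ == "__main__":
    print(f"MSFT last close: {latest_close():.2f}")
```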

Published Date : Feb 17 2023



Angelo Fausti & Caleb Maclachlan | The Future is Built on InfluxDB


 

>> Okay. We're now going to go into the customer panel, and we'd like to welcome Angelo Fausti, who's a software engineer at the Vera C. Rubin Observatory, and Caleb Maclachlan who's senior spacecraft operations software engineer at Loft Orbital. Guys, thanks for joining us. You don't want to miss folks this interview. Caleb, let's start with you. You work for an extremely cool company, you're launching satellites into space. Of course doing that is highly complex and not a cheap endeavor. Tell us about Loft Orbital and what you guys do to attack that problem. >> Yeah, absolutely. And thanks for having me here by the way. So Loft Orbital is a company that's a series B startup now, who, and our mission basically is to provide rapid access to space for all kinds of customers. Historically, if you want to fly something in space, do something in space, it's extremely expensive. You need to book a launch, build a bus, hire a team to operate it, have a big software teams, and then eventually worry about, a bunch like, just a lot of very specialized engineering. And what we're trying to do is change that from a super specialized problem that has an extremely high barrier of access, to a infrastructure problem. So that it's almost as simple as deploying a VM in AWS or GCP is getting your programs, your mission deployed on orbit with access to different sensors, cameras, radios, stuff like that. So, that's kind of our mission and just to give a really brief example of the kind of customer that we can serve. There's a really cool company called Totum Labs, who is working on building IoT cons, an IoT constellation for, internet of things, basically being able to get telemetry from all over the world. They're the first company to demonstrate indoor IoT which means you have this little modem inside a container that container that you track from anywhere in the world as it's going across the ocean. So, and it's really little, and they've been able to stay a small startup that's focused on their product, which is the, that super crazy, complicated, cool radio, while we handle the whole space segment for them, which just, you know, before Loft was really impossible. So that's our mission is providing space infrastructure as a service. We are kind of groundbreaking in this area and we're serving a huge variety of customers with all kinds of different missions, and obviously generating a ton of data in space that we've got to handle. >> Yeah. So amazing Caleb, what you guys do. Now, I know you were lured to the skies very early in your career, but how did you kind of land in this business? >> Yeah, so, I guess just a little bit about me. For some people, they don't necessarily know what they want to do like earlier in their life. For me I was five years old and I knew I want to be in the space industry. So, I started in the Air Force, but have stayed in the space industry my whole career and been a part of, this is the fifth space startup that I've been a part of actually. So, I've kind of started out in satellites, spent some time in working in the launch industry on rockets, then, now I'm here back in satellites and honestly, this is the most exciting of the different space startups that I've been a part of. >> Super interesting. Okay. Angelo, let's talk about the Rubin Observatory. Vera C. Rubin, famous woman scientist, galaxy guru. Now you guys, the Observatory, you're up way up high, you get a good look at the Southern sky. 
And I know COVID slowed you guys down a bit, but no doubt you continued to code away on the software. I know you're getting close, you got to be super excited, give us the update on the Observatory and your role. >> All right. So, yeah. Rubin is a state of the art observatory that is in construction on a remote mountain in Chile. And, with Rubin we'll conduct the large survey of space and time. We're going to observe the sky with eight meter optical telescope and take 1000 pictures every night with 2.2 Gigapixel camera. And we are going to do that for 10 years, which is the duration of the survey. >> Yeah, amazing project. Now, you earned a doctor of philosophy so you probably spent some time thinking about what's out there, and then you went out to earn a PhD in astronomy and astrophysics. So, this is something that you've been working on for the better part of your career, isn't it? >> Yeah, that's right, about 15 years. I studied physics in college. Then I got a PhD in astronomy. And, I worked for about five years in another project, the Dark Energy Survey before joining Rubin in 2015. >> Yeah, impressive. So it seems like both your organizations are looking at space from two different angles. One thing you guys both have in common of course is software, and you both use InfluxDB as part of your data infrastructure. How did you discover InfluxDB, get into it? How do you use the platform? Maybe Caleb you could start. >> Yeah, absolutely. So, the first company that I extensively used InfluxDB in, was a launch startup called Astra. And we were in the process of designing our first generation rocket there, and testing the engines, pumps, everything that goes into a rocket. And, when I joined the company our data story was not very mature. We were collecting a bunch of data in LabVIEW and engineers were taking that over to MATLAB to process it. And at first, there, you know, that's the way that a lot of engineers and scientists are used to working. And at first that was, like people weren't entirely sure that that was, that needed to change. But, it's, something, the nice thing about InfluxDB is that, it's so easy to deploy. So as, our software engineering team was able to get it deployed and, up and running very quickly and then quickly also backport all of the data that we collected this far into Influx. And, what was amazing to see and is kind of the super cool moment with Influx is, when we hooked that up to Grafana, Grafana as the visualization platform we used with Influx, 'cause it works really well with it. There was like this aha moment of our engineers who are used to this post process kind of method for dealing with their data, where they could just almost instantly easily discover data that they hadn't been able to see before, and take the manual processes that they would run after a test and just throw those all in Influx and have live data as tests were coming, and, I saw them implementing like crazy rocket equation type stuff in Influx, and it just was totally game changing for how we tested. >> So Angelo, I was explaining in my open, that you could add a column in a traditional RDBMS and do time series, but with the volume of data that you're talking about in the example that Caleb just gave, you have to have a purpose built time series database. Where did you first learn about InfluxDB? >> Yeah, correct. 
So, I work with the data management team, and my first project was to record metrics that measured the performance of our software, the software that we used to process the data. So I started implementing that in our relational database. But then I realized that in fact I was dealing with time series data and I should really use a solution built for that. And then I started looking at time series databases and I found InfluxDB, and that was back in 2018. Another use for InfluxDB that I'm also interested in is the visits database. If you think about the observations, we are moving the telescope all the time and pointing to specific directions in the sky and taking pictures every 30 seconds. So that itself is a time series. And every point in that time series, we call a visit. So we want to record the metadata about those visits in InfluxDB. That time series is going to be 10 years long, with about 1000 points every night. It's actually not too much data compared to other problems. It's really just a different time scale. >> The telescope at the Rubin Observatory is like, pun intended, I guess the star of the show. And I believe I read that it's going to be the first of the next gen telescopes to come online. It's got this massive field of view, like three orders of magnitude times the Hubble's widest camera view, which is amazing. Like, that's like 40 moons in an image, amazingly fast as well. What else can you tell us about the telescope? >> This telescope has to move really fast. And it also has to carry the primary mirror, which is an eight-meter piece of glass. It's very heavy. And it has to carry a camera which is about the size of a small car. And this whole structure weighs about 300 tons. For that to work, the telescope needs to be very compact and stiff. And one thing that's amazing about its design is that the telescope, this 300-ton structure, sits on a tiny film of oil, which has the diameter of a human hair. And that makes an almost zero friction interface. In fact, a few people can move this enormous structure with only their hands. As you said, another aspect that makes this telescope unique is the optical design. It's a wide field telescope. So each image has, in diameter, the size of about seven full moons. And with that, we can map the entire sky in only three days. And of course, during operations everything's controlled by software and it is automatic. There's a very complex piece of software called the Scheduler, which is responsible for moving the telescope, and the camera, which is recording 15 terabytes of data every night. >> And Angelo, all this data lands in InfluxDB, correct? And what are you doing with all that data? >> Yeah, actually not. So we use InfluxDB to record engineering data and metadata about the observations, like telemetry, events, and commands from the telescope. That's a much smaller data set compared to the images. But it is still challenging because you have some high frequency data that the system needs to keep up with, and we need to store this data and have it around for the lifetime of the project. >> Got it. Thank you. Okay, Caleb, let's bring you back in. Tell us more about the, you got these dishwasher size satellites, kind of using a multi-tenant model, I think it's genius. But tell us about the satellites themselves. >> Yeah, absolutely. So, we have in space some satellites already that, as you said, are like dishwasher, mini fridge kind of size. 
And we're working on a bunch more that are a variety of sizes from shoebox to, I guess, a few times larger than what we have today. And it is, we do shoot to have effectively something like a multi-tenant model where we will buy a bus off the shelf. The bus is what you can kind of think of as the core piece of the satellite, almost like a motherboard or something where it's providing the power, it has the solar panels, it has some radios attached to it. It handles the attitude control, basically steers the spacecraft in orbit, and then we build also in-house, what we call our payload hub which is, has all, any customer payloads attached and our own kind of Edge processing sort of capabilities built into it. And, so we integrate that, we launch it, and those things because they're in lower Earth orbit, they're orbiting the earth every 90 minutes. That's, seven kilometers per second which is several times faster than a speeding bullet. So we have one of the unique challenges of operating spacecraft in lower Earth orbit is that generally you can't talk to them all the time. So, we're managing these things through very brief windows of time, where we get to talk to them through our ground sites, either in Antarctica or in the North pole region. >> Talk more about how you use InfluxDB to make sense of this data through all this tech that you're launching into space. >> We basically, previously we started off when I joined the company, storing all of that as Angelo did in a regular relational database. And we found that it was so slow and the size of our data would balloon over the course of a couple days to the point where we weren't able to even store all of the data that we were getting. So we migrated to InfluxDB to store our time series telemetry from the spacecraft. So, that's things like power level, voltage, currents, counts, whatever metadata we need to monitor about the spacecraft, we now store that in InfluxDB. And that has, now we can actually easily store the entire volume of data for the mission life so far without having to worry about the size bloating to an unmanageable amount, and we can also seamlessly query large chunks of data. Like if I need to see, you know, for example, as an operator, I might want to see how my battery state of charge is evolving over the course of the year, I can have, plot in an Influx that loads that in a fraction of a second for a year's worth of data because it does, intelligent, it can intelligently group the data by assigning time interval. So, it's been extremely powerful for us to access the data. And, as time has gone on, we've gradually migrated more and more of our operating data into Influx. >> Yeah. Let's talk a little bit about, we throw this term around a lot of, you know, data driven, a lot of companies say, "Oh yes, we're data driven." But you guys really are, I mean, you got data at the core. Caleb, what does that mean to you? >> Yeah, so, you know, I think the, and the clearest example of when I saw this be like totally game changing is what I mentioned before at Astra where our engineer's feedback loop went from a lot of kind of slow researching, digging into the data to like an instant, instantaneous almost, seeing the data, making decisions based on it immediately rather than having to wait for some processing. And that's something that I've also seen echoed in my current role. 
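(Editor's aside: the kind of year-long, windowed battery query Caleb just described might look roughly like the sketch below, using the official influxdb-client Python package and a Flux query. The bucket, measurement, and field names are hypothetical, not Loft Orbital's actual schema.)

```python
# Illustrative sketch only: query a year of battery state-of-charge telemetry
# and let InfluxDB group it into daily means server-side, the kind of windowed
# query Caleb describes. Bucket, measurement, and field names are hypothetical.
from influxdb_client import InfluxDBClient

flux = """
from(bucket: "spacecraft_telemetry")
  |> range(start: -1y)
  |> filter(fn: (r) => r._measurement == "battery" and r._field == "state_of_charge")
  |> aggregateWindow(every: 1d, fn: mean, createEmpty: false)
"""

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
for table in client.query_api().query(flux):
    for record in table.records:
        print(record.get_time(), record.get_value())
client.close()
```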
But to give another practical example, as I said, we have a huge amount of data that comes down every orbit and we need to be able to ingest all of that data almost instantaneously and provide it to the operator in near real time, about a second worth of latency is all that's acceptable for us to react to see what is coming down from the spacecraft. And building that pipeline is challenging from a software engineering standpoint. My primary language is Python which isn't necessarily that fast. So what we've done is started, and the goal of being data-driven is publish metrics on individual, how individual pieces of our data processing pipeline are performing into Influx as well. And we do that in production as well as in dev. So we have kind of a production monitoring flow. And what that has done is allow us to make intelligent decisions on our software development roadmap where it makes the most sense for us to focus our development efforts in terms of improving our software efficiency, just because we have that visibility into where the real problems are. And sometimes we've found ourselves before we started doing this, kind of chasing rabbits that weren't necessarily the real root cause of issues that we were seeing. But now that we're being a bit more data driven there, we are being much more effective in where we're spending our resources and our time, which is especially critical to us as we scale from supporting a couple of satellites to supporting many, many satellites at once. >> Yeah, of course is how you reduced those dead ends. Maybe Angelo you could talk about what sort of data-driven means to you and your teams. >> I would say that, having real time visibility to the telemetry data and metrics is crucial for us. We need to make sure that the images that we collect with the telescope have good quality, and, that they are within the specifications to meet our science goals. And so if they are not, we want to know that as soon as possible and then start fixing problems. >> Caleb, what are your sort of event, you know, intervals like? >> So I would say that, as of today on the spacecraft, the event, the level of timing that we deal with probably tops out at about 20 Hertz, 20 measurements per second on things like our gyroscopes. But, the, I think the core point here of the ability to have high precision data is extremely important for these kinds of scientific applications and I'll give an example from when I worked at, on the rockets at Astra. There, our baseline data rate that we would ingest data during a test is 500 Hertz. So 500 samples per second, and in some cases we would actually need to ingest much higher rate data, even up to like 1.5 kilohertz, so extremely, extremely high precision data there where timing really matters a lot. And, you know, I can, one of the really powerful things about Influx is the fact that it can handle this. That's one of the reasons we chose it, because, there's, times when we're looking at the results of a firing where you're zooming in, you know, I talked earlier about how on my current job we often zoom out to look at a year's worth of data. 
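(Another editorial aside: the production-monitoring flow Caleb describes, publishing metrics on how individual pipeline stages are performing, might look something like this minimal Python sketch. The measurement, tag, and bucket names are hypothetical.)

```python
# Illustrative sketch only: time one stage of a Python processing pipeline and
# publish its duration to InfluxDB so slow spots show up on a dashboard.
# Measurement, tag, and bucket names are hypothetical.
import time
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

def timed_stage(name, fn, *args, **kwargs):
    # Run one pipeline stage, then record how long it took as a time series point.
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    point = (
        Point("pipeline_stage")
        .tag("stage", name)
        .tag("environment", "prod")
        .field("duration_ms", elapsed_ms)
    )
    write_api.write(bucket="ops_metrics", record=point)
    return result

# Example usage with a stand-in stage:
timed_stage("decode_frames", time.sleep, 0.05)
client.close()
```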
You're zooming in to where your screen is preoccupied by a tiny fraction of a second, and you need to see, same thing as Angelo just said, not just the actual telemetry, which is coming in at a high rate, but the events that are coming out of our controllers. So that can be something like, "Hey, I opened this valve at exactly this time," and we want to have that at micro or even nanosecond precision so that we know, okay, we saw a spike in chamber pressure at this exact moment, was that before or after this valve opened? That kind of visibility is critical in these kinds of scientific applications, and absolutely game changing to be able to see that in near real time, and with a really easy way for engineers to be able to visualize this data themselves without having to wait for us software engineers to go build it for them. >> Can the scientists do self-serve or do you have to design and build all the analytics and queries for your scientists? >> Well, I think that's absolutely, from my perspective that's absolutely one of the best things about Influx, and what I've seen be game changing is that, generally I'd say anyone can learn to use Influx. And honestly, most of our users might not even know they're using Influx, because the interface that we expose to them is Grafana, which is a generic, open source graphing library that is very similar to Influx's own Chronograf. >> Sure. >> And what it does is, it provides this almost, it's a very intuitive UI for building your queries. So, you choose a measurement and it shows a dropdown of available measurements. And then you choose the particular fields you want to look at, and again, that's a dropdown. So, it's really easy for our users to discover, and there's kind of point and click options for doing math, aggregations. You can even do like perfect kind of predictions all within Grafana, the Grafana user interface, which is really just a wrapper around the APIs and functionality that Influx provides. >> Putting data in the hands of those who have the context, the domain experts, is key. Angelo, is it the same situation for you, is it self-serve? >> Yeah, correct. As I mentioned before, we have the astronomers making their own dashboards because they know exactly what they need to visualize. >> Yeah, I mean, it's all about using the right tool for the job. I think for us, when I joined the company we weren't using InfluxDB, and we were dealing with serious issues of the database growing to an incredible size extremely quickly, and even querying short periods of data was taking on the order of seconds, which is just not possible for operations. >> Guys, this has been really formative. It's pretty exciting to see how the edge is mountaintops, lower Earth orbits, I mean, space is the ultimate edge, isn't it? I wonder if you could answer two questions to wrap here. You know, what comes next for you guys? And is there something that you're really excited about that you're working on? Caleb, maybe you could go first and then Angelo you can bring us home. >> Basically what's next for Loft Orbital is more satellites, a greater push towards infrastructure, and really making, our mission is to make space simple for our customers and for everyone. And we're scaling the company like crazy now, making that happen. It's extremely exciting, an extremely exciting time to be in this company and to be in this industry as a whole. 
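(Editor's aside on the nanosecond-precision events Caleb mentions above: writing such an event with an explicit nanosecond timestamp might look roughly like this sketch with the official influxdb-client Python package; measurement, tag, and bucket names are hypothetical.)

```python
# Illustrative sketch only: record a controller event ("valve opened") with an
# explicit nanosecond-precision timestamp. Names are hypothetical.
import time
from influxdb_client import InfluxDBClient, Point, WritePrecision
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

# time.time_ns() gives an integer epoch timestamp in nanoseconds.
event = (
    Point("controller_events")
    .tag("component", "main_valve")
    .field("state", "open")
    .time(time.time_ns(), WritePrecision.NS)
)
write_api.write(bucket="test_stand", record=event)
client.close()
```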
Because there are so many interesting applications out there, so many cool ways of leveraging space that people are taking advantage of, and with companies like SpaceX and the, now rapidly lowering cost of launch it's just a really exciting place to be in. We're launching more satellites, we are scaling up for some constellations, and our ground system has to be improved to match. So, there's a lot of improvements that we're working on to really scale up our control software to be best in class and make it capable of handling such a large workload, so. >> Are you guys hiring? >> We are absolutely hiring, so I would, we have positions all over the company, so, we need software engineers, we need people who do more aerospace specific stuff. So absolutely, I'd encourage anyone to check out the Loft Orbital website, if this is at all interesting. >> All right, Angelo, bring us home. >> Yeah. So what's next for us is really getting this telescope working and collecting data. And when that's happened is going to be just a deluge of data coming out of this camera and handling all that data is going to be really challenging. Yeah, I want to be here for that, I'm looking forward. Like for next year we have like an important milestone, which is our commissioning camera, which is a simplified version of the full camera, it's going to be on sky, and so yeah, most of the system has to be working by then. >> Nice. All right guys, with that we're going to end it. Thank you so much, really fascinating, and thanks to InfluxDB for making this possible, really groundbreaking stuff, enabling value creation at the Edge, in the cloud, and of course, beyond at the space. So, really transformational work that you guys are doing, so congratulations and really appreciate the broader community. I can't wait to see what comes next from having this entire ecosystem. Now, in a moment, I'll be back to wrap up. This is Dave Vellante, and you're watching theCUBE, the leader in high tech enterprise coverage. >> Welcome. Telegraf is a popular open source data collection agent. Telegraf collects data from hundreds of systems like IoT sensors, cloud deployments, and enterprise applications. It's used by everyone from individual developers and hobbyists, to large corporate teams. The Telegraf project has a very welcoming and active Open Source community. Learn how to get involved by visiting the Telegraf GitHub page. Whether you want to contribute code, improve documentation, participate in testing, or just show what you're doing with Telegraf. We'd love to hear what you're building. >> Thanks for watching Moving the World with InfluxDB, made possible by Influx Data. I hope you learned some things and are inspired to look deeper into where time series databases might fit into your environment. If you're dealing with large and or fast data volumes, and you want to scale cost effectively with the highest performance, and you're analyzing metrics and data over time, times series databases just might be a great fit for you. Try InfluxDB out. You can start with a free cloud account by clicking on the link in the resources below. Remember, all these recordings are going to be available on demand of thecube.net and influxdata.com, so check those out. And poke around Influx Data. They are the folks behind InfluxDB, and one of the leaders in the space. We hope you enjoyed the program, this is Dave Vellante for theCUBE, we'll see you soon. (upbeat music)

Published Date : May 18 2022



The Future Is Built On InfluxDB


 

>> Time series data is any data that's stamped in time in some way. That could be every second, every minute, every five minutes, every hour, every nanosecond, whatever it might be. And typically that data comes from sources in the physical world, like devices or sensors, temperature gauges, batteries, any device really, or things in the virtual world, which could be software, maybe it's software in the cloud or data in containers or microservices or virtual machines. So all of these items, whether in the physical or virtual world, they're generating a lot of time series data. Now time series data has been around for a long time, and there are many examples in our everyday lives. All you gotta do is punch up any stock ticker and look at its price over time in graphical form. And that's a simple use case that anyone can relate to, and you can build timestamps into a traditional relational database. >> You just add a column to capture time. And as well, there are examples of log data being dumped into a data store that can be searched and captured and ingested and visualized. Now, the problem with the latter example that I just gave you is that you gotta hunt and peck and search and extract what you're looking for. And the problem with the former is that traditional general purpose databases are designed as sort of a Swiss army knife for any workload. And there are a lot of functions that get in the way and make them inefficient for time series analysis, especially at scale. Like when you think about OT and edge scale, where things are happening super fast, ingestion is coming from many different sources and analysis often needs to be done in real time or near real time. And that's where time series databases come in. >> They're purpose built and can much more efficiently support ingesting metrics at scale, and then comparing data points over time. Time series databases can write and read at significantly higher speeds and deal with far more data than traditional database methods. And they're more cost effective. Instead of throwing processing power at the problem, for example, the underlying architecture and algorithms of time series databases can optimize queries, and they can reclaim wasted storage space and reuse it. At scale, time series databases are simply a better fit for the job. Welcome to Moving the World with InfluxDB, made possible by Influx Data. My name is Dave Vellante and I'll be your host today. Influx Data is the company behind InfluxDB. The open source time series database InfluxDB is designed specifically to handle time series data. As I just explained, we have an exciting program for you today, and we're gonna showcase some really interesting use cases. >> First, we'll kick it off in our Palo Alto studios, where my colleague John Furrier will interview Evan Kaplan, who's the CEO of Influx Data. After John and Evan set the table, John's gonna sit down with Brian Gilmore. He's the director of IoT and emerging tech at Influx Data. And they're gonna dig into where Influx Data is gaining traction and why adoption is occurring and why it's so robust. And they're gonna have tons of examples and double click into the technology. And then we bring it back here to our east coast studios, where I get to talk to two practitioners doing amazing things in space with satellites and modern telescopes. These use cases will blow your mind. You don't want to miss it. So thanks for being here today. And with that, let's get started. Take it away, Palo Alto. >> Okay. 
Today we welcome Evan Kaplan, CEO of Influx Data, the company behind InfluxDB. Welcome, Evan. Thanks for coming on. >> Hey John, thanks for having me. >> Great segment here on the InfluxDB story. What is the story? Take us through the history. Why time series? What's the story? >> (laughs) So the history is actually pretty interesting. Um, Paul Dix, my partner in this and our founder, um, super passionate about developers and developer experience. And, um, he had worked on Wall Street building a number of time series kind of trading platforms for trading stocks. And from his point of view, it was always what he would call a yak shave, which means you had to do a ton of work just to start doing work, which means you had to write a bunch of extrinsic routines. You had to write a bunch of application handling on existing relational databases in order to come up with something that was optimized for a trading platform or a time series platform. And he sort of, he just developed this real clear point of view: this is not how developers should work. And so in 2013, he went through Y Combinator, and he made his first commit to open source InfluxDB at the end of 2013. And he basically, you know, from my point of view, he invented modern time series, which is you start with a purpose-built time series platform to do these kinds of workloads. And you get all the benefits of having something right outta the box. So a developer can be totally productive right away. >> And how many people in the company? What's the history of employees and stuff? >> Yeah, I think we're, I, you know, I always forget the number, but it's something like 230 or 240 people now. Um, the company, I joined the company in 2016 and I love Paul's vision. And I just had a strong conviction about the relationship between time series and IoT. Cuz if you think about it, what sensors do is they speak time series: pressure, temperature, volume, humidity, light. They're measuring, they're instrumenting something over time. And so I thought that would be super relevant over the long term, and I've not regretted it. >> Oh no. And it's interesting, at that time, go back in the history, you know, the role of databases, well, relational database is the one database to rule the world. And then as clouds started coming in, you're starting to see more databases proliferate, types of databases, and time series in particular is interesting. Cuz real time has become super valuable from an application standpoint. OT, which speaks time series, means something, it's like time matters. >> Time. >> Yeah. And sometimes data's not worth it after the time, sometimes it's worth it. And then you get the data lake. So you have this whole new evolution. Is this the momentum? What's the momentum, I guess the question is what's the momentum behind... >> You mean what's causing us to grow? So... >> Yeah, the time series, why is time series... >> And the... >> Category momentum? What's the bottom line? >> Well, think about it. You think about it from a broad, broad sort of frame, which is where, what everybody's trying to do is build increasingly intelligent systems, whether it's a self-driving car or a robotic system that does what you want to do or a self-healing software system. Everybody wants to build increasingly intelligent systems. And so in order to build these increasingly intelligent systems, you have to instrument the system well, and you have to instrument it over time, better and better. 
And so you need a tool, a fundamental tool to drive that instrumentation. And that's become clear to everybody that that instrumentation is all based on time. And so what happened, what happened, what happened what's gonna happen? And so you get to these applications like predictive maintenance or smarter systems. And increasingly you want to do that stuff, not just intelligently, but fast in real time. So millisecond response so that when you're driving a self-driving car and the system realizes that you're about to do something, essentially you wanna be able to act in something that looks like real time, all systems want to do that, want to be more intelligent and they want to be more real time. And so we just happen to, you know, we happen to show up at the right time in the evolution of a >>Market. It's interesting near real time. Isn't good enough when you need real time. >><laugh> yeah, it's not, it's not. And it's like, and it's like, everybody wants, even when you don't need it, ironically, you want it. It's like having the feature for, you know, you buy a new television, you want that one feature, even though you're not gonna use it, you decide that your buying criteria real time is a buying criteria >>For, so you, I mean, what you're saying then is near real time is getting closer to real time as possible, as fast as possible. Right. Okay. So talk about the aspect of data, cuz we're hearing a lot of conversations on the cube in particular around how people are implementing and actually getting better. So iterating on data, but you have to know when it happened to get, know how to fix it. So this is a big part of how we're seeing with people saying, Hey, you know, I wanna make my machine learning algorithms better after the fact I wanna learn from the data. Um, how does that, how do you see that evolving? Is that one of the use cases of sensors as people bring data in off the network, getting better with the data knowing when it happened? >>Well, for sure. So, so for sure, what you're saying is, is, is none of this is non-linear, it's all incremental. And so if you take something, you know, just as an easy example, if you take a self-driving car, what you're doing is you're instrumenting that car to understand where it can perform in the real world in real time. And if you do that, if you run the loop, which is I instrumented, I watch what happens, oh, that's wrong? Oh, I have to correct for that. I correct for that in the software. If you do that for a billion times, you get a self-driving car, but every system moves along that evolution. And so you get the dynamic of, you know, of constantly instrumenting watching the system behave and do it. And this and sets up driving car is one thing. But even in the human genome, if you look at some of our customers, you know, people like, you know, people doing solar arrays, people doing power walls, like all of these systems are getting smarter. >>Well, let's get into that. What are the top applications? What are you seeing for your, with in, with influx DB, the time series, what's the sweet spot for the application use case and some customers give some >>Examples. Yeah. So it's, it's pretty easy to understand on one side of the equation that's the physical side is sensors are sensors are getting cheap. Obviously we know that and they're getting the whole physical world is getting instrumented, your home, your car, the factory floor, your wrist, watch your healthcare, you name it. It's getting instrumented in the physical world. 
We're watching the physical world in real time. And so there are three or four sweet spots for us, but, but they're all on that side. They're all about IoT. So think about consumer IoT projects like Google's Nest, Tado, um, Particle sensors, um, even delivery engines like Rappi, who are the Instacart of South America, like anywhere there's a physical location, and that's on the consumer side. And then another exciting space is the industrial side. Factories are changing dramatically over time, increasingly moving away from proprietary equipment to developer-driven systems that run operations, because what, what has to get smarter when you're building, when you're building a factory is the systems all have to get smarter. And then, um, lastly, a lot in renewables and sustainability. So a lot, you know, Tesla, Lucid Motors, Nikola Motors, um, you know, lots to do with electric cars, solar arrays, windmill arrays, just anything that's gonna get instrumented, where that instrumentation becomes part of what the purpose is. >> It's interesting. The convergence of physical and digital is happening with the data, IoT. You mentioned, you know, you think of IoT, look at the use cases there, it was proprietary OT systems now becoming more IP enabled, internet protocol, and now edge compute getting smaller, faster, cheaper, AI going to the edge. Now you have all kinds of new capabilities that bring that real time and time series opportunity. Are you seeing IoT going to a new level? Where are the IoT dots connecting to, because, you know, as these two cultures merge, yeah, operations, basically industrial, factory, car, they gotta get smarter. Intelligent edge is a buzzword, but I mean, it has to be more intelligent. Where's the, where's the action in all this? >> So the action, really, it's really at the core, it's at the developer, right? Because you're looking at these things, it's very hard to get an off the shelf system to do the kinds of physical and software interaction. So the actions really happen at the developer. And so what you're seeing is a movement in the world that, that maybe you and I grew up in, with IT or OT, moving increasingly to that developer-driven capability. And so all of these IoT systems, they're bespoke, they don't come out of the box. And so the developer, the architect, the CTO, they define what's my business. What am I trying to do? Am I trying to sequence a human genome and figure out when these genes express themselves, or am I trying to figure out when the next heart rate monitor's gonna show up on my Apple Watch, right? What am I trying to do? What's the system I need to build? And so it's starting with the developers, where all of the good stuff happens here, which is different than it used to be, right? It used to be you'd buy an application or a service or a SaaS thing, but with this dynamic, with this integration of systems, it's all about bespoke. It's all about building something. >> So let's get to the developer real quick, real highlight point here is the data. I mean, I could see a developer saying, okay, I need to have an application for the edge, IoT edge or car. I mean, we're gonna have, I mean, Tesla's got applications in the car, it's right there. I mean, yes, there's the modern application life cycle now. So take us through how this impacts the developer. Does it impact their CI/CD pipeline? Is it cloud native? I mean, where does this all, where does this go to? >> Well, so first of all, you're talking about, there was an internal journey that we had to go through as a company, which I think is fascinating for anybody who's interested: we went from primarily monolithic software that was open sourced to building a cloud native platform, which means we had to move from an agile development environment to a CI/CD environment. So to the degree that you are moving your service, whether it's, you know, Tesla monitoring your car and updating your Powerwalls, right, or whether it's a solar company updating the arrays, right, to the degree that that service is cloud, then you increasingly move from an agile development to a CI/CD environment, in which you're shipping code to production every day. And so it's not just the developers, it's all the infrastructure to support the developers to run that service and that sort of stuff. I think that's also gonna happen in a big way. >> With your customer base that you have now, and as you see it evolving with InfluxDB, is it that they're gonna be writing more of the application or relying more on others? I mean, obviously there's an open source component here. So when you bring in kind of old way, new way: the old way was, I got a proprietary platform running all this OT stuff and I gotta write, here's an application. That's general purpose. Yeah, I have some flexibility, somewhat brittle, maybe not a lot of robustness to it, but it does its job. >> A good way to think about this is... >> Versus a new way? >> Is... what... so yeah, a good way to think about this is, what's the role of the developer slash architect slash CTO, that chain, within a large, within an enterprise or a company. And so, um, the way to think about it is, I started my career in the aerospace industry. (laughs) And so when you look at what Boeing does to assemble a plane, they build very, very few of the parts. Instead, what they do is they assemble. They buy the wings, they buy the engines, they assemble... actually, they don't buy the wings. It's the one thing, they buy the material for the wings and they build the wings, cuz there's a lot of tech in the wings, and they end up being assemblers, smart assemblers, of what ends up being a flying airplane, which is a pretty big deal even now. And so what, what happens with software people is they have the ability to pull from, you know, the best of the open source world. So they would pull a time series capability from us. Then they would assemble that with potentially some ETL logic from somebody else, or they'd assemble it with, um, a Kafka interface to be able to stream the data in. And so they become very good integrators and assemblers, but they become masters of that bespoke application. And I think that's where it goes, cuz you're not writing native code for everything. >> So they're more flexible. They have faster time to market cuz they're assembling way faster, and they get to still maintain their core competency. Okay, their wings in this case. >> They become increasingly not just coders, but designers and developers. They become broadly builders, is what we like to think of it, people who start and build stuff. By the way, this is not different than what the people just up the road at Google have been doing for years, or the tier ones, Amazon, building all their own. >> Well, I think one of the things that's interesting is that this idea of systems, developing a system architecture, I mean systems, uh, systems have consequences when you make changes. 
So when you have now cloud data center on premise and edge working together, how does that work across the system? You can't have a wing that doesn't work with the other wing kind of thing. >>That's exactly. But that's where the that's where the, you know, that that Boeing or that airplane building analogy comes in for us. We've really been thoughtful about that because IOT it's critical. So our open source edge has the same API as our cloud native stuff that has enterprise on pre edge. So our multiple products have the same API and they have a relationship with each other. They can talk with each other. So the builder builds it once. And so this is where, when you start thinking about the components that people have to use to build these services is that you wanna make sure, at least that base layer, that database layer, that those components talk to each other. >>So I'll have to ask you if I'm the customer. I put my customer hat on. Okay. Hey, I'm dealing with a lot. >>That mean you have a PO for <laugh> >>A big check. I blank check. If you can answer this question only if the tech, if, if you get the question right, I got all this important operation stuff. I got my factory, I got my self-driving cars. This isn't like trivial stuff. This is my business. How should I be thinking about time series? Because now I have to make these architectural decisions, as you mentioned, and it's gonna impact my application development. So huge decision point for your customers. What should I care about the most? So what's in it for me. Why is time series >>Important? Yeah, that's a great question. So chances are, if you've got a business that was, you know, 20 years old or 25 years old, you were already thinking about time series. You probably didn't call it that you built something on a Oracle or you built something on IBM's DB two, right. And you made it work within your system. Right? And so that's what you started building. So it's already out there. There are, you know, there are probably hundreds of millions of time series applications out there today. But as you start to think about this increasing need for real time, and you start to think about increasing intelligence, you think about optimizing those systems over time. I hate the word, but digital transformation. Then you start with time series. It's a foundational base layer for any system that you're gonna build. There's no system I can think of where time series, shouldn't be the foundational base layer. If you just wanna store your data and just leave it there and then maybe look it up every five years. That's fine. That's not time. Series time series is when you're building a smarter, more intelligent, more real time system. And the developers now know that. And so the more they play a role in building these systems, the more obvious it becomes. >>And since I have a PO for you and a big check, yeah. What is, what's the value to me as I, when I implement this, what's the end state, what's it look like when it's up and running? What's the value proposition for me. What's an >>So, so when it's up and running, you're able to handle the queries, the writing of the data, the down sampling of the data, they're transforming it in near real time. So that the other dependencies that a system that gets for adjusting a solar array or trading energy off of a power wall or some sort of human genome, those systems work better. So time series is foundational. 
It's not like it's doing every action above it, but it's foundational to building a really compelling, intelligent system. I think that's what developers and architects are seeing now. >>Bottom line, final word. What's in it for the customer? What's your statement to the customer? What would you say to someone looking to do something in time series on the edge? >>Yeah. So it's pretty clear to us that if you view yourself as being in the business of building systems, and you want them to be increasingly intelligent, self-healing, autonomous, you want them to operate in real time, then you start from time series. But I also want to say what's in it for us at Influx. What's in it for us is that people are doing some amazing stuff. I highlighted some of the energy work, some of the human genome, some of the healthcare. It's hard not to be proud or feel like, wow, somehow I've been lucky. I've arrived at the right time, in the right place, with the right people to be able to deliver on that. That's also exciting on our side of the equation. >>Yeah. It's critical infrastructure, critical operations. >>Yeah. >>Great stuff, Evan. Thanks for coming on. Appreciate this segment. All right. In a moment, Brian Gilmore, director of IoT and emerging technology at InfluxData, will join me. You're watching theCUBE, leader in tech coverage. Thanks for watching. >>Time series data from sensors, systems, and applications is a key source in driving automation and prediction in technologies around the world. But managing the massive amount of timestamped data generated these days is overwhelming, especially at scale. That's why InfluxData developed InfluxDB, a time series data platform that collects, stores, and analyzes data. InfluxDB empowers developers to extract valuable insights and turn them into action by building transformative IoT, analytics, and cloud native applications, purpose built and optimized to handle the scale and velocity of timestamped data. InfluxDB puts the power in your hands with developer tools that make it easy to get started quickly with less code. InfluxDB is more than a database. It's a robust developer platform with integrated tooling that's written in the languages you love, so you can innovate faster. Run InfluxDB anywhere you want by choosing the provider and region that best fits your needs across AWS, Microsoft Azure, and Google Cloud. InfluxDB is fast and automatically scalable, so you can spend time delivering value to customers, not managing clusters. Take control of your time series data so you can focus on the features and functionality that give your applications a competitive edge. Get started for free with InfluxDB: visit influxdata.com/cloud to learn more. >>Okay. Now we're joined by Brian Gilmore, director of IoT and emerging technologies at InfluxData. Welcome to the show. >>Thank you, John. Great to be here. >>We just spent some time with Evan going through the company and the value proposition with InfluxDB. What's the momentum? Where do you see this coming from? What's the value coming out of this? >>Well, I think we're hitting a point where the adoption of the technology is becoming mainstream.
We're seeing it in all sorts of organizations, everybody from the most well-funded, advanced big technology companies to the smaller academics and the startups. The management of the data that emits from that technology is time series, and being able to give them a platform, a tool that's super easy to use, easy to start with, and that of course will grow with them, has been key for us, sort of riding along with them as they're successful. >>Evan was mentioning that time series has been on everyone's radar, and it's been in the OT business for years. Now, you go back to 2013, '14, even five years ago, that convergence of physical and digital coming together, IP-enabled edge. Edge has always been kind of hyped up, but why now? Why is the edge so hot right now from an adoption standpoint? Is it just evolution, the tech getting better? >>I think it's twofold. I think that, for some people, everybody was so focused on cloud over the last probably 10 years that they forgot about the compute that was available at the edge. And those, especially in OT and on the factory floor, who weren't able to take full advantage of cloud through their applications still needed to be able to leverage that compute at the edge. The big thing that we're seeing now, which is interesting, is that there's a hybrid nature to all of these applications, where there's definitely some data that's generated on the edge and definitely some data that's generated in the cloud. It's the ability for a developer to tie those two systems together and work with that data in a very unified, uniform way that's giving them the opportunity to build solutions that really deliver value to whatever it is they're trying to do, whether it's the outer reaches of outer space or optimizing the factory floor. >>Yeah. I think one of the things, you also mentioned genome, too: big data is coming to the real world. And IoT has been kind of this thing for OT and some use cases, but now, with the cloud, all companies have an edge strategy. So what's the secret sauce? Because now this is a hot product for the whole world, not just industrial, but all businesses. What's the secret sauce? >>Well, I think part of it is just that the technology is becoming more capable, and that's especially on the hardware side, right? Compute is getting smaller and smaller and smaller. And we find that by supporting all the way down to the edge, even to the microcontroller layer with our client libraries, and then working hard to make our applications, especially the database, as small as possible so that it can be located as close to the point of origin of that data at the edge as possible, is fantastic. Now you can take that, you can run it locally, you can do your local decision making. You can use InfluxDB as an input to automation, control, the autonomy that people are trying to drive at the edge. But when you link it up with everything that's in the cloud, that's when you get all of the cloud-scale capabilities of parallelized AI and machine learning and all of that.
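To make that edge-to-cloud flow concrete, here is a minimal sketch of writing a single sensor reading with InfluxDB's official Python client (the influxdb-client package for InfluxDB 2.x). This is not code discussed on camera; the URL, token, org, bucket, and the measurement, tag, and field names are all placeholders you would replace with your own.

```python
from datetime import datetime, timezone

from influxdb_client import InfluxDBClient, Point, WritePrecision
from influxdb_client.client.write_api import SYNCHRONOUS

# Placeholder connection details -- point these at your own edge or cloud instance.
client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

# One hypothetical sensor reading: a measurement with tags, a field, and a timestamp.
point = (
    Point("machine_temperature")
    .tag("site", "factory-7")
    .tag("line", "assembly-3")
    .field("celsius", 81.4)
    .time(datetime.now(timezone.utc), WritePrecision.NS)
)

write_api.write(bucket="edge-telemetry", record=point)
client.close()
```

The same client also exposes batching and asynchronous write options, which is typically what you would reach for at the higher data rates discussed in this segment.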
>>So what's interesting is the open source success has been something we've talked about a lot on theCUBE, how people are leveraging that. You guys have users in the enterprise, users in that IoT market, but you've got developers now too, kind of brought together. How do you see that emerging? How do developers engage? What are some of the things you're seeing that developers are really getting into with InfluxDB? >>Yeah. Well, I think there are the developers who are building companies, right? These are the startups and the folks that we love to work with, who are building new services, new products, things like that. And especially on the consumer side of IoT, there's a lot of that, just those developers. But I think you've got to pay attention to those enterprise developers as well, right? There are tons of people with the title of engineer in your regular enterprise organizations. They're there for systems integration. They're there for looking at what they would build versus what they would buy. And a lot of them come from a strong open source background. They know the communities, they know the top platforms in those spaces, and they're excited to be able to adopt and use them to optimize inside the business, as compared to just building a brand new one. >>You know, it's interesting too, when Evan and I were talking about open source versus closed OT systems. So how do you support the backwards compatibility of older systems while staying open? There are dozens of data formats out there, a bunch of standards, protocols, new things emerging. Everyone wants to have a control plane. Everyone wants to leverage the value of data. How do you guys keep track of it all? What do you guys support? >>Yeah, well, I think either through direct connection. Like, we have a product called Telegraf. It's unbelievable. It's open source, it's an edge agent. You can run it as close to the edge as you'd like. It speaks dozens of different protocols in its own right, a couple of which, MQTT and OPC UA, are very applicable to these IoT use cases. But then we also, because we are not only open source but open in terms of our ability to collect data, have a lot of partners who have built really great integrations from their own middleware into InfluxDB. These are companies like Kepware and HighByte, who are really experts in those downstream industrial protocols. That's a business not everybody wants to be in. It requires some very specialized, very hard work and a lot of support. And so by making those connections and building those ecosystems, we get the best of both worlds. The customers can use the platforms they need up to the point where they would be putting data into our database. >>What are some of the customer testimonials they share with you? Can you share some anecdotes, like, wow, that's the best thing I've ever used, this really changed my business, or this is a great tech that's helped me in these other areas? What are some of the soundbites you hear from customers when they're successful? >>Yeah. I mean, I think it ranges.
You've got customers who are just finally being able to do the monitoring of assets at the edge, in the field. We have a customer who has these tunnel boring machines that go deep into the earth to drill tunnels for cars and trains and things like that. They are just excited to be able to stick a database onto those tunnel boring machines, send them into the depths of the earth, and know that when they come out, all of that telemetry at a very high frequency has been safely stored, and then it can very quickly and instantly connect up to their centralized database. So just having that visibility is brand new to them, and that's super important. On the other hand, we have customers who are way beyond the monitoring use case, where they're actually using the historical records in the time series database to, like I think Evan mentioned, forecast things. So for predictive maintenance, being able to pull in the telemetry from the machines, but then also all of that external enrichment data, the metadata, the temperatures, the pressures, who is operating the machine, those types of things, and being able to easily integrate with platforms like Jupyter notebooks or all of those scientific computing and machine learning libraries to be able to build the models, train the models, and then send that information back down to InfluxDB to apply it and detect those anomalies. >>I think that's going to be an area. I personally think that's a hot area, because if you look at AI right now, it's all about training the machine learning algorithms after the fact. So time series becomes hugely important, because now you're thinking, okay, the data matters post time, the first time, and then it gets updated the next time. So it's constant data cleansing, data iteration, data programming. We're starting to see this new use case emerge in the data field. >>Yep. Yeah. I think you agree. The ability to handle those pipelines of data smartly, intelligently, and then to be able to do all of the things you need to do with that data in stream, before it hits your central repository. And we make that really easy for customers. Like Telegraf: not only does it have the inputs to connect up to all of those protocols and the ability to capture and connect up to the partner data, but it also has a whole bunch of capabilities around being able to process that data, enrich it, reformat it, route it, do whatever you need. So at that point you're basically shaping your data in exactly the way you want to, you're routing it to different destinations, and that's not something that has really been in the realm of possibility until this point. >>And when Evan was on, it's great, he's the CEO, so he sees the big picture with customers. He kind of put the package together that said, hey, we've got a system, we've got customers, people want to leverage our product. He's selling too as well. So you have that whole CEO perspective, but he brought up this notion that there are multiple personas involved in kind of the InfluxDB system: architects, developers, and users. Can you talk about that?
Reality as customers start to commercialize and operationalize this from a commercial standpoint, you got a relationship to the cloud. Yep. The edge is there. Yep. The edge is getting super important, but cloud brings a lot of scale to the table. So what is the relationship to the cloud? Can you share your thoughts on edge and its relationship to the cloud? >>Yeah. I mean, I think edge, you know, edges, you can think of it really as like the local information, right? So it's, it's generally like compartmentalized to a point of like, you know, a single asset or a single factory align, whatever. Um, but what people do who wanna pro they wanna be able to make the decisions there at the edge locally, um, quickly minus the latency of sort of taking that large volume of data, shipping it to the cloud and doing something with it there. So we allow them to do exactly that. Then what they can do is they can actually downsample that data or they can, you know, detect like the really important metrics or the anomalies. And then they can ship that to a central database in the cloud where they can do all sorts of really interesting things with it. Like you can get that centralized view of all of your global assets. You can start to compare asset to asset, and then you can do those things like we talked about, whereas you can do predictive types of analytics or, you know, larger scale anomaly detections. >>So in this model you have a lot of commercial operations, industrial equipment. Yep. The physical plant, physical business with virtual data cloud all coming together. What's the future for InfluxDB from a tech standpoint. Cause you got open. Yep. There's an ecosystem there. Yep. You have customers who want operational reliability for sure. I mean, so you got organic <laugh> >>Yeah. Yeah. I mean, I think, you know, again, we got iPhones when everybody's waiting for flying cars. Right. So I don't know. We can like absolutely perfectly predict what's coming, but I think there are some givens and I think those givens are gonna be that the world is only gonna become more hybrid. Right. And then, you know, so we are going to have much more widely distributed, you know, situations where you have data being generated in the cloud, you have data gen being generated at the edge and then there's gonna be data generated sort sort of at all points in between like physical locations as well as things that are, that are very virtual. And I think, you know, we are, we're building some technology right now. That's going to allow, um, the concept of a database to be much more fluid and flexible, sort of more aligned with what a file would be like. >>And so being able to move data to the compute for analysis or move the compute to the data for analysis, those are the types of, of solutions that we'll be bringing to the customers sort of over the next little bit. Um, but I also think we have to start thinking about like what happens when the edge is actually off the planet. Right. I mean, we've got customers, you're gonna talk to two of them, uh, in the panel who are actually working with data that comes from like outside the earth, like, you know, either in low earth orbit or you know, all the way sort of on the other side of the universe. Yeah. And, and to be able to process data like that and to do so in a way it's it's we gotta, we gotta build the fundamentals for that right now on the factory floor and in the mines and in the tunnels. Um, so that we'll be ready for that one. 
>>I think you bring up a good point there because one of the things that's common in the industry right now, people are talking about, this is kind of new thinking is hyper scale's always been built up full stack developers, even the old OT world, Evan was pointing out that they built everything right. And the world's going to more assembly with core competency and IP and also property being the core of their apple. So faster assembly and building, but also integration. You got all this new stuff happening. Yeah. And that's to separate out the data complexity from the app. Yes. So space genome. Yep. Driving cars throws off massive data. >>It >>Does. So is Tesla, uh, is the car the same as the data layer? >>I mean the, yeah, it's, it's certainly a point of origin. I think the thing that we wanna do is we wanna let the developers work on the world, changing problems, the things that they're trying to solve, whether it's, you know, energy or, you know, any of the other health or, you know, other challenges that these teams are, are building against. And we'll worry about that time series data and the underlying data platform so that they don't have to. Right. I mean, I think you talked about it, uh, you know, for them just to be able to adopt the platform quickly, integrate it with their data sources and the other pieces of their applications. It's going to allow them to bring much faster time to market on these products. It's gonna allow them to be more iterative. They're gonna be able to do more sort of testing and things like that. And ultimately it will, it'll accelerate the adoption and the creation of >>Technology. You mentioned earlier in, in our talk about unification of data. Yeah. How about APIs? Cuz developers love APIs in the cloud unifying APIs. How do you view view that? >>Yeah, I mean, we are APIs, that's the product itself. Like everything, people like to think of it as sort of having this nice front end, but the front end is B built on our public APIs. Um, you know, and it, it allows the developer to build all of those hooks for not only data creation, but then data processing, data analytics, and then, you know, sort of data extraction to bring it to other platforms or other applications, microservices, whatever it might be. So, I mean, it is a world of APIs right now and you know, we, we bring a very sort of useful set of them for managing the time series data. These guys are all challenged with. It's >>Interesting. You and I were talking before we came on camera about how, um, data is, feels gonna have this kind of SRE role that DevOps had site reliability engineers, which manages a bunch of servers. There's so much data out there now. Yeah. >>Yeah. It's like reigning data for sure. And I think like that ability to be like one of the best jobs on the planet is gonna be to be able to like, sort of be that data Wrangler to be able to understand like what the data sources are, what the data formats are, how to be able to efficiently move that data from point a to point B and you know, to process it correctly so that the end users of that data aren't doing any of that sort of hard upfront preparation collection storage's >>Work. Yeah. That's data as code. I mean, data engineering is it is becoming a new discipline for sure. And, and the democratization is the benefit. Yeah. To everyone, data science get easier. I mean data science, but they wanna make it easy. Right. <laugh> yeah. They wanna do the analysis, >>Right? Yeah. 
I mean, I think it's a really good point. We try to give our users as many ways as possible to get data in and get data out. We think about it as meeting them where they are. So we have the client libraries that allow them to write to us directly from the applications and the languages they're working in, but then they can also pull it out. And at that point, nobody is going to know the users, the end consumers of that data, better than the people who are building those applications. And so they're building these user interfaces, which are making all of that data accessible for their end users inside their organization. >>Well, Brian, great segment, great insight. Thanks for sharing all the complexities in IoT that you guys help take away with the APIs and assembly and all the system architectures that are changing. Edge is real, cloud is real, mainstream enterprises, and you've got developer traction too, so congratulations. >>Yeah. It's great. >>Well, thanks. Any last word you want to share? >>No, just, please, if you're going to check out InfluxDB, download it, try out the open source, contribute if you can. That's a huge thing. It's part of being in the open source community. But definitely just use it. I think once people use it, they try it out, they'll understand very quickly. >>So open source with developers, enterprise, and edge coming all together. You're going to hear more about that in the next segment, too. Thanks for coming on. Okay. Thanks. When we return, Dave Vellante will lead a panel on edge and data with InfluxDB. You're watching theCUBE, the leader in high tech enterprise coverage. >>We're a startup, we move really fast. We find that InfluxDB can move as fast as us. It's just a great group, very collaborative, very interested in manufacturing, and we see a bright future in working with Influx. My name is Aaron Seley. I'm the CTO at HighByte. HighByte's one of the first companies to focus on manufacturing data and apply the concepts of DataOps: treat that as an asset to deliver to the IT system, to enable applications like overall equipment effectiveness that can help the factory produce better, smarter, faster. Time series data in manufacturing is really important. If you take a piece of equipment, you have the temperature and pressure at the moment that you can look at to see the state of what's going on. Without that context and understanding, you can't do what manufacturers ultimately want to do, which is predict the future. >>InfluxDB represents a new way to store time series data with some more advanced technology and, more importantly, more open technologies. The other thing that Influx does really well is that once the data's in Influx, it's very easy to get out, right? They have a modern REST API and other ways to access the data that would be much more difficult to use for integrations with classic historians. HighByte can serve to model data and aggregate data on the shop floor from a multitude of sources, whether that be OPC UA servers, manufacturing execution systems, ERP, et cetera, and then push that seamlessly into Influx to then be able to run calculations. Manufacturing is changing with Industry 4.0, and what we're seeing is Influx being part of that equation.
Being used to store data off the unified namespace, we recommend InfluxDB all the time to customers that are exploring a new way to share manufacturing data, called the unified namespace, who have open questions around: how do I share this new data that's coming through my UNS or my MQTT broker? How do I store this and be able to query it over time? And we often point to Influx as a solution for that. It's a great brand, it's a great group of people, and it's a great technology. >>Okay. We're now going to go into the customer panel, and we'd like to welcome Angelo Fausti, who's a software engineer at the Vera C. Rubin Observatory, and Caleb McLaughlin, who's a senior spacecraft operations software engineer at Loft Orbital. Guys, thanks for joining us. Folks, you don't want to miss this interview. Caleb, let's start with you. You work for an extremely cool company. You're launching satellites into space. Of course, doing that is highly complex and not a cheap endeavor. Tell us about Loft Orbital and what you guys do to attack that problem. >>Yeah, absolutely. And thanks for having me here, by the way. So Loft Orbital is a company, a Series B startup now, and our mission basically is to provide rapid access to space for all kinds of customers. Historically, if you want to fly something in space, do something in space, it's extremely expensive. You need to book a launch, build a bus, hire a team to operate it, have big software teams, and then eventually worry about a lot of very specialized engineering. What we're trying to do is change that from a super specialized problem with an extremely high barrier of access into an infrastructure problem, so that getting your programs, your mission, deployed on orbit, with access to different sensors, cameras, radios, stuff like that, is almost as simple as deploying a VM in AWS or GCP. >>So that's kind of our mission. And just to give a really brief example of the kind of customer we can serve: there's a really cool company called Totum Labs who is working on building an IoT constellation for the internet of things, basically being able to get telemetry from all over the world. They're the first company to demonstrate indoor IoT, which means you have this little modem inside a container that you can track from anywhere in the world as it's going across the ocean. So it's really little, and they've been able to stay a small startup that's focused on their product, which is that super crazy complicated, cool radio, while we handle the whole space segment for them, which before Loft was really impossible. So that's our mission: providing space infrastructure as a service. We are kind of groundbreaking in this area, and we're serving a huge variety of customers with all kinds of different missions, and obviously generating a ton of data in space that we've got to handle. >>So amazing, Caleb, what you guys do. Now, I know you were lured to the skies very early in your career, but how did you kind of land on this business? >>Yeah, so I guess just a little bit about me. Some people don't necessarily know what they want to do early in their life. For me, I was five years old and I knew I wanted to be in the space industry.
So I started in the Air Force, but I've stayed in the space industry my whole career, and this is actually the fifth space startup that I've been a part of. I kind of started out in satellites, spent some time working in the launch industry on rockets, and now I'm here back in satellites. Honestly, this is the most exciting of the different space startups that I've been a part of. >>Super interesting. Okay. Angelo, let's talk about the Rubin Observatory. Vera C. Rubin, famous woman scientist, galaxy guru. Now, the observatory, you're way up high, you're going to get a good look at the Southern sky. I know COVID slowed you guys down a bit, but no doubt you continued to code away on the software. I know you're getting close. You've got to be super excited. Give us the update on the observatory and your role. >>All right. So yeah, Rubin is a state-of-the-art observatory that is under construction on a remote mountain in Chile. With Rubin, we conduct the Legacy Survey of Space and Time. We are going to observe the sky with an eight meter optical telescope and take a thousand pictures every night with a 3.2 gigapixel camera. And we are going to do that for 10 years, which is the duration of the survey. >>Yeah. Amazing project. Now, you were a doctor of philosophy, so you probably spent some time thinking about what's out there, and then you went out to earn a PhD in astronomy, in astrophysics. So this is something that you've been working on for the better part of your career, isn't it? >>Yeah, that's right. About 15 years. I studied physics in college, then I got a PhD in astronomy, and I worked for about five years on another project, the Dark Energy Survey, before joining Rubin in 2015. >>Yeah. Impressive. So it seems like you both, your organizations, are looking at space from two different angles. One thing you both have in common, of course, is software, and you both use InfluxDB as part of your data infrastructure. How did you discover InfluxDB and get into it? How do you use the platform? Maybe Caleb, you could start. >>Yeah, absolutely. So the first company where I extensively used InfluxDB was a launch startup called Astra. We were in the process of designing our first generation rocket there and testing the engines, pumps, everything that goes into a rocket. When I joined the company, our data story was not very mature. We were collecting a bunch of data in LabVIEW, and engineers were taking that over to MATLAB to process it. At first, that's the way a lot of engineers and scientists are used to working, and people weren't entirely sure that that needed to change. But the nice thing about InfluxDB is that it's so easy to deploy. So our software engineering team was able to get it deployed and up and running very quickly, and then quickly also backport all of the data that we had collected thus far into Influx.
Uh, there was like this aha moment of our engineers who are used to this post process kind of method for dealing with their data where they could just almost instantly easily discover data that they hadn't been able to see before and take the manual processes that they would run after a test and just throw those all in influx and have live data as tests were coming. And, you know, I saw them implementing like crazy rocket equation type stuff in influx, and it just was totally game changing for how we tested. >>So Angelo, I was explaining in my open, you know, you could, you could add a column in a traditional RDBMS and do time series, but with the volume of data that you're talking about, and the example of the Caleb just gave you, I mean, you have to have a purpose built time series database, where did you first learn about influx DB? >>Yeah, correct. So I work with the data management team, uh, and my first project was the record metrics that measured the performance of our software, uh, the software that we used to process the data. So I started implementing that in a relational database. Um, but then I realized that in fact, I was dealing with time series data and I should really use a solution built for that. And then I started looking at time series databases and I found influx B. And that was, uh, back in 2018. The another use for influx DB that I'm also interested is the visits database. Um, if you think about the observations we are moving the telescope all the time in pointing to specific directions, uh, in the Skype and taking pictures every 30 seconds. So that itself is a time series. And every point in that time series, uh, we call a visit. So we want to record the metadata about those visits and flex to, uh, that time here is going to be 10 years long, um, with about, uh, 1000 points every night. It's actually not too much data compared to other, other problems. It's, uh, really just a different, uh, time scale. >>The telescope at the Ruben observatory is like pun intended, I guess the star of the show. And I, I believe I read that it's gonna be the first of the next gen telescopes to come online. It's got this massive field of view, like three orders of magnitude times the Hub's widest camera view, which is amazing, right? That's like 40 moons in, in an image amazingly fast as well. What else can you tell us about the telescope? >>Um, this telescope, it has to move really fast and it also has to carry, uh, the primary mirror, which is an eight meter piece of glass. It's very heavy and it has to carry a camera, which has about the size of a small car. And this whole structure weighs about 300 tons for that to work. Uh, the telescope needs to be, uh, very compact and stiff. Uh, and one thing that's amazing about it's design is that the telescope, um, is 300 tons structure. It sits on a tiny film of oil, which has the diameter of, uh, human hair. And that makes an almost zero friction interface. In fact, a few people can move these enormous structure with only their hands. Uh, as you said, uh, another aspect that makes this telescope unique is the optical design. It's a wide field telescope. So each image has, uh, in diameter the size of about seven full moons. And, uh, with that, we can map the entire sky in only, uh, three days. And of course doing operations everything's, uh, controlled by software and it is automatic. 
There's a very complex piece of software called the scheduler, which is responsible for moving the telescope, and the camera, which is recording 15 terabytes of data every night. >>Hmm. And Angelo, all this data lands in InfluxDB, correct? And what are you doing with all that data? >>Actually, not all of it. We are using InfluxDB to record engineering data and metadata about the observations, like telemetry, events, and commands from the telescope. That's a much smaller data set compared to the images, but it is still challenging because you have some high-frequency data that the system needs to keep up with, and we need to store this data and have it around for the lifetime of the project. >>Got it. Thank you. Okay, Caleb, let's bring you back in. Tell us more about these dishwasher-size satellites. You're kind of using a multi-tenant model. I think it's genius. Tell us about the satellites themselves. >>Yeah, absolutely. So we have some satellites in space already that, as you said, are like dishwasher, mini-fridge kind of size, and we're working on a bunch more that are a variety of sizes, from shoebox to, I guess, a few times larger than what we have today. And we do shoot to have effectively something like a multi-tenant model, where we will buy a bus off the shelf. The bus is what you can think of as the core piece of the satellite, almost like a motherboard, where it's providing the power, it has the solar panels, it has some radios attached to it, and it handles the attitude control, basically steering the spacecraft in orbit. Then we also build in house what we call our payload hub, which has any customer payloads attached and our own kind of edge processing capabilities built into it. >>So we integrate that, we launch it, and those things, because they're in low orbit, are orbiting the earth every 90 minutes. That's seven kilometers per second, which is several times faster than a speeding bullet. So one of the unique challenges of operating spacecraft in low orbit is that generally you can't talk to them all the time. We're managing these things through very brief windows of time where we get to talk to them through our ground sites, either in Antarctica or in the north pole region. >>Talk more about how you use InfluxDB to make sense of this data through all this tech that you're launching into space. >>Previously, when I joined the company, we started off storing all of that, as Angelo did, in a regular relational database. And we found that it was so slow, and the size of our data would balloon over the course of a couple of days to the point where we weren't able to even store all of the data we were getting. So we migrated to InfluxDB to store our time series telemetry from the spacecraft. That's things like power levels, voltages, currents, counts, whatever metadata we need to monitor about the spacecraft. We now store that in InfluxDB, and now we can easily store the entire volume of data for the mission life so far without having to worry about the size bloating to an unmanageable amount.
Like if I need to see, you know, for example, as an operator, I might wanna see how my, uh, battery state of charge is evolving over the course of the year. I can have a plot and an influx that loads that in a fraction of a second for a year's worth of data, because it does, you know, intelligent, um, I can intelligently group the data by, uh, sliding time interval. Uh, so, you know, it's been extremely powerful for us to access the data and, you know, as time has gone on, we've gradually migrated more and more of our operating data into influx. >>You know, let's, let's talk a little bit, uh, uh, but we throw this term around a lot of, you know, data driven, a lot of companies say, oh, yes, we're data driven, but you guys really are. I mean, you' got data at the core, Caleb, what does that, what does that mean to you? >>Yeah, so, you know, I think the, and the clearest example of when I saw this be like totally game changing is what I mentioned before at Astro where our engineer's feedback loop went from, you know, a lot of kind of slow researching, digging into the data to like an instant instantaneous, almost seeing the data, making decisions based on it immediately, rather than having to wait for some processing. And that's something that I've also seen echoed in my current role. Um, but to give another practical example, uh, as I said, we have a huge amount of data that comes down every orbit, and we need to be able to ingest all of that data almost instantaneously and provide it to the operator. And near real time, you know, about a second worth of latency is all that's acceptable for us to react to, to see what is coming down from the spacecraft and building that pipeline is challenging from a software engineering standpoint. >>Um, our primary language is Python, which isn't necessarily that fast. So what we've done is started, you know, in the, in the goal of being data driven is publish metrics on individual, uh, how individual pieces of our data processing pipeline are performing into influx as well. And we do that in production as well as in dev. Uh, so we have kind of a production monitoring, uh, flow. And what that has done is allow us to make intelligent decisions on our software development roadmap, where it makes the most sense for us to, uh, focus our development efforts in terms of improving our software efficiency. Uh, just because we have that visibility into where the real problems are. Um, it's sometimes we've found ourselves before we started doing this kind of chasing rabbits that weren't necessarily the real root cause of issues that we were seeing. Uh, but now, now that we're being a bit more data driven, there we are being much more effective in where we're spending our resources and our time, which is especially critical to us as we scale to, from supporting a couple satellites, to supporting many, many satellites at >>Once. Yeah. Coach. So you reduced those dead ends, maybe Angela, you could talk about what, what sort of data driven means to, to you and your teams? >>I would say that, um, having, uh, real time visibility, uh, to the telemetry data and, and metrics is, is, is crucial for us. We, we need, we need to make sure that the image that we collect with the telescope, uh, have good quality and, um, that they are within the specifications, uh, to meet our science goals. And so if they are not, uh, we want to know that as soon as possible and then, uh, start fixing problems. >>Caleb, what are your sort of event, you know, intervals like? 
>>So I would say that, as of today on the spacecraft, the level of timing that we deal with probably tops out at about 20 Hertz, 20 measurements per second, on things like our gyroscopes. But I think the core point here, the ability to have high precision data, is extremely important for these kinds of scientific applications. And I'll give an example from when I worked on the rocket at Astra. There, our baseline data rate when we ingested data during a test was 500 Hertz, so 500 samples per second. In some cases we would actually need to ingest much higher rate data, even up to 1.5 kilohertz. So extremely high precision data, where timing really matters a lot. And one of the really powerful things about Influx is the fact that it can handle this. >>That's one of the reasons we chose it, because there are times when we're looking at the results of a firing where you're zooming in. I talked earlier about how in my current job we often zoom out to look at a year's worth of data; here you're zooming in to where your screen is occupied by a tiny fraction of a second. And you need to see, same thing as Angelo just said, not just the actual telemetry, which is coming in at a high rate, but the events that are coming out of our controllers. That can be something like, hey, I opened this valve at exactly this time, and we want to have that at micro or even nanosecond precision, so that we know, okay, we saw a spike in chamber pressure at this exact moment; was that before or after this valve opened? That kind of visibility is critical in these kinds of scientific applications, and it's absolutely game changing to be able to see that in near real time, with a really easy way for engineers to visualize this data themselves without having to wait for software engineers to go build it for them. >>Can the scientists do self-serve, or do you have to design and build all the analytics and queries for your scientists? >>Well, from my perspective, that's absolutely one of the best things about Influx, and what I've seen be game changing is that generally I'd say anyone can learn to use Influx. And honestly, most of our users might not even know they're using Influx, because the interface that we expose to them is Grafana, which is a generic open source graphing platform that is very similar to Influx's own Chronograf. What it does is provide a very intuitive UI for building your queries. You choose a measurement, and it shows a dropdown of available measurements. Then you choose the particular field you want to look at, and again, that's a dropdown, so it's really easy for our users to discover. And there are point-and-click options for doing math, aggregations; you can even do predictive kinds of functions, all within the Grafana user interface, which is really just a wrapper around the APIs and functionality that Influx provides.
Uh, as I mentioned before, um, we have the astronomers making their own dashboards because they know what exactly what they, they need to, to visualize. Yeah. I mean, it's all about using the right tool for the job. I think, uh, for us, when I joined the company, we weren't using influx DB and we, we were dealing with serious issues of the database growing to an incredible size extremely quickly, and being unable to like even querying short periods of data was taking on the order of seconds, which is just not possible for operations >>Guys. This has been really formative it's, it's pretty exciting to see how the edge is mountaintops, lower orbits to be space is the ultimate edge. Isn't it. I wonder if you could answer two questions to, to wrap here, you know, what comes next for you guys? Uh, and is there something that you're really excited about that, that you're working on Caleb, maybe you could go first and an Angela, you can bring us home. >>Uh, basically what's next for loft. Orbital is more, more satellites, a greater push towards infrastructure and really making, you know, our mission is to make space simple for our customers and for everyone. And we're scaling the company like crazy now, uh, making that happen, it's extremely exciting and extremely exciting time to be in this company and to be in this industry as a whole, because there are so many interesting applications out there. So many cool ways of leveraging space that, uh, people are taking advantage of. And with, uh, companies like SpaceX and the now rapidly lowering cost, cost of launch, it's just a really exciting place to be. And we're launching more satellites. We are scaling up for some constellations and our ground system has to be improved to match. So there's a lot of, uh, improvements that we're working on to really scale up our control software, to be best in class and, uh, make it capable of handling such a large workload. So >>You guys hiring >><laugh>, we are absolutely hiring. So, uh, I would in we're we need, we have PE positions all over the company. So, uh, we need software engineers. We need people who do more aerospace, specific stuff. So, uh, absolutely. I'd encourage anyone to check out the loft orbital website, if there's, if this is at all interesting. >>All right. Angela, bring us home. >>Yeah. So what's next for us is really, uh, getting this, um, telescope working and collecting data. And when that's happen is going to be just, um, the Lu of data coming out of this camera and handling all, uh, that data is going to be really challenging. Uh, yeah. I wanna wanna be here for that. <laugh> I'm looking forward, uh, like for next year we have like an important milestone, which is our, um, commissioning camera, which is a simplified version of the, of the full camera it's going to be on sky. And so yeah, most of the system has to be working by them. >>Nice. All right, guys, you know, with that, we're gonna end it. Thank you so much, really fascinating, and thanks to influx DB for making this possible, really groundbreaking stuff, enabling value creation at the edge, you know, in the cloud and of course, beyond at the space. So really transformational work that you guys are doing. So congratulations and really appreciate the broader community. I can't wait to see what comes next from having this entire ecosystem. Now, in a moment, I'll be back to wrap up. This is Dave ante, and you're watching the cube, the leader in high tech enterprise coverage. 
>>Welcome. Telegraf is a popular open source data collection agent. Telegraf collects data from hundreds of systems like IoT sensors, cloud deployments, and enterprise applications. It's used by everyone from individual developers and hobbyists to large corporate teams. The Telegraf project has a very welcoming and active open source community. Learn how to get involved by visiting the Telegraf GitHub page, whether you want to contribute code, improve documentation, participate in testing, or just show what you're doing with Telegraf. We'd love to hear what you're building. >>Thanks for watching Moving the World with InfluxDB, made possible by InfluxData. I hope you learned some things and are inspired to look deeper into where time series databases might fit into your environment. If you're dealing with large and/or fast data volumes, you want to scale cost-effectively with the highest performance, and you're analyzing metrics and data over time, time series databases just might be a great fit for you. Try InfluxDB out. You can start with a free cloud account by clicking on the link in the resources below. Remember, all these recordings are going to be available on demand at theCUBE.net and influxdata.com, so check those out, and poke around InfluxData. They are the folks behind InfluxDB and one of the leaders in the space. We hope you enjoyed the program. This is Dave Vellante for theCUBE. We'll see you soon.
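One more illustrative sketch, tying back to Caleb's point about publishing performance metrics from a Python data processing pipeline into InfluxDB to decide where optimization effort should go. Everything here is hypothetical, including the measurement, tag, and bucket names; it simply shows the timing-and-write pattern he described, not Loft Orbital's actual code.

```python
import time

from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

# Placeholder connection details.
client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

def timed_stage(stage_name, func, *args, **kwargs):
    """Run one pipeline stage and record its wall-clock duration in InfluxDB."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    point = (
        Point("pipeline_stage")      # hypothetical measurement name
        .tag("stage", stage_name)
        .tag("environment", "dev")
        .field("duration_ms", elapsed_ms)
    )
    write_api.write(bucket="ops-metrics", record=point)
    return result

# Example usage with a stand-in stage.
frames = timed_stage("decode_telemetry", lambda: list(range(10_000)))
client.close()
```

Recording the same measurement in both dev and production, distinguished only by a tag, is what makes the before-and-after comparisons on a development roadmap straightforward to chart.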

Published Date : May 12 2022



Moving The World With InfluxDB


 

(upbeat music) >> Okay, we're now going to go into the customer panel. And we'd like to welcome Angelo Fausti, who's software engineer at the Vera C Rubin Observatory, and Caleb Maclachlan, who's senior spacecraft operations software engineer at Loft Orbital. Guys, thanks for joining us. You don't want to miss folks, this interview. Caleb, let's start with you. You work for an extremely cool company. You're launching satellites into space. Cause doing that is highly complex and not a cheap endeavor. Tell us about Loft Orbital and what you guys do to attack that problem? >> Yeah, absolutely. And thanks for having me here, by the way. So Loft Orbital is a company that's a series B startup now. And our mission basically is to provide rapid access to space for all kinds of customers. Historically, if you want to fly something in space, do something in space, it's extremely expensive. You need to book a launch, build a bus, hire a team to operate it, have big software teams, and then eventually worry about a lot of very specialized engineering. And what we're trying to do is, change that from a super specialized problem that has an extremely high barrier of access to a infrastructure problem. So that it's almost as simple as deploying a VM in AWS or GCP, as getting your programs, your mission deployed on orbit, with access to different sensors, cameras, radios, stuff like that. So that's kind of our mission. And just to give a really brief example of the kind of customer that we can serve. There's a really cool company called Totum labs, who is working on building an IoT constellation, for Internet of Things. Basically being able to get telemetry from all over the world. They're the first company to demonstrate indoor IoT, which means you have this little modem inside a container. A container that you track from anywhere on the world as it's going across the ocean. So it's really little. And they've been able to stay small startup that's focused on their product, which is that super crazy, complicated, cool radio, while we handle the whole space segment for them, which just, before Loft was really impossible. So that's our mission is, providing space infrastructure as a service. We are kind of groundbreaking in this area, and we're serving a huge variety of customers with all kinds of different missions, and obviously, generating a ton of data in space that we've got to handle. >> Yeah, so amazing, Caleb, what you guys do. I know you were lured to the skies very early in your career, but how did you kind of land in this business? >> Yeah, so I guess just a little bit about me. For some people, they don't necessarily know what they want to do, early in their life. For me, I was five years old and I knew, I want to be in the space industry. So I started in the Air Force, but have stayed in the space industry my whole career and been a part of, this is the fifth space startup that I've been a part of, actually. So I've kind of started out in satellites, did spend some time in working in the launch industry on rockets. Now I'm here back in satellites. And honestly, this is the most exciting of the different space startups that I've been a part of. So, always been passionate about space and basically writing software for operating in space for basically extending how we write software into orbit. >> Super interesting. Okay, Angelo. Let's talk about the Rubin Observatory Vera C. 
Rubin, famous woman scientist, galaxy guru. Now you guys, the observatory is up, way up high, you're going to get a good look at the southern sky. I know COVID slowed you guys down a bit. But no doubt you continue to code away on the software. I know you're getting close. You got to be super excited. Give us the update on the observatory and your role. >> All right. So yeah, Rubin is a state of the art observatory that is under construction on a remote mountain in Chile. And with Rubin we'll conduct the large survey of space and time. We are going to observe the sky with an eight meter optical telescope and take 1000 pictures every night with a 3.2 gigapixel camera. And we're going to do that for 10 years, which is the duration of the survey. The goal is to produce an unprecedented data set, which is going to be about 0.5 exabytes of image data. And from these images we'll detect and measure the properties of billions of astronomical objects. We are also building a science platform that's hosted on Google Cloud, so that the scientists and the public can explore this data to make discoveries. >> Yeah, amazing project. Now, you aren't a Doctor of Philosophy. So you probably spent some time thinking about what's out there. And then you went on to earn a PhD in astronomy and astrophysics. So this is something that you've been working on for the better part of your career, isn't it? >> Yeah, that's right. About 15 years. I studied physics in college, then I got a PhD in astronomy. And I worked for about five years in another project, the Dark Energy Survey, before joining Rubin in 2015. >> Yeah, impressive. So it seems like both your organizations are looking at space from two different angles. One thing you guys both have in common, of course, is software. And you both use InfluxDB as part of your data infrastructure. How did you discover InfluxDB, get into it? How do you use the platform? Maybe Caleb, you can start. >> Yeah, absolutely. So the first company that I extensively used InfluxDB in was a launch startup called Astra. And we were in the process of designing our first generation rocket there and testing the engines, pumps, everything that goes into a rocket. And when I joined the company, our data story was not very mature. We were collecting a bunch of data in LabVIEW, and engineers were taking that over to MATLAB to process it. And at first, that's the way that a lot of engineers and scientists are used to working. And at first, people weren't entirely sure that that needed to change. But the nice thing about InfluxDB is that it's so easy to deploy. So our software engineering team was able to get it deployed and up and running very quickly, and then quickly also backport all of the data that we'd collected thus far into Influx. And what was amazing to see, and it's kind of the super cool moment with Influx, is when we hooked that up to Grafana. Grafana is the visualization platform we use with Influx, because it works really well with it. There was like this aha moment for our engineers who were used to this post process kind of method for dealing with their data, where they could just almost instantly, easily discover data that they hadn't been able to see before. And take the manual processes that they would run after a test and just throw those all in Influx and have live data as tests were coming in. And I saw them implementing crazy rocket equation type stuff in Influx and it just was totally game changing for how we tested.
And things that previously would be like, run a test, then wait an hour for the engineers to crunch the data, and then we run another test with some changed parameters or a changed startup sequence or something like that, became: by the time the test is over, the engineers know what the next step is, because they have this just like instant, game changing access to data. So since that experience, basically everywhere I've gone, every company since then, I've been promoting InfluxDB and using it and spinning it up and quickly showing people how simple and easy it is. >> Yeah, thank you. So Angelo, I was explaining in my open that, you know, you could add a column in a traditional RDBMS and do time series. But with the volume of data that you're talking about in the example that Caleb just gave, you have to have a purpose built time series database. Where did you first learn about InfluxDB? >> Yeah, correct. So I worked with the data management team and my first project was to record metrics that measure the performance of our software, the software that we use to process the data. So I started implementing that in our relational database. But then I realized that in fact, I was dealing with time series data, and I should really use a solution built for that. And then I started looking at time series databases and I found InfluxDB, that was back in 2018. Then I got involved in another project, to record telemetry data from the telescope itself. It's very challenging because you have so many subsystems and sensors producing data. And with that data, the goal is to look at the telescope hardware in real time so we can make decisions and make sure that everything's doing the right thing. And another use for InfluxDB that I'm also interested in, is the visits database. If you think about the observations, we are moving the telescope all the time and pointing to specific directions in the sky and taking pictures every 30 seconds. So that itself is a time series. And every point in the time series, we call that a visit. So we want to record the metadata about those visits in InfluxDB. That time series is going to be 10 years long, with about 1000 points every night. It's actually not too much data compared to the other problems. It's really just the different time scale. So yeah, we have plans on continuing using InfluxDB and finding new applications in the project. >> Yeah, and the speed with which you can actually get high quality images. Angelo, my understanding is, you use InfluxDB, as you said, you're monitoring the telescope hardware and the software, and just say, some of the scientific data as well. The telescope at the Rubin Observatory is like, no pun intended, I guess, the star of the show. And I believe, I read that it's going to be the first of the next gen telescopes to come online. It's got this massive field of view, like three orders of magnitude times the Hubble's widest camera view, which is amazing. That's like 40 moons in an image, and amazingly fast as well. What else can you tell us about the telescope? >> Yeah, so it's really a challenging project from the point of view of engineering. This telescope, it has to move really fast. And it also has to carry the primary mirror, which is an eight meter piece of glass, it's very heavy. And it has to carry a camera, which is about the size of a small car. And this whole structure weighs about 300 tons. For that to work, the telescope needs to be very compact and stiff.
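Angelo's visits series maps naturally onto InfluxDB's point model, one point per visit keyed by its timestamp. Below is a minimal hedged sketch of what recording a visit could look like; the measurement, tag, and field names are illustrative assumptions, not the observatory's actual schema.

```python
# Illustrative sketch only: recording one telescope "visit" as an InfluxDB point.
# Measurement, tag, and field names are assumptions, not Rubin's real schema.
from datetime import datetime, timezone

from influxdb_client import InfluxDBClient, Point, WritePrecision
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

visit = (
    Point("visit")                 # one point per roughly 30-second visit
    .tag("band", "r")              # filter band used for the exposure
    .field("ra_deg", 53.12)        # pointing: right ascension
    .field("dec_deg", -28.10)      # pointing: declination
    .field("exposure_s", 30.0)
    .time(datetime.now(timezone.utc), WritePrecision.MS)
)
write_api.write(bucket="visits", record=visit)
client.close()
```

At roughly 1000 points a night for ten years this stays a small series, which matches Angelo's point that the challenge is the time scale rather than the volume.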
And one thing that's amazing about its design is that the telescope, this 300 ton structure, sits on a tiny film of oil, which has the diameter of a human hair, and that brings an almost zero friction interface. In fact, a few people can move this enormous structure with only their hands. As you said, another aspect that makes this telescope unique is the optical design. It's a wide field telescope. So each image has, in diameter, the size of about seven full moons. And with that we can map the entire sky in only three days. And of course, during operations, everything's controlled by software, and it's automatic. There's a very complex piece of software called the scheduler, which is responsible for moving the telescope and the camera, which will record the 15 terabytes of data every night. >> And Angelo, all this data lands in InfluxDB, correct? And what are you doing with all that data? >> Yeah, actually not. So we're using InfluxDB to record engineering data and metadata about the observations, like telemetry, events and the commands from the telescope. That's a much smaller data set compared to the images. But it is still challenging because you have some high frequency data that the system needs to keep up with, and we need to store this data and have it around for the lifetime of the project. >> Hm. So at the mountain, we keep the data for 30 days. So the observers, they use an InfluxDB instance running there to analyze the data. But we also replicate the data to another instance running at the US data facility, where we have more computational resources and so more people can look at the data without interfering with the observations. Yeah, I have to say that InfluxDB has been really instrumental for us, and especially at this phase of the project where we are testing and integrating the different pieces of hardware. And it's not just the database, right. It's the whole platform. So I like to give this example: when we are doing this kind of task, it's hard to know in advance which dashboards and visualizations you're going to need, right. So what you really need is a data exploration tool. And with tools like Chronograf, for example, having the ability to query and create dashboards on the fly was really a game changer for us. So astronomers, they typically are not software engineers, but they are the ones that know better than anyone what needs to be monitored. And so they use Chronograf and they can create the dashboards and the visualizations that they need. >> Got it. Thank you. Okay, Caleb, let's bring you back in. Tell us more about, you got these dishwasher size satellites that are kind of using a multi tenant model. I think it's genius. But tell us about the satellites themselves. >> Yeah, absolutely. So we have in space some satellites already that, as you said, are like dishwasher, mini fridge kind of size. And we're working on a bunch more that are a variety of sizes, from shoebox size to, I guess, a few times larger than what we have today. And we do shoot to have, effectively, something like a multi tenant model where we will buy a bus off the shelf. The bus is what you can kind of think of as the core piece of the satellite, almost like a motherboard or something, where it's providing the power, it has the solar panels, it has some radios attached to it, it handles the attitude control, basically steers the spacecraft in orbit.
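The 30-day window Angelo describes at the mountain maps onto a bucket retention rule. Here is a hedged sketch of expressing that with the Python client; the bucket and org names are assumptions, and the replication to the US data facility would be a separate instance receiving the same writes, which is not shown.

```python
# Hedged sketch: create a bucket that keeps data for 30 days and then expires it.
# Bucket and org names are placeholders; replication to a second instance is not shown.
from influxdb_client import InfluxDBClient, BucketRetentionRules

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")

thirty_days = BucketRetentionRules(type="expire", every_seconds=30 * 24 * 3600)
client.buckets_api().create_bucket(
    bucket_name="summit-telemetry",
    retention_rules=thirty_days,
    org="my-org",
)
client.close()
```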
And then we build, also in house, what we call our payload hub, which has all the customer payloads attached, and our own kind of edge processing sort of capabilities built into it. And so we integrate that, we launch it, and those things, because they're in low Earth orbit, they're orbiting the Earth every 90 minutes. That's seven kilometers per second, which is several times faster than a speeding bullet. So one of the unique challenges of operating spacecraft in low Earth orbit is that generally you can't talk to them all the time. So we're managing these things through very brief windows of time, where we get to talk to them through our ground sites, either in Antarctica or in the North Pole region. So we'll see them for 10 minutes, and then we won't see them for the next 90 minutes as they zip around the Earth collecting data. So one of the challenges that exists for a company like ours is that you have to be able to make real time decisions operationally, in those short windows, that can sometimes be critical to the health and safety of the spacecraft. And it could be possible that we put ourselves into a low power state in the previous orbit, or something potentially dangerous to the satellite can occur. And so as an operator, you need to very quickly process that data coming in. And not just the live data, but also the massive amounts of data that were collected in what we call the back orbit, which is the time that we couldn't see the spacecraft. >> We got it. So talk more about how you use InfluxDB to make sense of this data from all this tech that you're launching into space. >> Yeah, so previously we started off, when I joined the company, storing all of that, as Angelo did, in a regular relational database. And we found that it was so slow, and the size of our data would balloon over the course of a couple of days to the point where we weren't able to even store all of the data that we were getting. So we migrated to InfluxDB to store our time series telemetry from the spacecraft. So that's things like power levels, voltages, current counts, whatever metadata we need to monitor about the spacecraft, we now store that in InfluxDB. And now we can actually easily store the entire volume of data for the mission life so far, without having to worry about the size bloating to an unmanageable amount. And we can also seamlessly query large chunks of data. Like if I need to see, for example, as an operator, I might want to see how my battery state of charge is evolving over the course of the year, I can have a plot in Influx that loads that in a fraction of a second for a year's worth of data, because, you know, I can intelligently group the data by time interval. So it's been extremely powerful for us to access the data. And as time has gone on, we've gradually migrated more and more of our operating data into Influx. So not only do we store the basic telemetry about the bus and our payload hub, but we're also storing data for our customers, that our customers are generating on board, about things like, you know, one example of a customer that's doing something pretty cool: they have a computer on our satellite, which they can reprogram themselves to do some AI enabled edge compute type capability in space. And so they're sending us some metrics about the status of their workloads, in addition to the basics, like the temperature of their payload, their computer or whatever else.
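Because back orbit telemetry arrives minutes after it was measured, each sample has to be written with its original timestamp rather than the ingest time; InfluxDB indexes on the timestamp you supply, so late arriving data still lands in order. A hedged sketch, where the spacecraft ID, field names, values, and the one-sample-per-second rate are all assumptions for illustration:

```python
# Hedged sketch: batch-writing back-orbit samples with their original timestamps.
# Spacecraft ID, field names, values, and the sample rate are illustrative assumptions.
from datetime import datetime, timedelta, timezone

from influxdb_client import InfluxDBClient, Point, WritePrecision
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

pass_end = datetime.now(timezone.utc)
backlog = []
for i in range(90 * 60):  # one sample per second across a 90-minute back orbit
    t = pass_end - timedelta(seconds=i)
    backlog.append(
        Point("bus_telemetry")
        .tag("spacecraft", "sat-01")
        .field("battery_soc_pct", 87.5)
        .field("bus_voltage_v", 28.1)
        .time(t, WritePrecision.S)
    )

write_api.write(bucket="operations", record=backlog)
client.close()
```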
And we're delivering that data to them through Influx in a Grafana dashboard that they can plot, where they can see not only has this pipeline succeeded or failed, but also where was the spacecraft when this occurred? What was the voltage being supplied to their payload? Whatever they need to see, it's all right there for them, because we're aggregating all that data in InfluxDB. >> That's awesome. You're measuring everything. Let's talk a little bit about, we throw this term around a lot, data driven. A lot of companies say, oh, yes, we're data driven. But you guys really are. I mean, you got data at the core. Caleb, what does that mean to you? >> Yeah, so you know, I think the clearest example of when I saw this be like totally game changing is what I mentioned before at Astra, where our engineers' feedback loop went from a lot of kind of slow researching, digging into the data, to an almost instantaneous one. Seeing the data, making decisions based on it immediately, rather than having to wait for some processing. And that's something that I've also seen echoed in my current role. But to give another practical example, as I said, we have a huge amount of data that comes down every orbit, and we need to be able to ingest all that data almost instantaneously and provide it to the operator in near real time. About a second's worth of latency is all that's acceptable for us to react to, to see what is coming down from the spacecraft. And building that pipeline is challenging from a software engineering standpoint. Our primary language is Python, which isn't necessarily that fast. So what we've done, in the goal of being data driven, is start to publish metrics on how individual pieces of our data processing pipeline are performing into Influx as well. And we do that in production as well as in dev. So we have kind of a production monitoring flow. And what that has done is allow us to make intelligent decisions on our software development roadmap, where it makes the most sense for us to focus our development efforts in terms of improving our software efficiency, just because we have that visibility into where the real problems are. Sometimes we've found ourselves, before we started doing this, kind of chasing rabbits that weren't necessarily the real root cause of issues that we were seeing. But now that we're being a bit more data driven there, we are being much more effective in where we're spending our resources and our time, which is especially critical to us as we scaled from supporting a couple of satellites to supporting many, many satellites at once. >> So you reduce those dead ends. Maybe Angelo, you could talk about what sort of data driven means to you and your team? >> Yeah, I would say that having real time visibility into the telemetry data and metrics is crucial for us. We need to make sure that the images that we collect with the telescope have good quality and that they are within the specifications to meet our science goals. And so if they are not, we want to know that as soon as possible, and then start fixing problems. >> Yeah, so I mean, you think about these big science use cases, Angelo. They are extremely high precision, you have to have a lot of granularity, very tight tolerances. How does that play into your time series data strategy? >> Yeah, so one of the subsystems that produces high volume, high rate data is the structure that supports the telescope's primary mirror.
So on that structure, we have hundreds of actuators that compensate the shape of the mirror for deformations. That's part of our active optics system. So that's really real time. And we have to record these high data rates, and we have requirements to handle data that are at a few hundred hertz. So we can easily configure our database with milliseconds precision, that's for telemetry data. But for events, sometimes we have events that are very close to each other, and then we need to configure the database with higher precision. >> Um hm. >> For example, microseconds. >> Yeah, so Caleb, what are your event intervals like? >> So I would say that, as of today on the spacecraft, the level of timing that we deal with probably tops out at about 20 hertz, 20 measurements per second, on things like our gyroscopes. But I think the core point here of the ability to have high precision data is extremely important for these kinds of scientific applications. And I'll give you an example from when I worked on the rockets at Astra. There, our baseline data rate that we would ingest during a test is 500 hertz, so 500 samples per second. And in some cases, we would actually need to ingest much higher rate data, even up to like 1.5 kilohertz. So extremely, extremely high precision data there, where timing really matters a lot. And one of the really powerful things about Influx is the fact that it can handle this; that's one of the reasons we chose it. Because there's times when we're looking at the results of a firing where you're zooming in. I've talked earlier about how on my current job, we often zoom out to look at a year's worth of data. Here you're zooming in, to where your screen is occupied by a tiny fraction of a second. And you need to see, same thing as Angelo just said, not just the actual telemetry, which is coming in at a high rate, but the events that are coming out of our controllers. So that can be something like, hey, I opened this valve at exactly this time. And we want to have that at micro or even nanosecond precision, so that we know, okay, we saw a spike in chamber pressure at this exact moment, was that before or after this valve opened? That kind of visibility is critical in these kinds of scientific applications and absolutely game changing, to be able to see that in near real time, and with a really easy way for engineers to be able to visualize this data themselves without having to wait for us software engineers to go build it for them. >> Can the scientists do self serve? Or do you have to design and build all the analytics and queries for scientists? >> I think, from my perspective, that's absolutely one of the best things about Influx, and what I've seen be game changing, is that generally, I'd say anyone can learn to use Influx. And honestly, most of our users might not even know they're using Influx, because the interface that we expose to them is Grafana, which is a generic, open source graphing library that is very similar to Influx's own Chronograf. >> Sure. >> And what it does is provide a very intuitive UI for building your query. So you choose a measurement, and it shows a drop down of available measurements, and then you choose the particular field you want to look at. And again, that's a drop down. So it's really easy for our users to discover it. And there's kind of point and click options for doing math, aggregations. You can even do, like, predictions, all within Grafana.
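Under that point and click builder, what ultimately runs against the database is a Flux query. The sketch below shows the kind of statement such a builder produces, downsampling a year of battery data into daily means and issuing it through the Python client; the bucket, measurement, and field names are assumptions carried over from the earlier sketches, not Loft Orbital's real schema.

```python
# Hedged sketch: the sort of Flux a dashboard query builder emits, run via the Python client.
# Downsamples one year of battery state of charge into daily means so the plot stays light.
from influxdb_client import InfluxDBClient

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")

flux = '''
from(bucket: "operations")
  |> range(start: -1y)
  |> filter(fn: (r) => r._measurement == "bus_telemetry" and r._field == "battery_soc_pct")
  |> aggregateWindow(every: 1d, fn: mean, createEmpty: false)
'''

for table in client.query_api().query(flux):
    for record in table.records:
        print(record.get_time(), record.get_value())

client.close()
```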
The Grafana user interface is really just a wrapper around the APIs and functionality that Influx provides. So yes, absolutely, that's been the most powerful thing about it, is that it gets us out of the way, us software engineers, who may not know quite as much as the scientists and engineers that are closer to the interesting math. And they build these crazy dashboards that I'm just like, wow, I had no idea you could do that. I had no idea that that is something that you would want to see. And absolutely, that's the most empowering piece. >> Yeah, putting data in the hands of those who have the context, the domain experts, is key. Angelo, is it the same situation for you? Is it self serve? >> Yeah, correct. As I mentioned before, we have the astronomers making their own dashboards, because they know exactly what they need to visualize. And I have an example just from last week. We had an engineer at the observatory that was building a dashboard to monitor the cooling system of the entire building. And he was familiar with InfluxQL, which was the primary query language in version one of InfluxDB. And that was really a challenge, because he had all the data spread across multiple InfluxDB measurements. And he was doing one query for each measurement and was not able to produce what he needed. But that's the perfect use case for Flux, which is the new data scripting language that InfluxData developed and introduced as the main language in version two. And so with Flux, he was able to combine data from multiple measurements and summarize this data in a nice table. So yeah, having a more flexible and powerful language also allows you to make better visualizations. >> So Angelo, where would you be without a time series database, that technology generally, maybe specifically InfluxDB, as one of the leading platforms? Would you be able to do this? >> Yeah, it's hard to imagine doing what we are doing without InfluxDB. And I don't know, perhaps it would be just a matter of time to rediscover InfluxDB. >> Yeah. How about you, Caleb? >> Yeah, I mean, it's all about using the right tool for the job. I think for us, when I joined the company, we weren't using InfluxDB and we were dealing with serious issues of the database growing to an incredible size extremely quickly. And even querying short periods of data was taking on the order of seconds, which is just not possible for operations. So if you're dealing with large volumes of time series data, a time series database is the right tool for the job, and Influx is a great one for it. So, yeah, it's absolutely required for this kind of data; there is not really any other option. >> Guys, this has been really informative. It's pretty exciting to see how the edge is mountain tops, low Earth orbits. Space is the ultimate edge, isn't it? I wonder if you could answer two questions to wrap here. What comes next for you guys? And is there something that you're really excited about, that you're working on? Caleb, maybe you could go first and then Angelo you could bring us home. >> Yeah, absolutely. So basically, what's next for Loft Orbital is more, more satellites, a greater push towards infrastructure, and really making, well, our mission is to make space simple for our customers and for everyone. And we're scaling the company like crazy now, making that happen. It's an extremely exciting time to be in this company and to be in this industry as a whole.
Because there are so many interesting applications out there, so many cool ways of leveraging space that people are taking advantage of, and with companies like SpaceX now rapidly lowering the cost of launch, it's just a really exciting place to be in. And we're launching more satellites. We're scaling up for some constellations, and our ground system has to be improved to match. So there are a lot of improvements that we are working on to really scale up our control systems, to be best in class and make them capable of handling such large workloads. So, yeah, what's next for us is just really 10x-ing what we are doing. And that's extremely exciting. >> And anything else you are excited about? Maybe something personal? Maybe, you know, a tidbit you want to share. Are you guys hiring? >> We're absolutely hiring. So, we've got positions all over the company. So we need software engineers. We need people who do more aerospace specific stuff. So absolutely, I'd encourage anyone to check out the Loft Orbital website, if this is at all interesting. Personal-wise, I don't have any interesting personal things that are data related. But my current hobby is sea kayaking, so I'm working on becoming a sea kayaking instructor. So if anyone likes to go sea kayaking out in the San Francisco Bay area, hopefully I'll see you out there. >> Love it. All right, Angelo, bring us home. >> Yeah. So what's next for us is, we're getting this telescope working and collecting data, and when that happens, it's going to be just a deluge of data coming out of this camera. And handling all that data is going to be really challenging. Yeah, I might not be here for that, but I'm looking forward to it. Like, for next year we have an important milestone, which is our commissioning camera, which is a simplified version of the full camera, is going to be on sky, and so most of the system has to be working by then. >> Any cool hobbies that you are working on or any side project? >> Yeah, actually, during the pandemic I started gardening. And I live here in Tucson, Arizona. It gets really challenging during the summer because of the lack of water, right. And so, we have an automatic irrigation system at the farm and I'm trying to develop a small system to monitor the irrigation and make sure that our plants have enough water to survive. >> Nice. All right guys, with that we're going to end it. Thank you so much. Really fascinating, and thanks to InfluxDB for making this possible. Really groundbreaking stuff, enabling value at the edge, in the cloud, and of course beyond, in space. Really transformational work that you guys are doing. So congratulations, and I really appreciate the broader community. I can't wait to see what comes next from this entire ecosystem. Now in a moment, I'll be back to wrap up. This is Dave Vellante. And you are watching theCUBE, the leader in high tech enterprise coverage. (upbeat music)

Published Date : Apr 21 2022



Day One Kickoff | OpenSource Summit 2017


 

(soft rock music) >> Announcer: Live from Los Angeles, it's theCUBE. Covering Open Source Summit North America 2017. Brought to you by the Linux Foundation and Red Hat. >> Hello everyone, welcome to a special Cube coverage here in Los Angeles, California for The Linux Foundation's Open Source Summit in North America. I'm John Furrier, co-host of The Cube. This week I'll be co-hosting with Jeff Frick and Stu Miniman, who will be here shortly. He's out getting data from the keynotes and scouring the community for information. Two days of coverage lined up here. Open source is changing the world. More than ever, open source is continuing to accelerate. Over 23 million developers now actively programming with open source. Where the world economy is now based on open source, relies on open source, and where open source and code is changing culture. Jeff, we had a great keynote from the Linux Foundation open source community, and really this is an accumulation of many, many years of coverage for us in the developer community. Kind of sitting above all the different communities like Stack Overflow, all the different source foundational communities: Open Stack Summit, Kubernetes, KubeCon, now CNCF, a variety of other shows, and obviously industry shows. And this is now, we're seeing where open source is becoming so mainstream on a global scale, we're seeing something unprecedented in the history of the computer industry, and that is the role of open source in society. And I think the number one message we're seeing is that the Linux software has been around for 25 plus years. Linus Torvalds was on stage today kind of like reminiscing. He's been Time Man of the Year, he's won the Nobel Prize in Computer Science, the Millennial Award I think it's called. Essentially the top award. 17th most important person in this decade. Linux is now a main force. People are relying on open source, and then look no further than the Equifax hack that has impacted 150 plus million people in terms of their potential identity fraud out there. It's from open source software, so you're starting to see the reliance on open source, where a sustainable ecosystem is continuing to grow, but security is a concern, and which projects to join. There's so much action, I call it open bar in open source. There's so much goodness flowing in from Google, IBM, you name the companies out there. People are being paid to learn and write code at this point in history. This is a historic moment for the open source community. As society starts to be molded by the shape of code, in the keynote they call it a Do-Acracy, for doers and builders who are changing democracy on a global scale. This is the big theme, and obviously a slew of announcements on a project basis: certification for Kubernetes, new people joining the CNCF, and a variety of different projects. But certainly from our standpoint at theCUBE, we covered a lot of the game over this past eight years. Certainly the Cloud and big data, and the software ecosystem. Software-defined Data Center to software eating the world, Data Science eating the world. This is only going to continue with things like Blockchain, virtual reality. And as fake news and bot networks in the cloud continue to change the notion of what the source is, not just source code, source of information. More than ever, the role of communities will play a front and center role in all of this.
>> Yeah, I think that's as big of a deal as the software piece, John, is the role of communities that open source creates. And it's a different way of thinking about things. It's a different way of trying to get more innovation. It's acknowledging that the smartest people aren't necessarily in your four walls. So it's really an attitude, but I want to get your take 'cause there's a couple models of stewardship in the open source world. We're here at Open Source Summit in L.A., a Linux Foundation event. Linux Foundation is taking on more and more of the stewardship of many of these projects, kind of bringing it under one roof. We see another model where the stewardship is kind of driven by one particular company, right, that's trying to build a commercial business around an open source stack, but there's a couple companies that have become almost the de facto steward for a new and evolving open source space. How do you see the pros and the cons of those two models? Ya know, it's great if you got a great steward, it's maybe not so great if the steward is not so terrific and you get a conflict between the steward of the technology and the actual open source project. >> Well, Jeff, and this is the fundamental question on everyone's mind here, as we continue to see the communities grow, and also the scale out of communities as well as the number of overall lines of code. So a couple of key things, one is: we call it the ruling class. That's the elephant in the room here at the show. We see it in politics, identity politics shaping our national level and certainly on a global scale. China blocking all blockchain ICOs and all virtual currencies as of today. You're starting to see the intersection of geopolitics with code. Where the notion of a democracy, or democratization, or do-acracy, as one of the speakers has called it. You can think of code, lines of code, as a vote. You write a line of code, that's a vote into an ecosystem. And we're starting to see this notion of distributed labor, distributed control changing the face of capitalism. Ya know, it's really happening, and the value that corporations are creating in this new model is a real dynamic. And really what's happening is the change from a ruling class, even in the software world. The success of open source has always been based upon self-governance. Self-governance implies a group collective that manages and approves things. That group collective, some would argue, has not been inclusive over the years. Certainly the role of women in tech has been an issue. And so what you have developing is the potential for a ruling class of what shapes the future culture. Certainly there's a no-brainer with women in tech that there should be more women in tech, because half the people in the world are women. They're users of software. Software is going to be relied on by all aspects of our world, not just on Earth but also in space. So, the notion of ruling class is changing and the inclusion is a huge deal. Onboarding new people. Building on individual successes, and building it together as a group relies on inclusion. It relies on inclusion of people, and requires inclusion in how the self-governance goes forward. And again, this is a major concept in this world as it evolves because, like I said, open source is relied on, people are leaning on it at a tier one level. Software that's powering the telescope in the North Pole, in the Antarctic, to space stations, all use Linux. And this is, again, what we're seeing.
Getting technology in the hands so people can use code to shape culture. That is ultimately a big thing, we're at a tipping point right now, were at an inflection point, whatever you want to call it. Open source is continuing to grow, and that culture-shaping notion of code equals culture, is really what it's all about, and the role of community is more important than ever. And inclusion is the number one factor in my opinion. >> The other interesting thing to get your take, John, is Android. So Linux has been around for a long time, everybody knows about Linux, and there was lots of flavors and it all kind of aggregated. Android is really growing as a significant factor, and I think it was announced here that Samsung has now joined the project. And there's a really interesting little gizmo now that you can take your Samsung phone, stick it in a docking station, and have it power a big giant screen and a keyboard. And so, ya know, as Android has developed as the power in the handheld devices, it's closer and closer, it's not surpassing what we have in these things. It's another big kind of shot in the arm towards the open source ecosystem that really wasn't as significant as it is today. >> Well I mean the Android Operating System is again, just an operating system in the minds of the tech world. Obviously consumers use it, device, huge market share iOS Android and even other operating systems. Who knows, maybe it'll be the year of Linux on the phone, at some point. But you're starting to see software powering devices. This is the internet of things phenomenon. This is where you start to see trends that build out of that notion, like Blockchain, like A.I. are going to start impacting lives. And that's one thing that Linus Torvalds was saying on stage was, the most rewarding thing in his career with all the accolades aside; the fact that he's had an impact on people's lives has been the number one thing that motivates him. That's what motivates most people. So I would say that the Android significance is one of pure numbers. More market share, more penetration for the user experience. And the user experience is a cultural issue. Back to culture equals code. And, inclusively powering everyone to get involved and be part of it, either as a user or a participant in the community or a coder, really is about deciding the future, and if people do not get involved and are not included, then the ruling class will decide what's best for the culture, and that is not the theme here today. The theme here in open source for the next level is letting the code and the technologists in an open collaborative self-governing way be in communities, be inclusive and shape the culture, letting the code shape the culture. And Android, again, is another straw in the camel's back that allows for more penetration and more influence. More relevance, and continued relevance of technology. Providers, coders, communities and certainly individuals. And again, collective intelligence is a group phenomenon. That is a community powered theme. That is what's going on here and again, this is to me, is very radical disruption to the global society. >> Get your take John, 'cause then you get kind of forking and things kind of move and groove, it's kind of like a river, finds another path, right. And you had the container and docker really drove a lot of activation on the container side. Google comes out strong with Cooper Netty's, another open source project that we just heard at the VMworld a week ago. 
Pivotal get on stage with Michael Dell and Pat Gelsinger talking about kind of a new derivation that they're kicking out that's not Cooper Netty's. I forget what it's called, a different, cube-something >> John: PKS. >> PKS. >> John: A little container service. >> Continues to evolve and kind of fork. So what's your take on kind of how these things continue to morph. >> Well that's a good point, I mean you're talking about vendors in industry. Industry is a term that they use here it's kind of a polite term for saying companies with a vol for capitalism. And capitalism, one of the factors involved in what's going on here: corporate value is not a bad thing. But capitalism driving the culture is not what it wants. Distributed labor, distributed control, changing the face and capitalism is about the role of open source. So there's a role for industry and corporations. The issue is that as vendors, in the old model, which is put stuff out there, control the standards bodies and influence the industry through their proprietary mechanisms. That's changed and they don't have the proprietary nature but they can try to use their muscle and money. That's not happening anymore, and I think forking, as you mentioned, the ability to take a piece of code and build on it, whether it's a framework or libraries out there. And writing custom code is what Jim Zemlin was talking about with us is the code sandwich. That 90 percent of the software out there is open source and only ten percent is highly differentiated. That is the programming model. So, to me I think forking is a wonderful democracy dynamic in open source. If you don't like it, you can fork it. And if it doesn't make it, then the Do-Acracy voted with their code. So, this a term you call voting with your code. We can use the term in marketing called people vote with their wallet, vote with their feet. In communities, in open source they vote with their code. So to me, forking if a good thing that provides great opportunity for innovation. The issue of vendors pushing stuff out there is what I call the calling the bullshit factor. Communities that are vibrant and sustainable they can call bullshit on this right away. So, companies can't operate on the old model, they have to ingratiate in, they have to make real contribution, and they have to be community citizens. Otherwise you're going to get called out for pushing their vendorware. And that is interesting, and I'm not saying that they are doing that but Pivotal is a great example. Ya know, Pivotal put out a pretty good service, makes Cooper Netty's manageable, Google Cloud engines tied directly to it. So any updates coming from the Google Cloud engine gets updated into Pivotal, that's the value to users. If it sucks, if it doesn't work well, people won't use it. So, voting with your code, voting with your feet, is what people will do. So there's now a new level of triangulation or a heat shield if you will from vendor dominance, throwing their muscle around and even Microsoft is here with Linux. It's a huge testament to the success of Linux, and that's really what it's all about. >> Yeah, Microsoft is here, Intel is here. A lot of big companies are here and a lot of, in the early days, people had issues with the big companies coming in. But, clearly they're a huge part of the ecosystem, they write big checks, they help fund nice events like this. So the last question for you John, before we get into it: Two days of wall to wall coverage, what are you looking for? 
What are some of the questions that you've got on top of your mind that we'd hope to get some answers over the next couple weeks, or couple days, excuse me. >> Well I saw a great quote up on stage, was called May The Source Be With You. And, it was kind of a Star Wars reference: May the force be with, may the source code be with you, if you will. I'm looking for things that changed people's lives, 'cause the theme in open source now is the reliance of code in all aspects of global life here on earth and in space now as we see it. That the quality of life for society depends on open source. And again, 90 percent of most great software is written in open source, ten percent is differentiated and unique. That's the model they call the code sandwich. It's easy to code, it's easier to get involved. There's more communities that are robust and vibrant. If it impacts the quality of life, so that's one thing. The second thing I'm looking for is, we're looking at some of these new future trends and I've been really thinking a lot about lately as you know in theCUBE, is the role of Blockchains and these really disrupted technologies. We've started to see the power of the user in communities where there's technologies empowering the individual at the same time creating a group dynamic where the groups can build. So, individual success can be part of something that contributes to a group that can build on top of it. That's an open source flywheel that works great. I'm looking for Blockchain, I'm looking for those new technologies that are going to be in that vein. And of course, the outcome is: Does it impact lives, does it make the quality of life better? >> Alright. Well you heard it there, we'll be here for two days of wall to wall coverage. We're at the Open Source Summit North America in L.A. It's pretty funny, right next to Staples Center. John, I don't think we've ever been right downtown L.A. You're watching theCUBE, we'll be back with our next guest after this short break, thanks for watching. (light electronic music)

Published Date : Sep 11 2017



Reynold Xin, Databricks - #Spark Summit - #theCUBE


 

>> Narrator: Live from San Francisco, it's theCUBE, covering Spark Summit 2017. Brought to you by Databricks. >> Welcome back, we're here at theCUBE at Spark Summit 2017. I'm David Goad here with George Gilbert. George. >> Good to be here. >> Thanks for hanging with us. Well here's the other man of the hour here. We just talked with Ali, the CEO at Databricks, and now we have the Chief Architect and co-founder at Databricks, Reynold Xin. Reynold, how are you? >> I'm good. How are you doing? >> David: Awesome. Enjoying yourself here at the show? >> Absolutely, it's fantastic. It's the largest Summit. There are a lot of interesting things, a lot of interesting people I get to meet. >> Well I know you're a really humble guy, but I had to ask Ali what I should ask Reynold when he gets up here. Reynold is one of the biggest contributors to Spark. And you've been with it for a long time, right? >> Yes, I've been contributing to Spark for about five or six years, and that's probably the most commits to the project, and lately I'm working more with other people to help design the roadmap for both Spark and Databricks. >> Well let's get started talking about some of the new developments that maybe our audience at theCUBE hasn't heard in the keynote this morning. What are some of the most exciting new developments? >> So, I think in general if we look at Spark, there are three directions I would say we're doubling down on. The first direction is deep learning. Deep learning is extremely hot and it's very capable, but as we alluded to earlier in a blog post, deep learning has reached sort of a mass-production point in which it shows tremendous potential but the tools are very difficult to use. And we are hoping to democratize deep learning and do what Spark did to big data, to deep learning, with this new library called Deep Learning Pipelines. What it does is integrate different deep learning libraries directly in Spark, and it can actually expose models in SQL. So even the business analysts are capable of leveraging that. So that's one area, deep learning. The second area is streaming. Streaming, again, I think a lot of customers have aspirations to actually shorten the latency and increase the throughput in streaming. So the structured streaming effort is going to be generally available, and last month alone on the Databricks platform, I think our customers processed three trillion records using structured streaming. And we also have a new effort to actually push down the latency all the way to the millisecond range, so you can really do blazingly fast streaming analytics. And last but not least is the SQL data warehousing area. Data warehousing, I think, is a very mature area outside of the big data point of view, but from a big data one it's still pretty new, and there are a lot of use cases popping up there. And in Spark, with approaches like the CBO, and also what we're doing here in the Databricks Runtime with DBIO, we're actually substantially improving the performance and the capabilities of data warehousing features. >> We're going to dig in to some of those technologies here in just a second with George. But have you heard anything here so far from anyone that's changed your mind, maybe, about what to focus on next? >> So, one thing I've heard from a few customers is actually visibility and debuggability of big data jobs.
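To ground the cost-based optimizer point from the exchange above, here is a minimal PySpark sketch of how statistics collection for Spark's CBO looks in open source Spark. The table and column names are illustrative, not from the interview, and the proprietary DBIO layer mentioned above is not reproduced here.

```python
from pyspark.sql import SparkSession

# Minimal sketch of feeding Spark's cost-based optimizer (CBO) with table statistics.
# Table and column names are illustrative placeholders.
spark = (SparkSession.builder
         .appName("cbo-sketch")
         .config("spark.sql.cbo.enabled", "true")  # CBO is off by default in Spark 2.2
         .getOrCreate())

# Register a small managed table so ANALYZE TABLE has something to work with.
(spark.range(0, 100000)
      .selectExpr("id % 1000 AS customer_id", "cast(rand() * 100 AS int) AS amount")
      .write.mode("overwrite").saveAsTable("sales"))

# Table-level statistics (row count, size) used for join planning.
spark.sql("ANALYZE TABLE sales COMPUTE STATISTICS")

# Column-level statistics (min/max, distinct counts) used for selectivity estimates.
spark.sql("ANALYZE TABLE sales COMPUTE STATISTICS FOR COLUMNS customer_id, amount")

# Subsequent queries over `sales` can now be planned with CBO-driven estimates.
spark.sql("SELECT customer_id, sum(amount) FROM sales GROUP BY customer_id").show(5)
```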
So many of them are fairly technical engineers, and some of them are less sophisticated engineers, and they have written jobs and sometimes the job runs slow. And so the performance engineer in me would think, how do I make the job run fast? A different way to actually solve that problem is, how can we expose the right information so the customer can actually understand and figure it out themselves: this is why my job is slow, and this is how I can tweak it to make it faster. Rather than giving people the fish, you actually give them the tools to fish. >> If you can call that bugability. >> Reynold: Yeah, debuggability. >> Debuggability. >> Reynold: And visibility, yeah. >> Alright, awesome, George. >> So, let's go back and unpack some of those kind of juicy areas that you identified. On deep learning, you were able to distribute, if I understand things right, the predictions. You could put models out on a cluster, but the really hard part, the compute-intensive stuff, was training across a cluster. And so Deeplearning4j and I think Intel's BigDL were written for Spark to do that. But with all the excitement over some of the new frameworks, are they now at the point where they are as good citizens on Spark as they are on their native environments? >> Yeah so, this is a very interesting question. Obviously a lot of other frameworks are becoming more and more popular, such as TensorFlow, MXNet, Theano, Keras, and so on. What the Deep Learning Pipelines library does is actually expose all these single-node deep learning tools, which are highly optimized for, say, GPUs or CPUs, to be available as an estimator or a module in a pipeline of the machine learning pipeline library in Spark. So now users can actually leverage Spark's capability to, for example, do hyperparameter tuning. So when you're building a machine learning model, it's fairly rare that you just run something once and you're good with it. You usually have to fiddle with a lot of the parameters. For example, you might run over a hundred experiments to actually figure out what is the best model I can get. This is where Spark really shines. When you combine Spark with some deep learning library, be it BigDL, MXNet, or TensorFlow, you could be using Spark to distribute that training and then do cross validation on it, so you can actually find the best model very quickly. And Spark takes care of all the job scheduling, all the fault-tolerance properties, and how you read data in from different data sources. >> And without my dropping too much into the weeds, there was a version of that where Spark wouldn't take care of all the communications. It would maybe distribute the models and then do some of the averaging of what was done out on the cluster. Are you saying that all that now can be managed by Spark? >> In that library, Spark will be able to actually take care of picking the best model out of it. And there are different ways you can design how you define the best. The best could be some average of different models. The best could be just picking one out of them. The best could be maybe a tree of models that you classify on. >> George: And that's a hyperparameter configuration choice? >> So that is actually built-in functionality in Spark's machine learning pipeline. And what we're doing now is you can actually plug all those deep learning libraries directly into that as part of the pipeline to be used.
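As a rough sketch of the hyperparameter tuning and cross-validation pattern described above, here is what the Spark ML CrossValidator workflow looks like, with a plain LogisticRegression standing in for a deep learning estimator; the toy DataFrame and column names are assumptions for the example.

```python
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import MulticlassClassificationEvaluator
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

# Toy training data; in practice this would be a large DataFrame read from storage.
train_df = spark.createDataFrame(
    [(0.1, 1.2, 0.7, 0.0), (1.1, 0.3, 2.4, 1.0), (0.2, 1.8, 0.1, 0.0),
     (2.0, 0.1, 1.5, 1.0), (0.3, 1.5, 0.2, 0.0), (1.7, 0.2, 2.1, 1.0),
     (0.2, 1.1, 0.4, 0.0), (1.9, 0.4, 1.8, 1.0)],
    ["f1", "f2", "f3", "label"])

# Assemble raw columns into the single vector column Spark ML estimators expect.
assembler = VectorAssembler(inputCols=["f1", "f2", "f3"], outputCol="features")
lr = LogisticRegression(featuresCol="features", labelCol="label")
pipeline = Pipeline(stages=[assembler, lr])

# Candidate hyperparameters: Spark schedules one fit per combination and fold,
# which is the "run a hundred experiments" pattern described above.
grid = (ParamGridBuilder()
        .addGrid(lr.regParam, [0.01, 0.1, 1.0])
        .addGrid(lr.elasticNetParam, [0.0, 0.5])
        .build())

cv = CrossValidator(estimator=pipeline,
                    estimatorParamMaps=grid,
                    evaluator=MulticlassClassificationEvaluator(
                        labelCol="label", metricName="accuracy"),
                    numFolds=2)

# Spark distributes the fits, scores each candidate, and keeps the best one.
best_model = cv.fit(train_df).bestModel
```

The same mechanism is what lets a deep learning estimator plugged into the pipeline be tuned and cross-validated without hand-written job scheduling.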
Another thing, maybe, just to add. >> Yeah, yeah. >> Another really cool functionality of the Deep Learning Pipelines library is transfer learning. So as you said, deep learning takes a very long time, it's very computationally demanding, and it takes a lot of resources and expertise to train. But with transfer learning, what we allow the customers to do is take an existing deep learning model that was trained on a different domain, then retrain it on a very small amount of data very quickly and adapt it to a different domain. That's sort of how the demo on the James Bond car works. So there is a general image classifier, and we retrain it on probably just a few thousand images, and now we can actually detect whether a car is James Bond's car or not. >> Oh, and the implications there are huge, which is you don't have to have huge training data sets for modifying a model of a similar situation. I want to, in the time we have, there's always been this debate about whether Spark should manage state, whether it's a database or a key-value store. Tell us how the thinking about that has evolved, and then how the integration interfaces for achieving that have evolved. >> One of the, I would say, advantages of Spark is that it's unbiased and works with a variety of storage systems, be it Cassandra, be it HBase, be it HDFS, be it S3. There is a metadata management functionality in Spark, which is the catalog of tables that customers can define, but the actual storage sits somewhere else. And I don't think that will change in the near future, because we do see that the storage systems have matured significantly in the last few years, and I just wrote a blog post last week about the advantages of S3 over HDFS, for example. The storage price is being driven down by almost a factor of 10X when you go to the cloud. I just don't think it makes sense at this point to be building storage systems for analytics. That said, I think there's a lot of building on top of existing storage systems. There are actually a lot of opportunities for optimization in how you can leverage the specific properties of the underlying storage system to get to maximum performance. For example, how are you doing intelligent caching, and how do you start thinking about building indexes against the data that's stored for scan workloads. >> With Tungsten, you take advantage of the latest hardware, and we're getting more memory-intensive systems, and now the Catalyst Optimizer has a cost-based optimizer, or will. Can you change how you go about knowing what data you're managing in the underlying system and therefore achieve a tremendous acceleration in performance? >> This is actually one area we invested in with the DBIO module as part of Databricks Runtime, and what DBIO does, a lot of this is still in progress, but for example, we're adding some form of indexing capability to the system so we can quickly skip and prune out all the irrelevant data when the user is doing simple point look-ups, or if the user is doing a scan-heavy workload with some predicates. That actually has to do with how we think about the underlying data structure. The storage system is still the same storage system, like S3, but we're actually adding indexing functionalities on top of it as part of DBIO. >> And so what would be the application profiles? Is it just for the analytic queries, or can you do the point look-ups and updates in that sort of scenario too? >> So it's interesting you're talking about updates.
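Returning to the transfer learning demo described above, a sketch of that workflow, assuming the sparkdl (Deep Learning Pipelines) API roughly as it was documented around this time, might look like the following; the image directories and labels are placeholders, not the actual demo data.

```python
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from pyspark.sql.functions import lit
from sparkdl import DeepImageFeaturizer, readImages  # Deep Learning Pipelines library

# Load two small labeled image sets; directory paths are placeholders.
target_car_df = readImages("/data/images/bond_car").withColumn("label", lit(1))
other_cars_df = readImages("/data/images/other_cars").withColumn("label", lit(0))
train_df = target_car_df.union(other_cars_df)

# Transfer learning: reuse a pre-trained ImageNet network (InceptionV3) as a fixed
# feature extractor and train only a lightweight classifier on top of its features.
featurizer = DeepImageFeaturizer(inputCol="image", outputCol="features",
                                 modelName="InceptionV3")
classifier = LogisticRegression(maxIter=20, regParam=0.05, labelCol="label")
model = Pipeline(stages=[featurizer, classifier]).fit(train_df)
```

The heavy lifting sits in the pre-trained network; only the logistic regression stage is fit on the new images, which is why a few thousand examples, or fewer, can be enough.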
Updates are another thing that we've got a lot of feature requests on. We're actively thinking about how we will support update workloads. Now, that said, I just want to emphasize that for both use cases, doing point look-ups and updates, we're still talking about the context of an analytic environment. So we would be talking about, for example, maybe bulk updates or low-throughput updates, rather than transactional updates in which every time you swipe a credit card, some record gets updated. That probably belongs more on transactional databases like Oracle or MySQL even. >> What about when you think about people who started out with Spark on prem, and they realize they're going to put much more of their resources in the cloud, but with IIoT, industrial IoT type applications, they're going to have Spark maybe in a gateway server on the edge? What do you think that configuration looks like? >> Really interesting, it's kind of two questions maybe. The first is the hybrid on-prem, cloud solution. Again, one of the nice advantages of Spark is the decoupling of storage and compute. So when you want to move, for example, workloads from on prem to the cloud, the one you care most about is probably actually the data, 'cause the compute, it doesn't really matter that much where you run it, but data is the one that's hard to move. We do have customers that are leveraging Databricks in the cloud but actually reading data directly from on prem, relying on the caching solution we have that minimizes the data transfer over time. And that is one route, I would say it's pretty popular. Another one is, with Amazon you can literally use the Snowball kind of functionality. You give them hard drives, with trucks; the trucks will ship your data directly and put it in S3. With IoT, a common pattern we see is a lot of the edge devices would actually be pushing the data directly into some firehose like Kinesis or Kafka, and I'm sure Google and Microsoft both have their own variants of that. And then you use Spark to directly subscribe to those topics and process them in real time with structured streaming. >> And so would Spark be down, let's say, at the site level, if it's not on the device itself? >> It's an interesting thought, and maybe one thing we should actually consider more in the future is how we push Spark to the edges. Right now it's more of a centralized model in which the devices push data into Spark, which is centralized somewhere. I've seen, for example, I don't remember the exact use case, but it has to do with some scientific experiment in the North Pole. And of course there you don't have a great uplink to transfer all the data back to some national lab, so rather they would do smart parsing there and then ship the aggregated result back. There's another one but it's less common. >> Alright, well, just one minute now before the break, so I'm going to give you a chance to address the Spark community. What's the next big technical challenge you hope people will work on for the benefit of everybody? >> In general, Spark came along with two focuses. One is performance, the other one is ease of use. And I still think big data tools are too difficult to use. Deep learning tools, even harder. The barrier to entry is very high for these tools. I would say we might have already addressed performance to a degree that I think it's actually pretty usable. The systems are fast enough. Now we should work on actually making it (mumbles) even easier to use.
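For the IoT pattern Reynold describes, where edge devices push events into Kafka or Kinesis and Spark subscribes with structured streaming, a minimal sketch using Spark's built-in Kafka source might look like this; the broker addresses, topic name, and windowed aggregation are placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, window

spark = SparkSession.builder.appName("iot-stream-sketch").getOrCreate()

# Subscribe to a topic that edge devices publish into. Broker addresses and topic
# name are placeholders; this needs the spark-sql-kafka connector on the classpath.
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
          .option("subscribe", "iot-events")
          .load())

# Kafka delivers key/value as binary; cast the key and aggregate per device
# over one-minute event-time windows, with a watermark to bound state.
counts = (events
          .selectExpr("CAST(key AS STRING) AS device_id", "timestamp")
          .withWatermark("timestamp", "5 minutes")
          .groupBy(window(col("timestamp"), "1 minute"), col("device_id"))
          .count())

# Write the running aggregates out; the console sink keeps the sketch self-contained.
query = (counts.writeStream
         .outputMode("update")
         .format("console")
         .start())
query.awaitTermination()
```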
That's also what we focus a lot on here at Databricks. >> David: Democratizing access, right? >> Absolutely. >> Alright, well Reynold, I wish we could talk to you all day. This is great. We are out of time now. We appreciate you coming by theCUBE and sharing your insights, and good luck with the rest of the show. >> Thank you very much, David and George. >> Thank you all for watching, here we are at theCUBE at Spark Summit 2017. Stay tuned, lots of other great guests coming up today. We'll see you in a few minutes.

Published Date: Jun 7, 2017

