Zach Booth, Explorium | AWS Startup Showcase | The Next Big Thing in AI, Security, & Life Sciences

(gentle upbeat music)

>> Everyone, welcome to the AWS Startup Showcase presented by theCUBE. I'm John Furrier, host of theCUBE. We are here talking about the next big thing in cloud, featuring Explorium. For the AI track, we've got AI, cybersecurity, and life sciences. Obviously AI is hot, with machine learning powering that. Today we're joined by Zach Booth, director of global partnerships and channels at Explorium. Zach, thank you for joining me today remotely. Soon we'll be in person, but thanks for coming on. We're going to talk about rethinking external data. Thanks for coming on theCUBE.

>> Absolutely, thanks so much for having us, John.

>> So you guys are a hot startup. Congratulations, we just wrote about you on SiliconANGLE; you have a new $75 million round of fresh funding. You're part of the Amazon partner network and growing like crazy. You guys have a unique value proposition around external data, and you have a platform for advanced analytics and machine learning. Can you take a minute to explain what you guys do? What is this platform? What's the value proposition, and why do you exist?

>> Bottom line, we're bringing context to decision-making. The premise of Explorium, and this is consistent with the framework of advanced analytics, is that we're helping customers reach better, more relevant external data to feed into their predictive and analytical models. It's quite a challenge to actually integrate and effectively leverage data that's coming from beyond your organization's walls. It's manual, it's tedious, it's extremely time consuming, and that's a problem. It's really the problem that Explorium was built to solve. Our philosophy is that it shouldn't take so long, it shouldn't be such an arduous process, but it is. So we built a company and a technology that, for any given analytical process, can connect a customer to relevant sources beyond their organization's walls. And this really impacts decision-making by bringing variety and context into their analytical processes.

>> You know, one of the things I see a lot in my interviews with theCUBE, and talking to people in the industry, is that everyone talks a big game about having some machine learning and AI. They're like, "Okay, I got all this cool stuff." But at the end of the day, people are still using spreadsheets. They're wrangling data. A lot of it is still dominated by fenced-off data warehouses, and you're starting to see the emergence of companies built on the cloud. I saw the Snowflake IPO; you're seeing a whole new shift of new brands emerging that are doing things differently, right? Because there's such a need to just move off the archaic spreadsheets and data presentation layers; they're slow, antiquated, outdated. You guys are on the other side of that equation, on the new wave of analytics. What are you solving, and how do you make it work?

>> So basically, the way Explorium sees the world, and I think most analytical practitioners these days see it in a similar way, is that the key to any analytical problem is having the right data. The challenge we've talked about, and that we're really focused on, is helping companies reach that right data. Our focus is on the data part of data science. The science part is the algorithmic side. It's interesting: that was the first frontier of machine learning, practitioners and experts were focused on it, and cloud and compute really enabled it.
The challenge today isn't so much "What's the right model for my problem?" but "What's the right data?" And that's the premise of what we do. Your model is only as strong as the data it trains on. Going back to that concept of bringing context to decision-making: within that framework, the key is bringing comprehensive, accurate, and highly varied data into my model. If my model is only being informed by internal data, which is wonderful data, but only internal, then it's missing context. We're helping companies reach that external variety through a pretty elegant platform that connects the right data to my analytical process. This has implications across several industries and a multitude of use cases. We're working with companies across consumer packaged goods, insurance, financial services, retail, e-commerce, even software as a service. And the use cases range from fraud and risk to marketing and lifetime value. Now, why is this such a challenge today with antiquated or analog means? With a spreadsheet or a rule-based approach, we're pretty limited. It was an effective means of decision-making, of generating and creating actions, but it's highly limited in its ability to change, to be dynamic, to be flexible. With modeling and using data, we have a huge arsenal at our fingertips. The trick is extracting value from it. There's obviously latent value within our org, but every day there's more and more data being created outside of our org, and that data is a challenge to go out and get, to effectively filter, navigate, and connect to. So we've built the tech to navigate and query it: for any given analytical question, find me the right data, rather than starting with the problem and then trying to think up the right data. That older approach is akin to going into a library and searching for a specific book; you have to know which book you're looking for. Instead, you say: there's a universe of data out there, and I want to tap into what's right for me. Can I use a tool that effectively queries all that data, finds what's relevant for me, connects and matches it with my own, and distills signals, or features, from that data to provide more variety into my modeling efforts, yielding a more robust decision as an output?

>> I love that searchable paradigm. I've got to ask you about one of the big things I've heard people talk about, and I want to get your thoughts on it: how do I know if I even have the right data? Is the data addressable? Can I find it? Can it even be queried? How do you solve that problem for customers when they say, "I really want the best analytics, but do I even have the data, or is it the right data?" How do you guys look at that?

>> So the way our technology was built, it's quite relevant for a few different customer profiles. Really, the genesis of the company started with those cloud-based organizations that have been model-driven since day one. They're working with machine learning and they have models in production; they're quite mature, in fact. And the problem they've been facing is, again: our models are only as strong as the data they're training on, the only data they're training on is internal data, and we're seeing diminishing returns from those decisions.
So now suddenly we're looking for outside data, and we're finding that to effectively use outside data, we have to spend a lot of time. 60% of our time is spent thinking of data, going out and getting it, cleaning it, and validating it, and only then can we actually train a model and assess if there's an ROI. That takes months. And if it doesn't push the needle from an ROI standpoint, it's an enormous opportunity cost, which is very, very painful, and which goes back to their decision-making: is it even worth it if it doesn't push the needle? That's why there had to be a better way. What we built is relevant for that audience as well as for companies in the midst of their digital transformation: "We're data rich but data science poor. We have lots of data, latent value to extract from within our own data, and at the same time tons of valuable data outside of our org." Instead of waiting 18 or 36 months to transform yourself, get your infrastructure in place, get your data collection in place, and only then start having models in production based on your own data, you can now do this in tandem. That's what we're seeing with a lot of our enterprise customers. They're using their analysts and data engineers, some of them have a data science group in their innovation teams or centers of excellence, and they're using the platform to inform a lot of their different models across lines of business.

>> I love that expression, "data rich." A lot of people are becoming full of data, too. They have a data problem: they have a lot of it. That connects to my next question. As people look at the cloud, for instance, all these old methods were internal, internal to the company. But now, with the cloud, more integration is happening. More people are connecting with APIs. There's access to potentially more signals, more data. How does a company go to that next level, to connect in and acquire the data and make it faster? I can almost imagine that the signals that come from merging external data, and that's the topic of this theme, re-imagining external data, are an extremely valuable signaling capability. It sounds like you guys make that go faster. So how does it work? Is it the cloud? Take us through that value proposition.

>> Well, it's amazing how fast organizations have been moving onto the cloud over the past year during COVID, and how much alternative data, or external data, depending on how you refer to it, has blown up. It's really exciting. This is coming in the form of data providers and data marketplaces, and more and more organizations are moving from rule-based decision-making to predictive decision-making. Now, what's interesting about this company, Explorium, is that we're working with a lot of different types of customers, but our long game has a real high upside: more and more companies are starting to use data, and are transformed or in the midst of their transformation. So they need outside data, and the challenge I described exists for all of them. So how does it really work? Today, if I don't have outside data, I have to think. It's based on a hypothesis, and it all starts with that hypothesis, which is already prone to error from the get-go. You and I might be domain experts for a given use case. Let's say we're focusing on fraud.
We might think about a dozen different types of data sources, but going out and getting the data, like I said, takes a lot of time, and harmonizing it, cleaning it, and being able to use it takes even more time. And that's just for each source. So if we have to do that across dozens of data sources, it's going to take far too much time, the juice isn't worth the squeeze, and so I'm going to forgo using it. A metaphor I like to use when I describe what Explorium does to my mom is buying your first home. It's a very, very important financial decision. When you're buying that home, you're thinking about all the different inputs to your decision-making. It's not just about the blueprint of the house, how many rooms it has, and the criteria you're looking for. You're also thinking about external variables: the school zone, construction, property values, alternative or similar neighborhoods. That's probably your most important financial decision, or one of the largest at least. A machine learning model in production is an extremely important and expensive investment for an organization. Now, as a consumer buying a home, you have all this data at your fingertips to evaluate those external inputs. Organizations don't, which struck me as crazy when I first got into this world. So they're making decisions with their first-party data only. First-party data is wonderful data. It's the best: it's representative, it's high quality, and it's high value for their specific decision-making and use cases. But it lacks context. And there's so much context, in the form of location-based data and business information, that can inform decision-making and isn't being used. It translates to sub-optimal decision-making, let's say.
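To make that per-source integration cost concrete, here is a minimal Python sketch of just the "harmonize and match" step for a single external source. It is an illustration of the chore being described, not Explorium's API; every file name, column, and helper here is hypothetical.

```python
# Hedged sketch: manually matching ONE external source to first-party records.
import re
import pandas as pd

def normalize_name(name: str) -> str:
    """Crude business-name normalization: lowercase, strip punctuation and
    common legal suffixes so that 'Acme, Inc.' can match 'ACME INC'."""
    name = re.sub(r"[^\w\s]", "", name.lower())
    name = re.sub(r"\b(inc|llc|ltd|corp|co)\b", "", name)
    return re.sub(r"\s+", " ", name).strip()

# Internal first-party records and one external source (hypothetical files).
internal = pd.read_csv("internal_businesses.csv")  # columns: name, address, ...
external = pd.read_csv("external_source_1.csv")    # columns: biz_name, rating, ...

internal["match_key"] = internal["name"].map(normalize_name)
external["match_key"] = external["biz_name"].map(normalize_name)

# Even this naive exact-key join leaves misses (typos, trade names, moved
# addresses) that need fuzzy matching and manual review, and this is just
# one of the dozen sources a single use case might want.
enriched = internal.merge(external, on="match_key", how="left")
print(f"Matched {enriched['rating'].notna().mean():.0%} of internal records")
```

Repeating that, with validation and refresh, across dozens of sources is the "juice isn't worth the squeeze" problem described above.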
>> Yeah, and I think one of the insights around looking at signal data in context is that merging it with first-party data creates a huge value window. It gives you observational data, maybe potentially insights into customer behavior. So totally agree, I think that's a huge observation. You guys are definitely on the right side of history here. I want to get into how it plays out for the customer. You mentioned the different industries; obviously data's in every vertical, and vertical specialization with data is very metadata driven. I mean, metadata in oil and gas is different than in fintech. Some overlap, but for the most part you've got to have that acute context for each one. How are you guys working? Take us through an example of someone getting it right, getting the right setup: the use case of how someone onboards Explorium, how they put it to use, and what some of the benefits are.

>> So let's break it down into a three-step process, and let's use that example of fraud from earlier. An organization would have historical data on which of its customers actually turned out to be fraudulent. This use case, and it's a core business problem, comes with an intention to reduce that fraud. So they would provide, going with your description earlier, something similar to an Excel file. This can be pulled from any database out there, and we're working with loads of them. What they provide is called training data. This training data is their historical data, and its output is the outcome, the conclusion: was this business fraudulent or not? Yes or no. Binary. The platform would understand that data and train a model with external context in the form of enrichments. These data enrichments are important and relevant, but at the end of the day their purpose is to generate signals. To your point, signals are the bottom line, what everyone is trying to achieve, identify, discover, and even engineer, using the data they have and the data they have yet to integrate with. So the platform connects to your data, infers and understands its meaning, and based on this matching of internal data plus external context, automates the process of distilling signals, or, as they're referred to in machine learning, features. These features are really the bread and butter of your modeling efforts. If you can leverage features that come from data outside your org, and they're quantifiably valuable, which the platform measures, then you're putting yourself in a position to generate an edge in your modeling efforts. Meaning you might reduce your fraud rate, so your customers get a much better, more compelling offer or service or price point. It impacts your business in a lot of ways. What Explorium brings to the table is a single access point to a huge universe of external data. It expedites your time to value: rather than data analysts, data engineers, and data scientists spending a significant amount of time on data preparation, they can now spend most of their time on feature or signal engineering, which is the more fun and interesting part, less so the boring part, and they can scale their modeling efforts. So: time to value, access to a huge universe of external context, and scale.

>> So I see two things here; just make sure I get this right, 'cause it sounds awesome. One, the core engineering assets, whether it's the platform engineer or data engineering, are more optimized for getting more signaling, which is more impactful for context acquisition, looking at contexts that might have a business outcome, versus wrangling and doing mundane heavy lifting.

>> Yeah so with it, sorry, go ahead.

>> And the second one is that you create a democratization for analysts or business people who are used to dealing with spreadsheets, who just want to play with the data and get a feel for it, or experiment, do querying, try to match planning with policy -

>> Yeah, so the way I like to communicate this is that Explorium is a one-two punch. It's got a technology layer that provides entity resolution, matching with external data, which is otherwise a manual endeavor; Explorium has automated that piece. The second is a huge universe of outside data. So this circumvents procurement: you don't have to go out and spend all of these one-off efforts finding data, organizing it, cleaning it, etc. You can use Explorium as your single access point and gateway to external data, match it with your own, and accelerate your time to value and, ultimately, the number of valuable signals you can discover and leverage through the platform and feed into your own pipelines, or whatever system or analytical need you have.
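As a rough end-to-end sketch of the three-step flow just described, here is what training a fraud model on first-party history plus already-matched external enrichments might look like, assuming pandas and scikit-learn. The files and columns are hypothetical illustrations, not the platform itself; feature importances stand in for its quantification of each signal's value.

```python
# Hedged sketch: binary fraud labels + external enrichments -> features ->
# model -> quantified signal value. All files and columns are hypothetical.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# 1) Historical training data: one row per business, binary fraud outcome.
history = pd.read_csv("fraud_history.csv")          # business_id, ..., is_fraud

# 2) External context already matched to each business (see earlier sketch):
#    location signals, firmographics, web presence, and so on.
enrichments = pd.read_csv("external_features.csv")  # business_id, feature cols
data = history.merge(enrichments, on="business_id", how="left")

# Keep numeric columns only for this simple sketch; a real pipeline would
# encode categorical features as well.
X = data.drop(columns=["business_id", "is_fraud"]).select_dtypes("number").fillna(0)
y = data["is_fraud"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# 3) Train, then check whether the added context actually pushes the needle.
model = GradientBoostingClassifier().fit(X_train, y_train)
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

# Which features, internal or external, carry the signal?
print(pd.Series(model.feature_importances_, index=X.columns).nlargest(10))
```

Training the same model with and without the enrichment columns and comparing the two AUCs is one simple way to put an ROI number on the external data before committing to it.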
>> Zach, great stuff. I love talking with you, and I love the hot startup action here, 'cause again, you're on the net new wave. Like anything new, and I was just talking to a colleague about this (indistinct), when you have something new, it's like driving a car for the first time. You need someone to give you driving lessons, or to figure out how to operationalize it, or to take advantage of the one-two punch, as you pointed out. How do you guys get someone up and running? 'Cause let's just say I'm bought into this. No brainer, you've got my attention, but I still don't understand: do you provide a marketplace of data? Do I need to get my own data? Do I bring my own data to the party? Do you have relationships with other data providers? How do I get going? How do I drive this car?

>> So first, explorium.ai offers a free trial, and we're a product-focused company. A practitioner, maybe a data analyst, data engineer, or data scientist, would use the platform to enrich their analytics, so BI decision-making, or any models they're working on, either in production or in training. Now, oftentimes models that are being trained don't actually make it to production, because they don't meet a minimum threshold, meaning they wouldn't have a positive business outcome if deployed. With Explorium you can bring variety into that model and increase the chances that it will actually be deployed, because it's being fed with the right data: the data that you need, not just the data that you have. A business would typically start working with us on a use case that has high business value. Maybe that's a fraud or risk use case in a B2B or even B2SMB context, or a marketing use case: LTV modeling, lookalike modeling, lead acquisition and generation, or field sales optimization for CPGs. The platform would explore and understand your data, enrich it automatically, generate and discover new signals from external data plus your own, and feed all of this either into a model you have in-house or end to end into the platform itself. We provide customer success to help you build out your first model, perhaps, and hold your hand through that process. But typically, after a few months, most of our customers are building and running multiple models in production on their own. That's really exciting, because we're helping organizations move away from rule-based decision-making and being their bridge to data science.

>> Awesome. I noticed that your title covers global partnerships and channels, which I'm assuming means you have a network and ecosystem you're working with. What are some of the partnerships and channel relationships that you bring to bear in the marketplace?

>> So data and analytics, this space, is very much an ecosystem. Our customers are working across different clouds, with all sorts of vendors and technologies; basically, they have a pretty big stack. We're a part of that stack, and we want to play symbiotically within it so that we can contribute value whether they sit here, there, or in another place. Our partners range from consulting and system integration firms, those building out the blueprint for a digital transformation or actually implementing it, and we contribute value in both cases as a technology innovation layer in our product; a customer would then consume Explorium afterwards, once that transformation is complete, as part of their stack. We're also working with a lot of the different cloud vendors.
Our customers are all cloud-based, and data enrichment is becoming more and more relevant, with some wonderful machine-learning tools, be they AutoML or the data marketplaces popping up, which is very exciting. What we bring to the table as an edge is accelerating the connection between the data I think I want as a company and how to actually extract value from that data. Being part of this ecosystem means that we can be, and should be, working with a lot of different partners to contribute incremental value to our end customers.

>> Final question I want to ask you: if I'm in a conference room with my team and someone says, "Hey, we should be rethinking our external data," what would I say? How would I pound my fist on the table, or raise my hand and say, "Hey, I have an idea, we should be thinking this way"? What would be my argument to the team to re-imagine how we deal with external data?

>> So it might be a scenario where, rather than banging your hands on the table, you're banging your head on the table, because it's such a challenging endeavor today. Companies have to think about: what's the right data for my specific use cases? I need to validate that data: is it relevant? Is it real? Is it representative? Does it have good coverage, good depth, and good quality? Then I need to procure that data, which means getting a license for it. I need to integrate that data with my own, which means I need in-house expertise to do so. And then, of course, I need to monitor and maintain that data on an ongoing basis. All of this is a pretty big thing to undertake and undergo. Having a partner to facilitate that external data integration, ongoing refresh, and monitoring, and being able to trust that it's all harmonized and high quality, and that I can find the valuable data without having to manually pick, choose, and discover it myself, is a huge value-add, particularly the larger the organization. Because there's so much data out there, and there's a lot of noise out there too. If I can, through a single partner or access point, tap into that data and quantify what's relevant for my specific problem, then I'm putting myself in a really good position and optimizing the allocation of my very expensive and valuable data analyst and engineering resources.
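The validate-and-monitor chores in that list are easy to underestimate. As a hedged illustration, here is a small sketch of the kind of recurring checks, coverage, completeness, and freshness, that each external source would need if you ran this yourself; the thresholds, column names, and the helper itself are hypothetical.

```python
# Hedged sketch: recurring health checks for one external data source.
from datetime import datetime, timedelta
import pandas as pd

def validate_external_source(df: pd.DataFrame, key: str, updated_col: str,
                             internal_keys: pd.Series) -> dict:
    report = {
        # Coverage: what share of our own entities does this source cover?
        "coverage": internal_keys.isin(df[key]).mean(),
        # Completeness: how many values are missing across the table?
        "null_rate": df.isna().mean().mean(),
        # Freshness: has the source been updated in the last 90 days?
        "stale": pd.to_datetime(df[updated_col]).max()
                 < datetime.now() - timedelta(days=90),
    }
    # Illustrative thresholds; a real pipeline would tune these per use case.
    report["usable"] = (report["coverage"] > 0.6
                        and report["null_rate"] < 0.3
                        and not report["stale"])
    return report

# Example: re-run this on every refresh of every licensed source, e.g.
# validate_external_source(external, "biz_id", "last_updated",
#                          internal["business_id"])
```

Multiply that by every source, on every refresh cycle, and the case for a single managed access point makes itself.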
>> Yeah, I think one of the things you mentioned earlier, and it's a huge point and a good call-out, is that it goes beyond first-party data. Even on the internal view, some of the best, most successful innovators we've been covering at cloud scale are extending their first-party data to external providers. They're in the value chains of solutions that share their first-party data with other suppliers. So that's, again, an extension of the first-party data. You're taking it to a whole 'nother level: there's an external set of data beyond it that's even more important. I think this is a fascinating growth area, and I think you guys are onto it. Great stuff.

>> Thank you so much, John.

>> Well, I really appreciate you coming on, Zach. Final word: give a quick plug for the company. What are you up to, and what's going on?

>> What's going on with Explorium? We are growing very fast, and we're a very exciting company. I've been here since the very early days, and I can tell you that we have a stellar working environment with a very strong, down-to-earth, high-work-ethic culture. Our offices in San Mateo, New York, and Tel Aviv are growing rapidly. As you mentioned earlier, we raised our series C, which brings Explorium to, I think, $127 million raised over the past two years and some change. Whether you want to partner with Explorium, work with us as a customer, or join us as an employee, we welcome that. I encourage everybody to go to explorium.ai and check us out: read some of the interesting content there around data science, processes, and the business outcomes a lot of our customers are seeing, and join a free trial so you can check out the platform and everything it has to offer, from the machine learning engine to the signal studio, as well as what type of information might be relevant for your specific use case.

>> All right Zach, thanks for coming on. Zach Booth, director of global partnerships and channels at explorium.ai. The next big thing in cloud, featuring Explorium, part of our AI track. I'm John Furrier, host of theCUBE. Thanks for watching.

Published Date: Jun 24, 2021
