Jonathan Seckler, Dell & Cal Al-Dhubaib, Pandata | VMware Explore 2022


 

(gentle music) >> Welcome back to theCUBE's virtual program, covering VMware Explore 2022. The first time since 2019 that the VMware ecosystem has gathered in person. But in the post-isolation economy, hybrid is the new format, theCUBE plus digital, we call it. And so we're really happy to welcome Cal Al-Dhubaib, who's the founder, CEO, and AI strategist of Pandata. And Jonathan Seckler, back on theCUBE, the senior director of product marketing at Dell Technologies. Guys, great to see you, thanks for coming on. >> Yeah, thanks a lot for having us. >> Yeah, thank you. >> Cal, Pandata, cool name, what's it all about? >> Thanks for asking. Really excited to share our story. I'm a data scientist by training and I'm based here in Cleveland, Ohio. And Pandata is a company that helps organizations design and develop machine learning and AI technology. And when I started this here in Cleveland six years ago, I had people react to me with, what? So we help demystify AI and make it practical. And we specifically focus on trustworthy AI. So we work a lot in regulated industries like healthcare. And we help organizations navigate the complexities of building machine learning and AI technology when the data's hard to work with, when there's risk in the potential outcomes, or high cost to the consequences. And that's what we do every day. >> Yeah, timing is great given all the focus on privacy and what you're seeing with big tech and public policy, so we're going to get into that. Jonathan, I understand you guys have some hard news. What's your story around AI and AutoML? Share that with us. >> Yeah, thanks. So having the opportunity to speak with Cal today is really important, because one of the hardest things that we find our customers have is making that transition from experimenting with AI to making it really useful in real life. >> What is the tech underneath that? Are we talking VxRail here? Are you talking servers? What do you got? >> Yeah, absolutely.
So the Dell Validated Design for AI is a reference framework that is based on an optimized set of hardware for a given outcome. That could include VxRail, VMware vSphere, and NVIDIA GPUs and NVIDIA software to make all of that happen. And for today, what we're working with is H2O.ai's solution to develop automatic machine learning. So it takes just that one more step to make it easier for customers to bring AI into production. >> Cool. >> So it's a full stack of software that includes automated machine learning, it includes NVIDIA AI Enterprise for deployment and development, and it's all built on an engineering-validated set of hardware, including servers and storage and whatever else you need. >> AI out of the box, I don't have to worry about cobbling it all together. >> Exactly. >> Cal, I want to come back to this trusted AI notion. A lot of people don't trust AI just by the very nature of it. I think about, okay, well how does it know it's a cat? And then you can never explain it, it's a black box. And so I'm like, what are they doing with my data? And you mentioned healthcare, financial services, the government, they know everything about me. I just had to get a Real ID in Massachusetts, I had to give all my data away. I don't trust it. So what is trusted AI? >> Well, let me take a step back and talk about sobering statistics. There are a lot of different sources that report on this, but anywhere you look, you'll hear somewhere between 80 to 90% of AI projects fail to yield a return. That's pretty scary, that's disappointing for the industry. And why is that? AI is hard. With traditional software, you're programming hard-and-fast rules. If I click this button, I expect A, B, C to happen. Here, we're talking about recognizing and reacting to patterns. It's not, will it be wrong? It's, when it's wrong, how wrong will it be? And what cost are you willing to accept related to that?
So zooming back in on this lens of trustworthy AI, much of the last 10 years of development in AI has looked like this. Let's get the data, let's race to build the warehouses, okay, we did that, no problem. Next was the race to build the algorithms. Can we build more sophisticated models? Can we work with things like documents and images? And it used to be the exclusive domain of deep tech companies. You'd have to have teams of teams building the software, building the infrastructure, working on very specific components in this pipeline. And now we have this explosion of technologies, very much like what Jonathan was talking about with validated designs. So it removes the complexities of the infrastructure, it removes the complexities of being able to access the right data. And we have a ton of modeling capabilities and tools out there, so we can build a lot of things. Now, this is when we start to encounter risk in machine learning and AI. If you think about the models that are being used to replicate or learn from language, like GPT-3, to create new content, its training data set is everything that's on the internet. And if you haven't been on the internet recently, it's not all good. So how do you go about building technology to recognize specific patterns, pick up patterns that are desirable, and avoid unintended consequences? And no one's immune to this. So the discipline of trustworthy AI is building models that are easier to interrogate, that are useful for humans, and that minimize the risk of unintended consequences. >> I would add too, one of the good things about the Pandata solution is how it tries to enforce fairness and transparency in the models. We've done some studies recently with IDC, where we've tried to compare leaders in AI technology versus those who are just getting started.
And I have to say, one of the biggest differences between a leader in AI and the rest of us is often that the leaders have a policy in place to deal with the risks and the ethics of using data through some kind of machine-oriented model. And it's a really important part of making AI usable for the masses. >> You certainly hear a lot about how, with AI, ultimately there are algorithms which are built by humans. Although of course, there are algorithms to build algorithms, we know that today. >> Right, exactly. >> But humans are biased, there's inherent bias, and so this is a big problem. Obviously, Dell, you have a giant observation space in terms of customers. But I wonder, Cal, if you can share with us how you're working with your customers at Pandata? What kind of customers are you working with? What are they asking? What problems are they asking you to solve? And how does it manifest itself? >> So when I like to talk about AI and where it's useful, it usually has to do with taking a repetitive task that humans are tasked with, but they're starting to act more like machines than humans. There's not much creativity in the process, it's handling something that's fairly routine, and it ends up being a bottleneck to scaling. And just a year ago even, we'd have to approach our clients with conversations around trustworthy AI, and now they're starting to approach us. A real example, this actually just happened earlier today: we're partnering with one of our clients that basically scans medical claims from insurance providers. And what they're trying to do is identify members that qualify for certain government subsidies. And this isn't as straightforward as it seems, because there's a lot of complexity in how the rules are implemented, how judges look at these cases. Long story short, we help them build machine learning to identify these patients that qualify.
And a question that comes up, and that we're starting to hear from the insurance companies they serve, is how do you go about making sure that your decisions are fair and you're not selecting certain groups of individuals over others to get this assistance? And so clients are starting to wise up to that and ask questions. Other things that we've done include identifying potential private health information that's contained in medical images, so that you can create curated research data sets. We've helped organizations identify anomalies in cybersecurity logs, and go from an exploration space of billions of events to, what are the top 100 that I should look at today? And so it's all about, how do you find these routine processes where humans are bottlenecked, where we're starting to act more like machines, and insert a little bit of pattern recognition intelligence to get them to spend more time on the creative side? >> Can you talk a little bit more about how? A lot of people talk about augmented AI. AI is amazing. My daughter the other day was showing me, I'm sure as an AI expert you've seen it, where the machine actually creates standup comedy, which is so hilarious because it is and it isn't. Some of the jokes are actually really funny. Some of them are so funny 'cause they're not funny and they're weird. So it really underscored the gap. And so how do you do it? Is it augmented? Is it you're focusing on the mundane things where you want to take humans out of the loop? Explain how. >> So there's this great Wall Street Journal article by Jennifer Strong that she published, I think, four years ago now. And she says, "For AI to become more useful, it needs to become more boring." And I really truly believe in that. So you hear about these cutting-edge use cases. And there's certainly some room for these generative AI applications inspiring new designs, inspiring new approaches.
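The fairness question Cal raises, whether a model is selecting certain groups over others, is often checked with a simple selection-rate comparison across groups. A minimal sketch of one such check in Python, using hypothetical group labels and model decisions (the specific threshold and data here are illustrative assumptions, not Pandata's method):

```python
from collections import defaultdict

def selection_rates(groups, decisions):
    """Fraction of positive (selected) decisions per group."""
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, decision in zip(groups, decisions):
        totals[group] += 1
        selected[group] += int(decision)
    return {g: selected[g] / totals[g] for g in totals}

def disparate_impact_ratio(rates):
    """Ratio of lowest to highest selection rate.
    Values well below ~0.8 (the 'four-fifths rule') are a common red flag."""
    return min(rates.values()) / max(rates.values())

# Hypothetical audit data: one group label and one model decision per claim
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
decisions = [1, 1, 1, 0, 1, 0, 0, 0]

rates = selection_rates(groups, decisions)
ratio = disparate_impact_ratio(rates)
print(rates)  # {'A': 0.75, 'B': 0.25}
print(ratio)  # ~0.33, well below 0.8: worth investigating
```

A real audit would also look at error rates per group, not just selection rates, but this is the shape of the question insurers are starting to ask.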
But the reality is, most successful use cases that we encounter in our business have to do with augmenting human decisions. How do you make arriving at a decision easier? How do you prioritize from millions of options, hundreds of thousands of options, down to three or four that a human can then take the last stretch and really consider or think about? So, a really cool story: I've been playing around with DALL·E 2. And for those of you who haven't heard, it's this algorithm that can create images from prompts. And there's this painting I really wish I had bought when I was in Paris a few years ago. So I gave it a description, skyline of the Sacré-Coeur church in Montmartre with pink and white hues. And it came up with a handful of examples that I can now go take to an artist and say, paint me this. So at the end of the day, with automation, yes, there are certain applications where you really are truly getting to that automated AI in action. But in my experience, most of the use cases have to do with using AI to make humans more effective, more creative, more valuable. >> I'd also add, I think, Cal, that the opportunity to make AI real here is to automate these things and simplify the language so that you can get what we call citizen data scientists out there. I mean ordinary employees, people who are at the front line of making these decisions, working with the data directly. We've done this with customers on farms, where the growers are able to use AI to monitor and to manage the yield of crops. I think some of the other examples that you mentioned just recently, Cal, are great. The other part is where you can make this technology available to anyone. And maybe that's part of the message of making it boring, it's making it so simple that any of us can use it. >> I love that. John Furrier likes to say that traditionally in IT, we solve complexity with more complexity. So anything that simplifies things is goodness.
So how do you use automated machine learning at Pandata? Where does that fit in here? >> So, really excited about the connection here through H2O that Jonathan mentioned earlier. So H2O.ai is one of the leading AutoML platforms. And what's really cool is, if you think about the traditional way you would approach machine learning, you need to have data scientists. These patterns might exist in documents or images or boring old spreadsheets. And the way you'd approach this is, okay, get these expensive data scientists, and 80% of what they do is clean up the data. And I've yet to encounter a situation where there isn't data to clean. Now, once you get through the cleaning-up-the-data step, you actually have to consider, all right, am I working with language? Am I working with financial forecasts? What are the statistical modeling approaches I want to use? And there's a lot of creativity involved in that. And you have to set up a whole experiment, and that takes a lot of time and effort. And then you might test one, two, or three models, because you know to use those, or those are the go-tos for this type of problem. And you see which one performs best and you iterate from there. The AutoML framework basically allows you to cut through all of that. It can reduce the amount of time you're spending on those steps to a tenth of the time. You're able to very quickly profile data, understand anomalies, understand what data you want to work with and what data you don't want to work with. And then when it comes to the modeling steps, instead of iterating through three or four models, AutoML throws the whole kitchen sink at it. Anything that's appropriate to the task, maybe you're trying to predict a category or label something, maybe you're trying to predict a value like a financial forecast, or even generate text. And it tests all of the models that it has at its disposal that are appropriate to the task and says, here are the top 10.
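The loop Cal describes, score every candidate model appropriate to the task and rank a leaderboard, can be sketched in a few lines. This uses scikit-learn on toy data as a stand-in; H2O.ai's actual AutoML automates far more (feature handling, hyperparameter tuning, stacking), so treat this only as an illustration of the leaderboard idea:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Toy data standing in for a cleaned, real-world data set
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# The "kitchen sink": every candidate appropriate to the task
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(random_state=0),
}

# Cross-validate each candidate, then rank them best-first
leaderboard = sorted(
    ((name, cross_val_score(model, X, y, cv=5).mean())
     for name, model in candidates.items()),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, score in leaderboard:
    print(f"{name}: {score:.3f}")
```

The "dials" Cal mentions next, trading explainability against accuracy, amount to restricting or weighting which candidates go into that dictionary.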
You can use features like, let me make this more explainable, let me make the model more accurate. Or maybe I don't care about interrogating the results because the risk here is low, and I want a model that predicts things with higher accuracy. So you can use these dials instead of having to approach it from a development perspective. You can approach it from more of an experimental mindset. So you still need that expertise, you still need to understand what you're looking at, but it makes it really quick. And you're not spending all that expensive data science time cleaning up data. >> Makes sense. Last question. So Cal, obviously you guys go deep into AI. Jonathan, Dell works with every customer on the planet, all sizes, all industries. So what are you hearing and doing with customers that are best practices that you can share for people that want to get into it, that are concerned about AI, that want to simplify it? What would you tell them? Go ahead, Cal. >> Okay, you go first, Cal. >> And Jonathan, you're going to bring us home. >> Sure. >> This sounds good. So as far as where people get scared, I see two sides of it. One, our data's not clean enough, not enough quality, I'm going to stay away from this. I combat that with, you've got to experiment, you've got to iterate, and that's the only way your data's going to improve. Two, there are organizations that worry too much about managing the risk. We don't have the data science expertise that can help us uncover potential biases we have. We are now entering a new stage of AI development and machine learning development, and I use those terms interchangeably now. I know some folks will differentiate between them, but machine learning is the discipline driving most of the advances.
The toolkits that we have at our disposal to quickly profile data and to manage and mitigate the risk that data can bring to the table are really giving organizations more comfort, or should give organizations more comfort, to start to build mission-critical applications. The thing that I would encourage organizations to look for is partners that put trustworthy AI, ethical AI, first as a consideration, not as an afterthought, not as a we're-going-to-sweep-this-under-the-carpet. When you're intentional with that, when you bring that up front and you make it a part of your design, it sets you up for success. And we saw this when GDPR changed the IT world a few years ago. For organizations that built for privacy first to begin with, adapting to GDPR was relatively straightforward. For organizations that had that as an afterthought, it was a huge lift, a huge cost to adapt and adjust to those changes. >> Great example. All right, Jonathan, I said bring us home, put a bow on this. >> Last bit. So I think beyond the mechanics of how to make AI better and more workable, one of the big challenges with AI is this concern that you're going to isolate and spend too much effort and dollars on the infrastructure itself. And that's one of the benefits that Dell brings to the table here with validated designs. Our AI validated design is built on a VMware vSphere architecture. So your backup, your migration, all of the management and operational tools that IT is most comfortable with can be used to maintain, develop, and deploy artificial intelligence projects without having to create unique infrastructure, unique stacks of hardware, which potentially isolates the data and makes things unavailable to the rest of the organization. So when you run it all in a VMware environment, that means you can put it in the cloud or you can put it in your data center.
It just really makes it easier for IT to build AI into their everyday process. >> Silo busting. All right, guys, thanks Cal, Jonathan. I really appreciate you guys coming on theCUBE. >> Yeah, it's been a great time, thanks. >> All right. And thank you for watching theCUBE's coverage of VMware Explore 2022. Keep it right there for more action from the show floor with myself, Dave Vellante, John Furrier, Lisa Martin, and David Nicholson. Keep it right there. (gentle music)

Published Date : Aug 30 2022
