Russ Caldwell, Dell EMC & Philipp Niemietz | CUBE Conversation, October

(calm techno music)
>> Hey, welcome to this Cube Conversation. I'm Lisa Martin. I've got two guests here with me. Please welcome Philipp Niemietz, the interim head of department at the Laboratory of Machine Tools and Production Engineering, or WZL. Philipp, welcome to the program.
>> Thank you.
>> And we have Russ Caldwell here as well, senior product manager at Dell Technologies. Russ, great to see you.
>> Thanks for the invite.
>> Absolutely. We're going to be talking about how the enhanced video capabilities of Dell EMC's streaming data platform are enabling manufacturing anomaly detection and quality control through the use of sensors, cameras, and X-ray cameras. We're going to go ahead, Philipp, and start with you. We're abbreviating the lab, as you guys do, as WZL. Talk to us about the lab. What types of problems are you solving?
>> Yeah, thank you. In the laboratory for machine tools, we are looking at actually all the problems that arise in production engineering in general. That starts with the actual manufacturing of workpieces that get used in the aerospace or automotive industries, where we really dig into the specifics of how those metal parts are manufactured, how they are formed, and what the mechanics of this are. So this is the very traditional area where we are coming from. We're also looking at how to manage all those production systems, and how to come up with decision-making processes that move those engineering environments forward. But in our department, more recently, over about the last 10 years, this Industry 4.0 scenario has been pushed more and more into academic research. So more and more data is gathered. We have to deal with a lot of data coming from various sources, and with how to actually include this in the research, how to derive new findings from it, or maybe even physical equations, from all the data that we are gathering around these manufacturing technologies.
And this is something that we're looking at from the research perspective.
>> And talk to me about when you were founded. You're based in Germany, but when was the lab founded?
>> The lab was founded about 100 years ago. It has a very long history. It is the largest institute for production engineering in Germany, or maybe even in Europe.
>> Got it. Okay. Well, 100 years. Amazing innovation that I'm sure the lab has seen. Russ, let's go over to you. Talk to us about the Dell EMC streaming data platform, or SDP, as you refer to it.
>> Yeah. Thanks, Lisa. So it's interesting that Philipp brings up Industry 4.0, because this is a prime area where the streaming data platform comes into play. Industry 4.0 for manufacturing really encompasses a few things: real-time data analysis, automation, and machine learning. SDP pulls all that together. It's a software solution from Dell EMC, and one of the ways we make it all happen is that we've unified the concept of time in data. Historical data and real-time data are typically analyzed very, very differently. And when we're trying to support Industry 4.0 manufacturing use cases, that's really important, right? You look at historical data and real-time data together, so you can learn from the past work you've done on the factory floor and apply that in real-time analytics. The platform is used to ingest, store, and analyze both real-time and historical data. It leverages high availability and dynamic scaling with Kubernetes, which makes it possible to have a lot of different projects on the platform. And it really offers a lot of methods to automate the high-speed and high-precision activities that Philipp's talking about here. There are a lot of examples where it comes into play. It's really exciting to work with Philipp and the team there in Germany.
But what's great about it is that it's a general-purpose platform that supports things like construction, where they're using drones with video ingestion, tracking resources on the ground, and things like that; predictive maintenance and safety for amusement parks; and many other use cases. But with Industry 4.0 and manufacturing, RWTH and Philipp's team have really pushed the boundaries of what's possible in automating and analyzing data for the manufacturing process.
>> What a great background. So we understand about the lab. We understand about Dell EMC SDP. Philipp, let's go back to you. How is the lab using this technology?
>> Yeah, good question. Maybe going back a little bit to the details of the use case that we are presenting. We started maybe five, six years ago, when all this Industry 4.0 was put into research and you wanted to get more data out of the process. So we started to apply a few sensors to the machine, starting with the more traditional ones, like energy consumption and some control information that we get from the machine tool itself. Those sensor systems were not that complex, and we could deal with the amount of data fairly easily, using just USB sticks and some local storage devices. But as it gets more sophisticated, we're getting more sensor data. We're applying new sensor systems at the tool, where the actual process is taking place and where the really valuable information is hidden. So we're getting really close to the process, applying video data, bigger data streams, and more sensor data. And this is not like typical IoT scenarios, where you usually have a few data points per second; we're talking here about sensors that have maybe a million data points a second. So these are very high frequencies that we have to deal with, and of course, we then had to come up with some system that actually helps us deal with this data.
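[Editor's sketch: to give a feel for why "maybe a million data points a second" outgrows USB sticks, here is a back-of-the-envelope estimate. Only the sample rate comes from the interview; the 8 bytes per sample (one 64-bit float) and the function names are assumptions made purely for illustration, not figures from WZL.]

```python
# Rough data-rate estimate for a single high-frequency sensor.
# ASSUMPTION: 8 bytes per sample (one float64). The interview states only
# the approximate sample rate, not the sample size or the sensor count.

SAMPLES_PER_SECOND = 1_000_000  # "maybe a million data points a second"
BYTES_PER_SAMPLE = 8            # assumed, not stated in the interview


def bytes_per_second(samples_per_second: int, bytes_per_sample: int) -> int:
    """Raw, uncompressed payload produced by one sensor each second."""
    return samples_per_second * bytes_per_sample


def gigabytes_per_hour(samples_per_second: int, bytes_per_sample: int) -> float:
    """The same rate expressed in decimal gigabytes (1e9 bytes) per hour."""
    return bytes_per_second(samples_per_second, bytes_per_sample) * 3600 / 1e9


rate = bytes_per_second(SAMPLES_PER_SECOND, BYTES_PER_SAMPLE)
print(f"{rate / 1e6:.1f} MB/s per sensor")
print(f"{gigabytes_per_hour(SAMPLES_PER_SECOND, BYTES_PER_SAMPLE):.1f} GB/hour per sensor")
```

[Under these assumptions, one such sensor produces roughly 8 MB every second, or close to 29 GB per hour of machining, before any video stream is added.]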
And yeah, we used the classic big data stack, which we set up ourselves in our research facility, to deal with this amount of streaming data and to then apply historical analysis, like Russ just talked about. On this classic Hadoop data stack, we used Kafka and Storm for ingestion and stream processing, and Spark for the traditional historical analysis. And actually, this is exactly where the streaming data platform came into play, because we had a meeting with one of the tech account people at the university. We were having a chat about this problem, and he said, "Oh, we have something going on in America, in the USA, with this streaming data platform." It was still under a code name or something. And then Russ and I got into contact, talking about the streaming data platform, how we could actually use it, and how we could take part. We took part in the alpha program, really working with the system and with the developers. And it was really an amazing experience.
>> Were you having scale problems with the original, traditional big data platform that you talked about, with Hadoop, Apache Kafka, and Spark? Were there scale issues, performance issues? Is that why you looked to Dell EMC?
>> Yeah. There were several issues. One is the scaling option. We are not always using all of the sensors; sometimes we are just using some of them. We're thinking about applying the process to different manufacturing technologies, different machines that we have in our laboratory, so we want to be able to quickly add sensors or shut down sensors without having to take care of setting up new workers or the like, so that the workload balancing is handled for us. But that's not the only thing. We also had a lot of issues with administering these Hadoop stacks. It's quite error-prone if you do it yourself. We are still a university; even though we are a very big laboratory, we still have limited resources.
So we spent a lot of time dealing with the DevOps of the system. And this is where the streaming data platform actually helped us reduce the time that we invested in these administration processes. We were able to put more time into the analytics, which is actually what we are interested in. And specifically, on the point that Russ talked about, this unified concept of time: we can now apply one and the same type of analysis to historical and streaming data, and do not have two separate domains to deal with. Before, we dealt with Kafka and Storm on one side and Spark on the other side. Now we can put it all into one model and actually reduce the time needed to maintain, handle, and implement the code.
>> The time reduction is critical for the overall laboratory, for the workforce productivity of the folks that are using it. Russ, let's go back to you. Tell us, first of all, how long has the Dell EMC SDP been around? And what are some of the key features that WZL is leveraging that you're also seeing benefit other industries?
>> So the product officially launched in early 2020, in the first quarter of 2020. But as Philipp was just saying, his organization was actually in the alpha and the beta programs earlier than that, in 2019. And that's where we had a cross-section of very different kinds of companies in all sorts of industries all over the world: in Japan, in Germany, in the US. That's where we started to see this pattern of common challenges, and how we could solve them. One of the things we mentioned, that unified concept of time, is really powerful because with one line of code, you can jump to any point on the timeline of your data, whether it's the real-time data coming off the sensors right now or something from minutes, hours, or years ago. And so it's really, really powerful for the developers.
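[Editor's sketch: the "unified concept of time" can be illustrated with a toy in-memory stream in which a reader is positioned at any timestamp and then keeps receiving newer events through the same call. This is not the SDP API. The class and method names (`ToyStream`, `ToyReader`, `reader`, `poll`) are hypothetical and exist only to show the idea that historical and live reads need not be separate code paths.]

```python
import bisect


class ToyStream:
    """Append-only event log ordered by timestamp (hypothetical, not SDP)."""

    def __init__(self):
        self._timestamps = []
        self._values = []

    def append(self, timestamp, value):
        # The toy assumes events arrive in non-decreasing time order.
        if self._timestamps and timestamp < self._timestamps[-1]:
            raise ValueError("events must be appended in time order")
        self._timestamps.append(timestamp)
        self._values.append(value)

    def reader(self, start_timestamp):
        """Position a reader at the first event at or after start_timestamp."""
        index = bisect.bisect_left(self._timestamps, start_timestamp)
        return ToyReader(self, index)


class ToyReader:
    """Reads history from its start point, then whatever arrives later."""

    def __init__(self, stream, index):
        self._stream = stream
        self._index = index

    def poll(self):
        """Return every event this reader has not yet seen (possibly none)."""
        timestamps = self._stream._timestamps
        values = self._stream._values
        events = list(zip(timestamps[self._index:], values[self._index:]))
        self._index = len(timestamps)
        return events


stream = ToyStream()
for t, v in [(1, "spindle start"), (2, "cutting"), (3, "vibration spike")]:
    stream.append(t, v)

reader = stream.reader(start_timestamp=2)  # the single "jump to a point in time"
print(reader.poll())                       # historical events from t=2 onward
stream.append(4, "tool change")            # a new event arrives "live"
print(reader.poll())                       # the same call now returns live data
```

[The design point is that the analysis code calling `poll()` does not need to know whether an event is a year old or a millisecond old, which is the property Philipp credits with letting his team replace the separate Kafka/Storm and Spark code paths with one model.]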
But we saw the common challenges that Philipp was just talking about everywhere. One of the great things about SDP is that it's a single piece of software that will install, manage, secure, upgrade, and support all the components that you just heard Philipp talking about. All the pieces for the ingestion, the storage, and the analytics are in there, and that makes it easier to focus on the problem. There were other common challenges that our customers were seeing as well. Things like this concept of derived streams: you can bring in raw streams of data and leave them in their raw form, because many times, for regulatory reasons or audit reasons, you do not want to touch that data. But you can create parallel streams of that data, called derived streams, which are versions that you've altered for some consumption or reporting purpose without affecting the others. That's powerful when you have multiple teams analyzing different data. And then finally, the thing that Philipp mentioned, which we saw everywhere, was the need for a unified way to interact with all sensors in the same way, because there are IoT sensors, telemetry, log files, video, X-ray, infrared, all sorts of things. Being able to simplify that so the developers and the data scientists can really build models to solve a business problem was where we started to focus on how we wanted to bring the value of SDP to market.
>> So you launched this, right? And you said early 2020, right before the pandemic and all of the chaos that has-
>> Don't recommend that, by the way. Don't recommend launching into a pandemic. But yes.
>> I'm sure there were a lot of lessons learned, and silver linings.
>> That's right.
>> But obviously, big challenges there. I'm curious, though. One of the things that we've learned from the pandemic is that for so many industries, access to real-time data is no longer just a nice-to-have.
It is a critical differentiator for those that needed to pivot multiple times to survive in the early days, to thrive, and to continue pivoting. I'm curious what other industries you saw, Russ, that came to you saying, "All right, guys. We've got challenges here. Help us figure this out." Give me a snapshot of some of the other industries that were sort of leading edge last year.
>> Sure. There were some surprising ones. I've mentioned them a little bit, but it's interesting that you give me a chance to talk about them, because what was also shocking was not only that the same problems I just mentioned happened in multiple industries. It was actually the prevalence of certain kinds of data. For example, take the construction example I gave you, where a company was using drones to ingest streaming video as well as telemetry from all the equipment on the ground. Drones are in all sorts of industries, so it turns out that's a pattern. But at an even lower level than drone data is video data, or any kind of media data. And Philipp talked about how they're using that kind of data in manufacturing as well. We're seeing video data in every industry, combined with other sensor data. That's what really surprised us in the beta program. So, working with Philipp, we actually altered our roadmap after we launched, realizing that we needed to escalate even more features for video analysis and to be able to take the processing even closer to the Edge, where the data is being generated. So the other industries include construction, logistics, medicine, and network traffic. Any sort of data that is a continuous, unbounded stream falls into the category of data that can be analyzed, stored, and played back like a DVR with SDP.
>> Playback like a DVR. I like that. Philipp, back over to you. Talk to us about what's next. Obviously, a tremendous amount of innovation in the first 100 years of WZL.
Talk to me about what some of the lab's plans are for the future from a streaming data perspective. You've got a great foundational infrastructure there with Dell EMC. What's next?
>> We are working together with a large industry consortium, and we get a lot of information from them. Well, not information exactly, but they really want to see all this big data stuff that's coming into Industry 4.0. And Russ already talked about it: they are pretty satisfied with having all the data in the data centers that they have, but they want to push it to the Edge. So the analytics is moving more and more to the Edge, because they see that the more data you gather, the more data has to be transferred via the network. So we have to come up with ways to, of course, deploy all the models on the Edge, maybe do some analytics on the Edge, or, I don't know, something like federated learning, so that maybe you don't even need to transfer the data to the data center. You can start learning approaches on the Edge and combine them across different data sources without actually sharing the data, which is a specific point for corporations that want to cooperate using their different data sources but have some privacy issues. So this is something that we are looking into. And also, working with low-code or no-code environments, like the different frameworks that we use here in our laboratory; this is also something that we see in the industry. More and more people have to interact with the data management systems, so they need a lower entry point than some Python script that they have to write. Maybe they just need a drag-and-drop environment where they can modify some ingestion or some transformation of the data. Then it's not always the data engineers or the computer science experts who have to deal with that kind of stuff, and other people can do it as well. So this is something that we are looking into in the near future. But, yeah.
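[Editor's sketch: Philipp mentions "something like federated learning" as a way to learn at the Edge without moving the raw data. One common form of that idea is federated averaging, in which each site trains locally and sends only its model weights, and a coordinator averages them weighted by how many samples each site trained on. The interview does not say which method WZL is evaluating, so the function below is an assumed, minimal illustration with hypothetical names and made-up numbers.]

```python
def federated_average(client_updates):
    """Combine locally trained weights without seeing any raw data.

    client_updates: list of (weights, n_samples) pairs, where weights is a
    list of floats from one edge site and n_samples is the number of local
    samples that site trained on. Returns the sample-weighted mean weights.
    """
    if not client_updates:
        raise ValueError("need at least one client update")
    total_samples = sum(n for _, n in client_updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    dimension = len(client_updates[0][0])
    averaged = [0.0] * dimension
    for weights, n_samples in client_updates:
        if len(weights) != dimension:
            raise ValueError("all clients must report the same model shape")
        share = n_samples / total_samples  # this site's influence on the result
        for i, w in enumerate(weights):
            averaged[i] += w * share
    return averaged


# Two hypothetical machines; only weights and sample counts leave each site.
machine_a = ([1.0, 2.0], 100)
machine_b = ([3.0, 4.0], 300)
print(federated_average([machine_a, machine_b]))
```

[Because machine B contributed three times as many samples in this made-up example, the combined model sits three quarters of the way toward its weights. In a real deployment this averaging step would repeat over many training rounds.]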
But there are a lot of different things, and there's not enough time to talk about all of them.
>> So it sounds like the idea is to democratize that data, to allow more data citizens to leverage it, analyze it, and extract value from it, because we all know data is oil, it's gold, but only if you can actually get that analysis quickly and make decisions that really affect and drive the business. Russ, last question for you. Talk to us about what you see coming next in the industry. Obviously, you launched this technology at a very interesting time, and a lot of things have changed in the last year. You've learned a lot. You said you modified the technology based on the WZL implementation. But what are some of the things that you see coming next?
>> So it's really interesting, because my colleague at Dell constantly reminds me that people develop solutions with the technology they have at the time, right? It's a really obvious statement, but it's really powerful for understanding what our customers have been doing so far. It's been based on batch tools and storage tools that were available at the time, but weren't necessarily the best match for the problem they were trying to solve. And the world is moving completely to a real-time view of its data. If you can get to that answer sooner, there's higher value: higher revenue, lower costs, safety, all sorts of reasons, right? To do that, everyone's realizing that, like Philipp, you can't count on moving all the data somewhere else to make that decision. The latency, or sometimes the rules controlling where data can go, will really keep you from that. So being able to move code closer to the data is where we see things really happening. This is actually why the streaming data platform has focused heavily on Edge implementations. We have SDP Core for the core data center.
We also have SDP Edge, which runs in single-node and three-node configurations for headless environments, for all sorts of use cases where you need to move the code and make the decisions right where the data is generated at the sensors. The other thing we see happening in the industry that is really important is that everything's moving to a fully software-defined solution. This idea of having software-defined stream ingestion, analytics, and storage, so that you can deploy the solution you want in the form factor that you have available at your location, is important, right? And so fully software-defined solutions are really going to be where things are at. That gives you a cloud-like experience, but you can deploy it anywhere: at the Edge, in the Core, or in the cloud, right? And that's really, really powerful. Philipp picked up on one that we see a lot: this idea of low-code, no-code, whether it's things like Node-RED in the IoT world, where you're able to stitch together a sequence of functions to answer questions in real time, or other, more sophisticated tools. That ability to, like you said, democratize what people can do with the data in real time is going to be extremely valuable as things move forward. And then the biggest thing we see, which we're really focused on, is that we need to make it as easy as possible to ingest any kind of data. The more data types you can bring in, the more problems you can solve. And so bringing on as many on-ramps and as much connectivity into other solutions as possible is really, really important. For all of that, the SDP team is really focused on prioritizing customers like Philipp's team in the RWTH WZL labs there, while finding those common patterns everywhere, so that we can make it the norm to be analyzing streaming data, not just historical batch data.
>> Right. That's outstanding. As you said, the world is moving to real-time analytics. Real-time data ingestion is absolutely critical there.
Just think of the problems that we don't even know about that we could solve. Guys, thank you for joining me today, talking about what WZL is doing with the Dell EMC streaming data platform, and all the innovations you've done so far, and what's coming in the future. We'll have to catch up in the next six months or so, and see what great progress you've made. Thank you for your time.
>> Thanks, Lisa.
>> Thank you.
>> For my guests, I'm Lisa Martin. You're watching a Cube Conversation.
(calm techno music)

Published Date : Oct 19 2021


