Julian Liebl, an industrial technologist with a BA in Game Design, explains in this conversation how to neutralize errors before they even occur.
How do you move from game design to monitoring?
I studied game design not because of gaming but because of the technology. But there is overlap. I really don’t like developing something that I can’t see “moving”. I want to see the impact and result of my development and understand how it works. Monitoring compensates me for the things I miss when developing backend software.
What exactly is your job?
I have been a member of a team dealing with POI (point of interest) services for months now. My tasks revolve around developing and implementing monitoring, in particular for the Content Aggregators. That is software that aggregates POI information from several providers, compares it, and sends only the relevant information to the requester.
To what end?
As those responsible for infotainment services in the vehicle, we have to ensure the stability and speed of the services. However, we run into the problem of not having direct access to the customer’s protected test and integration environment. So whenever there is an error alert, we have to rely on the data the customer provides. Since error alerts do not always reach us immediately, it became more and more difficult for us to trace errors back. That’s why our team decided to build a monitoring tool on top of the customer-supplied data, which allows us to attribute an error to its source at the precise time it occurred.
Why didn’t you use the usual monitoring tools?
Our approach differs in that we integrate the monitoring more deeply with the software and check data packets more precisely and at a higher frequency. Where is the packet? Which “stations” did it pass? Has it arrived successfully at its destination? If not, where did it get stuck? How fast did it travel from A to B to C and D? Which states did it go through? It is only because we evaluate every 30 seconds that we obtain more information about stability, performance and capacity utilization. With a microservice architecture this is pure gold, not only for operations but also for developers.
Developers get a better grasp of how well the services are running, especially after updates, where the bottlenecks are, and how to keep improving performance. We can detect early warning signs and interpret them. We can neutralize errors before they reach the user.
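The packet questions in this answer (which stations did it pass, where did it get stuck, how fast did it travel) can be illustrated with a small Python sketch. It is not the team's actual tool: the `StationEvent` record, the `trace_packet` function, and the idea that every station reports a timestamped sighting are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StationEvent:
    """One sighting of a packet at a station (hypothetical record layout)."""
    packet_id: str
    station: str
    timestamp: float  # seconds since a common reference point


def trace_packet(events, packet_id, route):
    """Summarize how far one packet travelled along the expected route."""
    # Remember the first time the packet was seen at each station.
    seen = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        if event.packet_id == packet_id and event.station not in seen:
            seen[event.station] = event.timestamp

    # Walk the expected route until the first station with no sighting.
    visited = []
    for station in route:
        if station not in seen:
            break
        visited.append(station)

    # Travel time for each hop between consecutive visited stations.
    hops = {
        f"{a}->{b}": seen[b] - seen[a]
        for a, b in zip(visited, visited[1:])
    }
    arrived = len(visited) == len(route)
    # If the packet did not arrive, the last visited station is where it stopped.
    stuck_at = None if arrived or not visited else visited[-1]
    return {"visited": visited, "hops": hops, "arrived": arrived, "stuck_at": stuck_at}


events = [
    StationEvent("p1", "A", 0.0),
    StationEvent("p1", "B", 0.5),
    StationEvent("p1", "C", 2.0),
    StationEvent("p2", "A", 1.0),
]
report = trace_packet(events, "p1", route=["A", "B", "C", "D"])
```

Here `report` says that packet `p1` passed A, B and C, never reached D, and spent noticeably longer on the hop from B to C than from A to B. That is the kind of early hint at a bottleneck the answer describes.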
How did you realize this approach technologically?
As always, the customer provides us with a Check JS (put simply, the “diagnostics page” of the respective application) that lets us look at the status of the service from the perspective of the test and integration environment. Unfortunately, it only supplies current data, so when error alerts arrive late, we get no information about the status at the time the error occurred. To solve that problem, we decided to save the customer-supplied data in a time-series database.
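The idea of keeping timestamped snapshots, so that a late error report can still be matched to the status at that moment, might look roughly like the following Python sketch. Everything in it is an assumption for illustration: `SnapshotHistory` is an in-memory stand-in for a real time-based database, and `fetch_status` is a hypothetical callable representing a read of the diagnostics page.

```python
import bisect


class SnapshotHistory:
    """In-memory stand-in for a time-based store of status snapshots."""

    def __init__(self):
        self._times = []      # sorted timestamps in seconds
        self._snapshots = []  # snapshot recorded at the matching timestamp

    def record(self, timestamp, snapshot):
        """Insert a snapshot while keeping the timestamps sorted."""
        index = bisect.bisect_right(self._times, timestamp)
        self._times.insert(index, timestamp)
        self._snapshots.insert(index, snapshot)

    def status_at(self, timestamp):
        """Return the latest snapshot taken at or before the given time."""
        index = bisect.bisect_right(self._times, timestamp)
        if index == 0:
            return None  # nothing was recorded that early
        return self._snapshots[index - 1]


def poll_once(history, fetch_status, now):
    """Fetch the current status and store it together with its timestamp."""
    history.record(now, fetch_status())


# Simulated polling every 30 seconds against a fake diagnostics source.
fake_statuses = iter([{"state": "ok"}, {"state": "degraded"}, {"state": "ok"}])
history = SnapshotHistory()
for now in (0, 30, 60):
    poll_once(history, lambda: next(fake_statuses), now)

# An error report arrives late and refers to second 45.
status_during_error = history.status_at(45)
```

Although the source was already reporting `ok` again by the last poll, the lookup for second 45 still returns the `degraded` snapshot taken at second 30.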
We opted for InfluxDB to persist the monitoring data because it is well suited to time-based data sets. It answers queries within seconds, lets us define time intervals, and deletes old, no longer relevant data sets automatically. The service we developed regularly queries the data made available and saves it to the database. With the help of Grafana, an open-source dashboard, we visualize the states of our services and virtual machines. In addition, we defined alerts within Grafana that automatically notify us about critical states via email or chat. These range from error cases to resource shortages, which lets us respond quickly but also focus on other things as long as everything is running smoothly.
Back to the Roots: Your top 3 games?
Prey, Codename Eagle, Witcher 3. But I’m not a serious gamer. That was a phase I had between 16 and 17, but it passed rather quickly.
I indulge my urge to play with my drones. I started with a DJI Phantom 2 but soon realized that I couldn’t repair it. For a novice, that makes the hobby expensive. So I started to build drones myself. I’m particularly interested in cinematic recordings. That’s why my two big drones are equipped with a gimbal to keep the footage as smooth as possible. I combined it with FPV and head tracking: the camera image of the drone is transmitted to a pair of FPV goggles I wear on the ground, and the camera moves just as I move my head. As I said earlier, I want to see the impact of my developments, and in this case quite literally, I guess.