As datasets become larger and more complex, companies in the metallurgical industry are shifting from spreadsheet-based tools like Excel to code for their data analysis needs. If working in Excel can be compared to driving a car (most employees have a driver’s license), writing code is like operating a forklift. While a car gets us from point A to B quickly, for heavier workloads a forklift becomes indispensable. But like operating a forklift, writing good code is a niche skill that takes time to develop. And unlike an obvious need for forklift drivers, the metallurgical industry historically hasn’t felt the need to hire or develop lots of coding talent. In today’s red-hot labor market, it’s tough for metallurgical companies to attract coding talent. So, could existing engineers be trained to operate this new digital machine?

There’s never enough time in production

John is an experienced and slightly gray-haired Process Engineer working at a medium-sized aluminum recycler in the north-west of Germany. After attending a conference, his management team decides that advanced analytics will improve productivity and reduce downtime. They call John into a meeting and tell him he’s going to a coding boot camp in Bavaria. John thinks, “What’s a python?” but promptly agrees because he hasn’t traveled at the company’s expense since COVID. Besides, he has been getting really annoyed by all the Excel files cluttering his desktop, and the idea of a computer doing his work sounds great. After a week he returns to the plant with solid Python skills on top of his decades of process experience.

After a month John becomes frustrated. He’s full of ideas, but he always gets distracted before he can finish them. Software developers often complain that a shared desk in an open-plan office is too noisy for coding. Can you imagine what a process engineer working at a factory must feel like? How can John possibly worry about “giving all the variables in his code consistent names” when something in the cast house is on fire (literally) and one of his workers sprained an ankle during the night shift? John feels he doesn’t have enough time to sit down and focus on his code. And he’s right: there’s never enough time in production.

Delegating the Process Engineer’s coding

Sarah is a master’s student in the final year of her engineering degree. She liked math in high school and picked up coding at university to solve her math problems more quickly. She was offered an internship at the plant to write her thesis. In the first week of her internship, she meets John at lunch. He talks about all the things he wants to improve about the process using his (now a little rusty) Python skills – if only he had the time! Sarah likes coding in Python and offers to support John with some of his ambitions.

She writes Python code quickly but doesn’t know the process well and is therefore often lost. Luckily, she can count on John’s experience to guide her data analysis in the right direction. She ends up learning a lot from him. John still doesn’t have time to code himself. However, he knows enough about code to understand Sarah’s presentations. On some days he even manages to read through her code and correct some of her assumptions. Unlike John, Sarah doesn’t own a company phone and she’s not involved in day-to-day operations or meetings. Thus, she can focus uninterrupted for hours, digging deep into the data.

Picking low-hanging fruit

Over the course of Sarah’s internship, she and John develop two data applications: an automated dashboard in the control room showing important process parameters, and a report with key process KPIs that is automatically created and emailed to John at the end of each shift. On one occasion John spots some abnormal values in Sarah’s KPIs that reveal a broken temperature sensor, preventing an expensive problem down the line.
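To give a feel for how small such an end-of-shift report can be, here is a minimal Python sketch. It is not Sarah’s actual code: the function name shift_kpis, the parameter name melt_temperature_c, the readings and the plausible range are all made up for illustration. It summarizes each parameter and counts the readings that fall outside a range the process engineer considers plausible.

```python
from statistics import mean


def shift_kpis(readings, limits):
    """Summarize one shift: mean, min, max and out-of-range count per parameter.

    readings: dict mapping a parameter name to the list of values logged in the shift
    limits:   dict mapping a parameter name to its plausible (low, high) range
    """
    report = {}
    for name, values in readings.items():
        low, high = limits[name]
        report[name] = {
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
            # Readings outside the plausible range hint at a process or sensor problem.
            "out_of_range": sum(1 for v in values if not low <= v <= high),
        }
    return report


# Hypothetical example data: one temperature reading has dropped to zero.
readings = {"melt_temperature_c": [720.0, 725.0, 0.0, 715.0]}
limits = {"melt_temperature_c": (650.0, 800.0)}

report = shift_kpis(readings, limits)
for name, kpis in report.items():
    print(name, kpis)
```

The plausible ranges are exactly the kind of input that comes from someone like John rather than from the data itself.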

Neither John nor Sarah has ever used AI or set up “scalable streaming data pipelines on Kubernetes”. After all, they didn’t need big data and artificial intelligence to determine that sensor TP392 hasn’t sent any data in the last four days and is therefore likely broken. While advanced analytics has its merits, at metallurgical factories there’s lots of value to be gained by simply keeping an eye on the data and picking low-hanging fruit.
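The TP392 check is a good example of how simple such a rule can be. The following Python sketch is one possible way to write it, not the plant’s real implementation: the function name find_silent_sensors, the second sensor ID TP117 and all timestamps are hypothetical, and it assumes the time of the latest reading per sensor is already available.

```python
from datetime import datetime, timedelta


def find_silent_sensors(last_seen, now, max_silence=timedelta(days=4)):
    """Return the IDs of sensors whose latest reading is at least max_silence old."""
    return sorted(
        sensor_id
        for sensor_id, timestamp in last_seen.items()
        if now - timestamp >= max_silence
    )


# Hypothetical timestamps of the latest reading received from each sensor.
now = datetime(2023, 3, 10, 6, 0)
last_seen = {
    "TP392": datetime(2023, 3, 5, 22, 15),  # silent for more than four days
    "TP117": datetime(2023, 3, 10, 5, 59),  # reported a minute ago
}

silent = find_silent_sensors(last_seen, now)
print(silent)  # ['TP392']
```

A rule like this needs no model training at all, only a threshold that someone with process experience is comfortable with.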

The rise of the citizen data scientist

Sarah’s internship is coming to an end. John managed to convince his department to offer her a permanent position on his team. He’s disappointed that she declines the offer (she wants to work near her family in Berlin) but genuinely wishes her all the best.

One day at lunch, John is approached by Michael, who works as an automation engineer in an adjacent building at the plant. He attended Sarah’s final presentation and was impressed by the tools she used and the results she achieved. Michael is convinced of the value of data analytics and wants to contribute to the digital journey of the plant. He doesn’t have a mathematical background but is excited to use and extend the data analytics solutions that Sarah developed. And so Michael becomes the company’s first so-called “citizen data scientist”: an employee with domain knowledge but no prior analytics experience who is trained to use the tools developed by data scientists.

In the end, it’s easier and more cost-effective for plants like John’s to cultivate internal talent than to look for expensive data scientists who are difficult to hire and retain in the current labor market.


Key takeaways

  • Process engineers like John are well suited to develop data analysis applications because of their extensive domain knowledge, but they don’t have time to write code themselves.

  • Under the guidance of a process engineer, citizen data scientists like Michael and computer-savvy interns like Sarah can develop data analysis applications that add value.

  • Data analysis applications don’t have to be advanced. Because the metallurgical industry is a bit behind the curve when it comes to data analytics, a lot of value can be captured with simple rule-based approaches without resorting to AI.

  • If hiring external talent is not possible, developing citizen data scientists internally is a viable alternative.