In manufacturing plants there is a multitude of systems that generate useful data. Process data produced by factory machines consists of time series and is generally saved in so-called data historians (like OSIsoft’s PI System). Data about the completion of a given customer order is generated and saved in the Manufacturing Execution System, or MES. The MES is almost always backed by a relational database (like Oracle). Finally, data about the assignment of customer orders to different plants and machines is handled by the Enterprise Resource Planning, or ERP, system. Commonly, the ERP is implemented in SAP.

Let’s imagine an innovative data use case where we want to inform a client in the automotive industry of the total emissions footprint of the aluminum they bought from our plant in the year 2022. This calculation needs process variables such as gas and electricity consumption (historian) for each machine step (MES), across all of the customer’s orders in all the plants (ERP). In other words, the data from these three systems has to be joined. Technically this is certainly possible, but it is tricky for two reasons. First, the data in these systems is completely separated: it resides on different networks or even in different plants, and therefore has to be moved to one place. Second, because these systems are provided by different vendors and represent very different things (a temperature measurement looks very different from a sales order), they have completely different data models. As a result, properly combining and joining the data requires extensive engineering effort, which takes a lot of time and costs a lot of money.
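To make the join concrete, here is a minimal Python sketch of the calculation. Everything in it is an assumption made for illustration: the record layouts, the names (`erp_orders`, `mes_steps`, `historian`, `emissions_by_customer`), the sample values and the emission factor are all hypothetical and do not reflect the schema of any real ERP, MES or historian. It also only covers electricity, not gas, to keep the example short.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical ERP extract: which customer placed which order at which plant.
erp_orders = [
    {"order_id": "SO-1", "customer": "AutoCo", "plant": "P1"},
    {"order_id": "SO-2", "customer": "OtherCo", "plant": "P1"},
]

# Hypothetical MES extract: the machine step that processed each order.
mes_steps = [
    {"order_id": "SO-1", "machine": "furnace-1",
     "start": datetime(2022, 3, 1, 8), "end": datetime(2022, 3, 1, 10)},
    {"order_id": "SO-2", "machine": "furnace-1",
     "start": datetime(2022, 3, 1, 10), "end": datetime(2022, 3, 1, 12)},
]

# Hypothetical historian extract: (machine, hour start, kWh used in that hour).
historian = [
    ("furnace-1", datetime(2022, 3, 1, 8), 100.0),
    ("furnace-1", datetime(2022, 3, 1, 9), 120.0),
    ("furnace-1", datetime(2022, 3, 1, 10), 90.0),
    ("furnace-1", datetime(2022, 3, 1, 11), 110.0),
]

# Placeholder emission factor, not a real-world value.
KG_CO2_PER_KWH = 0.5


def emissions_by_customer(orders, steps, readings, factor):
    """Join ERP -> MES -> historian and total the emissions per customer."""
    customer_of = {o["order_id"]: o["customer"] for o in orders}
    totals = defaultdict(float)
    for step in steps:
        customer = customer_of[step["order_id"]]
        for machine, ts, kwh in readings:
            # A reading belongs to a step if it is from the same machine
            # and falls inside the step's time window.
            if machine == step["machine"] and step["start"] <= ts < step["end"]:
                totals[customer] += kwh * factor
    return dict(totals)


totals = emissions_by_customer(erp_orders, mes_steps, historian, KG_CO2_PER_KWH)
print(totals)
```

Even in this toy version, the three sources are only linked through two loosely related keys: an order ID shared by the ERP and MES, and a machine name plus a time window shared by the MES and the historian. In real systems those keys rarely line up this neatly, which is where most of the engineering effort goes.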

As a solution, manufacturing companies have started implementing (cloud) data lakes as a crucial part of their digital transformation journey. Data lakes act as a single source of truth because they combine the data from all these separate systems in a single, robust database that is easily accessible from anywhere over the internet. This solves the first problem of separate data silos. However, it does not address the second problem, the discrepancy in data models, because the source systems are often simply replicated in the data lake: there is one table with MES data and another table with pressure sensor values from the historian.

We may ask ourselves: do we really need to carry the historical notions of “MES”, “ERP” and “Historian” into the future? Can we instead come up with a simpler data model by, perhaps, organizing **all** the company’s data into a single hierarchy of region, plant, production line, machine, sensor, and so on? Such a solution, which combines all the company’s data in a single source of truth under one common data model, has recently been dubbed a Unified Namespace, or UNS, and is quickly gaining a lot of attention in the Industry 4.0 world.
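As a rough illustration of what such a hierarchy could look like, the sketch below stores every data point under a slash-separated path of the form region/plant/line/machine/attribute. The path convention, the helper names (`build_namespace`, `lookup`) and the sample values are assumptions made for this example only; they are not a standard or a specific product’s API.

```python
def build_namespace(records):
    """Fold (path, value) pairs into one nested dictionary."""
    root = {}
    for path, value in records:
        *parents, leaf = path.split("/")
        node = root
        for part in parents:
            # Create intermediate levels (region, plant, line, ...) on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return root


def lookup(tree, path):
    """Walk the hierarchy along a slash-separated path."""
    node = tree
    for part in path.split("/"):
        node = node[part]
    return node


# Hypothetical data points that would traditionally live in separate systems:
# a sensor value (historian), the order on a machine (MES), its customer (ERP).
records = [
    ("europe/plant-a/line-1/furnace-1/temperature", 712.5),
    ("europe/plant-a/line-1/furnace-1/current_order", "SO-1"),
    ("europe/plant-a/line-1/furnace-1/customer", "AutoCo"),
    ("europe/plant-a/line-2/press-3/pressure", 4.2),
]

uns = build_namespace(records)
print(lookup(uns, "europe/plant-a/line-1/furnace-1"))
```

The point of the sketch is that a consumer asks for a location in the hierarchy and gets everything known about it, without needing to know which source system each value originally came from.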