Digital Twins. What even is that?

Digital twins help us make predictions and optimize our systems. Traditionally, they are built from domain knowledge, equations, and data. With machine learning, building them requires less domain knowledge.

"Digital twin", like any other buzzword, is used by many and understood by hardly anybody. So here, we want to at least describe what we mean when we use the term and why the concept might make sense in the first place.

Theory: How Isaac Newton almost built a digital twin

Theoretically speaking, a digital twin consists of models of, and data produced by, physical assets. Note that both parts are important here. A digital twin, to us, is a virtual representation of a physical system. Such a system can be a factory or a machine, but also a city or a building. Having such a representation allows us to try things out virtually that would be impossible or very expensive in the real world. An example that comes to mind is CAD (computer-aided design).

They also offer interesting applications for energy management, but before we get into those, let's first jump into a, hopefully, illustrative example of the theoretical background.

Legend has it that Isaac Newton "discovered" gravitation when he observed an apple falling from a tree. Arguably, this was the birth of yet another model that helps us describe our world.

Before his findings, there was just chaotic, unstructured data. Falling apples can have all kinds of data associated with them. Some (randomly chosen) examples include their color, age, initial height above ground, orientation relative to earth's rotational axis, the given name of the observer's mother-in-law, and the wind speed a week before they fell. Hardly any of that helps us determine the movement of the apple. Data alone, without a model, is not particularly useful.

This is where Isaac Newton's serendipitous moment allegedly changed everything. He introduced a model by omitting noise (unnecessary data), identifying and naming the important parts, and restricting their relations. Mathematically speaking, his model looks somewhat like this:

F = m · g

F is the force that acts on the apple (or any other object in earth's gravitational field), m is its mass, and g is earth's gravitational acceleration. Together they determine the change of the apple's velocity (its acceleration), and this relation is the basis for describing an apple's movement through space. This is great: we can derive a ton of insights about hypothetical cases from it, such as the trajectories of hypothetical apples. We can even determine the time it takes a hypothetical apple to fall from a hypothetical tree!

But what if we were, like most real people, interested in real apples falling from real trees? How would we go about this? Exactly: we would use data. For our example, the required data would be the height of a specific tree. And no, we would not need the apple's mass.
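To make this concrete, here is a minimal sketch of that white-box model in Python. The 3 m tree height is a made-up example value; note that the apple's mass does not appear anywhere:

```python
import math

G = 9.81  # earth's gravitational acceleration in m/s^2

def time_to_fall(height_m: float) -> float:
    """Time (in seconds) for an object to fall from rest through height_m
    meters, ignoring air resistance: h = 1/2 * g * t^2  =>  t = sqrt(2h / g)."""
    return math.sqrt(2 * height_m / G)

# A hypothetical apple on a hypothetical 3 m tree:
print(round(time_to_fall(3.0), 2))  # ~0.78 s
```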

Is this already a digital twin? Not quite. Isaac Newton fell short, but that was arguably not his fault, as back in 1665 there was no "digital" way to represent all of this. Luckily, we fixed that for you and built a digital twin of an apple falling from a tree. In case you ever wanted such a thing, feel free to play around with it here.

Digital Twins for industrial appliances

Ok, apples and trees aside: what if we wanted to use this approach in an industrial setting? No worries, the principle stays pretty much the same. Using mathematical equations and physics, we can emulate the behavior of appliances, buildings, and other parts of industrial facilities. These would be our models. As soon as these models are digital and linked with data from real-world assets, they become a digital twin and can be used to, for example, simulate scenarios or optimize operating strategies.

Take, for example, a water pump in a factory. The purpose of this pump is to move water from one place to another. As an energy manager, you might be interested in the pump’s energy demand. So for you, an appropriate model- an appropriate digital twin- should cover this aspect of the pump accurately enough.

Just like Isaac Newton, you decide on the relevant parts of reality and take them- and only them- into further consideration to answer your questions. A simple way to model the pump is to just use its technical specifications: its nominal electrical power demand. Our first model could look somewhat like this:

"Whenever the pump is active, it draws 2 kW of electrical power. When it is not active, it draws 0 kW. The pump is active about 50% of the time." Here, 2 kW, 0 kW, and 50% are pieces of data about your pump. The rest of the sentence is a very simple version of a model. Note that models do not necessarily come in the form of mathematical equations.
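This simple on/off model can be sketched in a few lines of Python, using only the three pieces of data from the statement above:

```python
# The very simple pump model: 2 kW when active, 0 kW otherwise,
# active about 50% of the time.
NOMINAL_POWER_KW = 2.0
DUTY_CYCLE = 0.5

def energy_kwh(hours: float) -> float:
    """Estimated energy converted by the pump over a period of `hours`."""
    return NOMINAL_POWER_KW * DUTY_CYCLE * hours

print(energy_kwh(24 * 7))    # last week: 168.0 kWh
print(energy_kwh(24 * 365))  # last year: 8760.0 kWh
```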

Using this simple model, you could, for example, determine how much energy the pump converted in the last week or the last year. You might then find that you want to understand the behavior better and refine your model to resemble this:

“The electrical power demand of the pump is 5x the amount of water provided by the pump. Per day, the pump provides us with 1000l of water.“

This model would increase the accuracy of your analysis, but it still would not reflect "reality". Generally, when talking about models, it makes sense to remember the following quote:

"…no models are true—not even the Newtonian laws. When you construct a model you leave out all the details which you, with the knowledge at your disposal, consider inessential…. Models should not be true, but it is important that they are applicable, and whether they are applicable for any given purpose must of course be investigated. This also means that a model is never accepted finally, only on trial."

— Rasch, G. (1960), Probabilistic Models for Some Intelligence and Attainment Tests, Copenhagen: Danmarks Paedagogiske Institut, pp. 37–38; republished in 1980 by University of Chicago Press

An important resource that modern facilities provide, however, is IoT data (i.e. sensor data). As part of a modern facility, there might be sensors attached to the pump measuring some of the aforementioned physical quantities. Data generated by these sensors can be used to make the models more accurate and/or to provide information on how well a given model corresponds with reality. In the next and final part, let's investigate this further.

IoT and digital twins

Data from sensors (IoT data) provides a great resource for building digital twins. Continuing the pump example from above, this data might come from two sensors: one measuring the volume flow rate through the pump, the other the pump's electrical power demand. Looking at example data, we see some kind of relationship between these two measurements. We also see some other things, such as the messiness of real-world data (more on that in another blog post). Now, however, we want to concentrate on building a digital twin. To do so, just as before, we need to come up with some kind of model to make sense of the data.

Until now, we built models from domain knowledge, describing things we know about the world in either words or mathematical equations. Today, however, there is an alternative/complementary approach: we can build models from data using statistics. A fancy word for this approach would be machine learning. On a very high level, we distinguish three different approaches based on how much domain knowledge they need.

  • white-box: as in the apple example, we postulate what we expect and set parameters manually. Such models need a minimal amount of data, but a lot of domain knowledge.
  • black-box: basically statistical approaches to describing the behavior of complex systems. No domain-specific rules are needed, but a lot of data is. Examples mostly come from sociology. Technologically, this is the realm of neural networks, decision trees, and others.
  • gray-box: a mix of the two approaches above. This is what we are going to do here.

Following a gray-box approach, we use both data and domain knowledge to describe the behavior of our pump. From an engineering perspective, we would expect the flow rate and the power demand to be related somehow: the more power we put into the pump, the more water it can move. A very simple version of the model we expect, written as a mathematical equation, might look like this:

P = k · V + d

The electrical power demand P of the pump is related to the volume flow rate V through the pump, and the relation is linear. This equation, however, has two parameters (d and k), which we need to estimate. Here is where we start using IoT data: to derive them, we use statistics- linear regression, specifically.

Just as before, we now have a digital twin of a real-world asset and can, for example, predict its behavior. In contrast to a purely white-box approach, however, we used only very little domain-specific knowledge by utilizing available IoT data. This makes our approach more scalable and versatile, but it also needs to be used cautiously. We will cover more on this in another post. Here, we will continue with the question: "Why would I be interested in any of this in the first place?"

Use Case: Operations optimization using Digital Twins

In reality, we are often interested in operating physical assets more efficiently. Being able to predict the behavior of such a system (which is exactly what our digital twins allow us to do) is a large part of being able to do so.

Imagine you had a factory producing orders for customers and a powerful piece of software that does the following: for any given scenario (order), it determines the optimal way to operate your facility through an optimization process that eventually provides you with a recommendation.

This is where digital twins become relevant, as they are an important part of the magical orange box in the middle. The optimization part of the tool we built itself consists of three parts. For every scenario that it receives, a series of optimization runs is carried out. An optimization run starts with the creation of a plan; a digital twin is then used to estimate the results of that particular plan, which are, in turn, evaluated. This process is repeated until a stop criterion (e.g. a maximum number of simulation runs) is reached. Whatever led to the "best" evaluation result will be recommended to the user.
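The loop described above can be sketched as follows. Everything here is illustrative: the "plan" is reduced to a single setpoint, the toy twin uses a made-up cost landscape, and random search stands in for whatever plan-creation strategy a real tool would use:

```python
import random

def digital_twin(plan: float) -> float:
    """Toy twin: estimates the cost of running a plan.
    A made-up cost landscape with its minimum at plan = 0.3."""
    return (plan - 0.3) ** 2 * 100 + 10

def optimize(n_runs: int = 100, seed: int = 42):
    rng = random.Random(seed)
    best_plan, best_cost = None, float("inf")
    for _ in range(n_runs):        # stop criterion: max number of runs
        plan = rng.random()        # 1. create a candidate plan
        cost = digital_twin(plan)  # 2. estimate its result with the twin
        if cost < best_cost:       # 3. evaluate and keep the best so far
            best_plan, best_cost = plan, cost
    return best_plan, best_cost

plan, cost = optimize()
print(round(plan, 2))  # recommended plan, near the optimum at 0.3
```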

Depending on what your underlying factory or facility looks like, you might have to adapt the specific digital twin that you use, but the overall principles stay the same. And we will provide some examples in future posts to illustrate how this idea can be used in real-world examples. For now, we just want to answer the last, maybe most important question.

Is this really necessary?

After all that, you might now ask yourself why one would do all this in the first place. Why would anybody go to these lengths to create a virtual representation of an asset or even a whole factory? To answer that, let's first have a look at the alternatives- the state of the art, if you will.

Operations optimization in general, and energy efficiency in particular, are still very much project-driven. This means that every now and then a group of engineers gathers to analyze a given plant or facility and define measures to make it more energy-efficient. After that, the measures are implemented- until the next project starts.

Digital twins are a powerful alternative to that. Using a virtual representation of a facility, it is possible to continuously monitor the physical system. Moreover, we can carry out experiments, asking "what if…" questions and selecting our operating strategy based on the answers.

In contrast to a conventional, project-driven approach, this offers the following main advantages:

Accommodate changing scenarios: Both the physical facility and the operating conditions will change in the future. If an optimization project was carried out before these changes came into effect, its results become obsolete. A digital twin, on the other hand, can be adapted to reflect changing conditions and "automatically" lead to relevant solutions that reflect these changes.

Accommodate changing targets: Similar to the previous point, your targets as an organization might change over time. Imagine that the way you operate a factory was defined at a time when energy and/or emissions were irrelevant. This will change, and, in turn, your plans will become obsolete. Using the approach described here, you could, potentially, just adapt the way you evaluate simulation results and, again, come up with valid recommendations despite having changed what you define as "good".

Make optimizations scalable: This is an interesting one. A project is always purely manual labor. Take, for example, an energy efficiency consultant who optimizes a compressed air system in one factory. If the same consultant were to do "the same" in another factory, he or she would not be able to take anything from the previous assignment apart from experience. Using digital twins, however, models can be transferred from one place to another- as long as they reflect similar machinery.

This was a lot to take in, and we really appreciate it if you made it this far. Let us know what you think about the ideas mentioned here. Get in touch if you have any questions. And stay tuned for follow-ups that we will post here in due time.

Further Reading

green cement – Greenwashing or feasible promise?

Philosophical investigations into engineering units

How to never mess up your unit conversions anymore