March 1, 2024

Expert explains the ‘physics of AI’


Learning actions from data. We observe a physical system of interacting degrees of freedom (gray dots), whose precise interactions are unknown (shaded areas). We train a neural network on system measurements. The network learns an unsupervised estimate of the distribution of the training data. We extract the action of the network parameters layer by layer, using a diagrammatic language. The final action coefficients A(k) represent learned interactions (pink nodes). Credit: Physical Review X (2023). DOI: 10.1103/PhysRevX.13.041033


The development of a new theory is normally associated with the great names in physics. You can think of Isaac Newton or Albert Einstein, for example. Many Nobel Prizes have already been awarded for new theories.

Researchers at Forschungszentrum Jülich have now programmed an artificial intelligence that has also mastered this feat. Their AI is capable of recognizing patterns in complex data sets and formulating them into a physical theory. The findings are published in the journal Physical Review X.

In the following interview, Prof. Moritz Helias from the Institute for Advanced Simulation at Forschungszentrum Jülich (IAS-6) explains what the "physics of AI" is all about and how it differs from conventional approaches.

How do physicists create a new theory?

Generally, you start with observations of the system and then try to propose how its different components interact with each other to explain the observed behavior. New predictions are then derived from this proposal and put to the test.

A well-known example is Isaac Newton’s law of gravitation. It not only describes the gravitational force on Earth, but can also be used to predict the movements of planets, moons and comets – as well as the orbits of modern satellites – quite accurately.

However, the way in which such hypotheses are reached always differs. You can start with general principles and basic equations of physics and derive hypotheses from them, or you can choose a phenomenological approach, limiting yourself to describing observations as accurately as possible without explaining their causes. The difficulty lies in selecting a good approach among the countless possible approaches, adapting it if necessary and simplifying it.

What approach are you taking with AI?

In general, it involves an approach known as “physics for machine learning.” In our working group, we use physics methods to analyze and understand the complex function of an AI.

The crucial new idea, developed by Claudia Merger in our research group, was to first use a neural network that learns to accurately map the complex observed behavior onto a simpler system. In other words, the AI aims to simplify all the complex interactions we observe between system components. We then take the simplified system and use the trained AI to construct an inverse mapping. By returning from the simplified system to the complex one, we develop the new theory.

On the way back, the complex interactions are built up piece by piece from the simplified ones. Ultimately, the approach is therefore not that different from that of a physicist, with the difference that the way interactions are put together is now read off from the parameters of the AI. This perspective on the world – explaining it based on the interactions between its different parts that follow certain laws – is the basis of physics, hence the term "AI physics".
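As a minimal, hedged analogue of this idea (a toy sketch, not the authors' method): for Gaussian data, a linear "network" that whitens the data maps it onto independent components, i.e., a simpler system. Inverting that map recovers the pairwise interaction coefficients of the data distribution (its precision matrix), which play the role of the second-order action coefficients. All names and the system size below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth pairwise interactions (precision matrix) of a 3-component system:
# p(x) ∝ exp(-x^T J x / 2).
J_true = np.array([[ 2.0, -0.8,  0.0],
                   [-0.8,  2.0, -0.5],
                   [ 0.0, -0.5,  2.0]])
cov = np.linalg.inv(J_true)

# "Measurements" of the system.
x = rng.multivariate_normal(np.zeros(3), cov, size=200_000)

# Learn a simple invertible map x -> z that makes the components independent
# (whitening); here it is fitted directly from the sample covariance.
cov_hat = np.cov(x.T)
L = np.linalg.cholesky(cov_hat)
W = np.linalg.inv(L)            # forward map: z = W x, with z ~ N(0, I)

# Invert the map and read off the learned interactions: J = W^T W.
J_learned = W.T @ W

print(np.round(J_learned, 2))   # close to J_true
```

In the Gaussian case this inversion is exact and linear; the paper's contribution is the nonlinear, layer-by-layer generalization, where the diagrammatic extraction also yields higher-order coefficients A(k).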

In which applications has the AI been used?

We use a dataset of black-and-white images with handwritten numbers, for example, which is often used in research when working with neural networks. As part of her doctoral thesis, Claudia Merger investigated how small substructures in the images, such as the edges of the numbers, arise from interactions between pixels. The method finds groups of pixels that tend to be bright together and therefore contribute to the shape of a digit's edge.
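The notion of "pixels that tend to be bright together" can be illustrated with a toy calculation (this is only a stand-in for the digit data and uses plain pixel covariance, not the paper's diagrammatic extraction): generate small images in which a vertical "stroke" lights up as a unit, and check that stroke pixels co-vary with each other but not with the background.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for handwritten-digit data: 5x5 images in which a vertical
# "stroke" (column 2) lights up as a whole, on top of sparse pixel noise.
n = 10_000
imgs = rng.random((n, 5, 5)) < 0.1        # sparse background noise
stroke_on = rng.random(n) < 0.5           # stroke present in half the images
imgs[stroke_on, :, 2] = True
x = imgs.reshape(n, -1).astype(float)

# Pixel-pixel covariance: groups of pixels that tend to be bright together.
C = np.cov(x.T)

# Flat indices of the stroke column (row * 5 + 2).
stroke_pixels = [2, 7, 12, 17, 22]

print(C[2, 7])    # two stroke pixels: clearly positive covariance
print(C[2, 6])    # stroke vs. background pixel: near zero
```

In the actual work, such co-activation structure is expressed as learned interaction coefficients rather than raw covariances, which also captures higher-order (beyond pairwise) dependencies.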

How high is the computational effort?

Using the AI is a trick that makes the calculations possible in the first place: the number of possible interactions grows very quickly, so without it you could only treat very small systems. Even so, the computational effort remains high, because systems with many components admit a very large number of possible interactions.

However, we can parameterize these interactions efficiently, so we can now handle systems with around 1,000 interacting components, that is, image areas with up to 1,000 pixels. With further optimization, much larger systems should become possible in the future.
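The combinatorial growth mentioned above is easy to make concrete: among N components there are "N choose k" possible k-component interactions, so even pairwise terms number in the hundreds of thousands for N = 1,000 pixels, and higher orders explode from there.

```python
from math import comb

# Number of possible k-component interactions among N degrees of freedom.
N = 1000  # e.g., an image area with 1,000 pixels
for k in (2, 3, 4):
    print(f"k={k}: {comb(N, k):,} possible interactions")
```

This is why a naive enumeration of interaction coefficients is hopeless and an efficient parameterization is needed.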

How does this approach differ from other AIs like ChatGPT?

Many AIs aim to learn a theory of the data used to train them. However, the theories such AIs learn usually cannot be interpreted: they are implicitly hidden in the parameters of the trained AI. In contrast, our approach extracts the learned theory and formulates it in the language of interactions between system components, which underlies physics.

It therefore belongs to the field of explainable AI, specifically "AI physics", since we use the language of physics to explain what the AI has learned. We can use the language of interactions to build a bridge between the complex inner workings of AI and theories that humans can understand.

More information:
Claudia Merger et al, Learning interaction theories from data, Physical Review X (2023). DOI: 10.1103/PhysRevX.13.041033
