Fusion Plasmas Meet Their Match in Reinforcement Learning
A team of researchers at DeepMind and the Swiss Federal Institute of Technology in Lausanne, Switzerland (EPFL), has used a kind of AI called deep reinforcement learning (RL) to control the magnetic coils of a tokamak, a donut-shaped reactor used for fusion research and one of the leading candidates to generate electric power from fusion. Tokamaks create a fusion reaction within a hot plasma inside a strong magnetic field, all controlled by a structure of magnetic coils. Though AI has been used in fusion research before for things like ex post facto analysis, this is the first time it’s been used to directly control a tokamak. This experimental first application of RL to tokamak control could hint at the promise of future applications of AI to help achieve higher fusion efficiencies.
“We are in early days on this,” said Martin Riedmiller, the head of the control team at DeepMind and an author of the new paper. He says that in the future, the conversation between AI and fusion researchers could lead to the development of completely new ways of achieving and sustaining fusion reactions.
RL algorithms learn by trial and error: they try out actions, see how well each one works, and gradually settle on the strategy that is most effective. To train their algorithm, the researchers exposed it to mathematical simulations of the physics of fusion.
“The actual quality of the underlying physics models that we use to do the simulations of fusion reactors has really, really improved,” said Frederico Felici, a researcher at EPFL and another author of the paper.
The researchers trained the system with an actor-critic method, which pairs two neural networks. One network, the critic, rates how good a given control action is likely to be, while the other, the actor, uses that feedback to learn how to control the tokamak’s magnetic coils.
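To make the actor-critic idea concrete, here is a minimal sketch in Python. It is an illustration only, not the researchers' code: the five-position "simulator," the reward, the learning rates, and every name in it are assumptions invented for this example, and simple lookup tables stand in for the neural networks. The "plasma" sits at one of five positions, the goal is to hold it at the middle one, and the controller learns only by trial and error against the toy simulator.

```python
import math
import random

# Hypothetical toy problem (an assumption for illustration, not tokamak physics).
N_POSITIONS = 5          # positions 0..4
TARGET = 2               # position the controller should hold
ACTIONS = (-1, 0, 1)     # move left, stay, move right


def step(pos, action_index):
    """Toy simulator: apply the move, clip to the grid, return (new_pos, reward)."""
    new_pos = min(N_POSITIONS - 1, max(0, pos + ACTIONS[action_index]))
    return new_pos, -abs(new_pos - TARGET)


def softmax(prefs):
    """Turn action preferences into probabilities."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(episodes=3000, steps=20, lr_actor=0.1, lr_critic=0.1, gamma=0.9, seed=0):
    """One-step actor-critic with tables standing in for the two networks."""
    rng = random.Random(seed)
    prefs = [[0.0] * len(ACTIONS) for _ in range(N_POSITIONS)]  # actor
    values = [0.0] * N_POSITIONS                                # critic
    for _episode in range(episodes):
        pos = rng.randrange(N_POSITIONS)
        for _step in range(steps):
            probs = softmax(prefs[pos])
            action = rng.choices(range(len(ACTIONS)), weights=probs)[0]
            new_pos, reward = step(pos, action)
            # Critic: was the outcome better or worse than it expected?
            td_error = reward + gamma * values[new_pos] - values[pos]
            values[pos] += lr_critic * td_error
            # Actor: make the chosen action more likely if the critic was
            # pleasantly surprised, less likely if it was disappointed.
            for a in range(len(ACTIONS)):
                grad = (1.0 if a == action else 0.0) - probs[a]
                prefs[pos][a] += lr_actor * td_error * grad
            pos = new_pos
    return prefs, values


def greedy_moves(prefs):
    """Most-preferred move at each position, as -1, 0, or +1."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda a: row[a])] for row in prefs]


prefs, values = train()
print(greedy_moves(prefs))
```

After training, the preferred move at each position should point toward the target, and the critic should rate the target position more highly than the edges. The real system differs in nearly every detail, but the division of labor is the same: one component scores outcomes, and the other adjusts its behavior based on those scores.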
After the algorithm was trained in the simulated environment, the researchers tested it on an actual tokamak: the Variable Configuration Tokamak (also called Tokamak à Configuration Variable or TCV) at EPFL. First, the researchers used traditional control methods to form the plasma and establish its location and current. Then they “handed off” control to the RL system. Because making changes to the actual fusion process can be dangerous and destructive, the algorithm was never trained on the real machine. It learned only from simulations, a “zero-shot” transfer from training to the real world, as the researchers write in their paper.
“It’s extremely important that they were able to show that we were able to build this model using the simulated environment and apply it—and it actually worked on the real experiment,” said Chris Hansen, a senior research scientist at the University of Washington who does research on fusion and plasma science and was not involved in the study. “You want [the algorithm] to work from day one.”
The researchers initially tested the system by starting the experiment, increasing the instability of the plasma, and bringing it back down to the plasma’s initial condition. After this basic test, the group experimented with different plasma configurations. In essence, they were able to “sculpt” plasma into different shapes, better enabling them to study what kinds of structure work the most effectively for fusion. The team created the more typical elongated, oval-like shapes, one nicknamed the “snowflake” configuration, and one that looks like a triangle on its side. They also formed a pattern of two independent “droplets” of plasma in the reactor for the first time.
This cutaway rendering of the Swiss Federal Institute of Technology’s experimental tokamak fusion reactor reveals the complex layers of plasma equilibrium coils involved. Image: DeepMind
“The shape has a fundamental effect on the quality of the plasma confinement, so how well the plasma keeps the heat inside, and on the stability of the plasma—to what extent it is prone to whatever unstable events might happen,” said Felici. Events like disruptions, in which the plasma escapes the magnetic field, interrupt the reaction and can even cause damage.
This RL-based control method is simpler than other methods for fusion control, though it’s not necessarily more effective. For instance, the usual method of controlling a tokamak uses several independent controllers working in tandem. The new method replaces this system with a single controller.
In the future, the researchers hope to come up with ways to simulate and study the internal dynamics of different plasma configurations, not only magnetic control of the reactor’s coils. Using RL also has some inherent disadvantages, they add. Any deep learning system, after all, is a “black box.” Because the ways the system comes to its conclusions aren’t obvious, there wouldn’t be a clear way to know what happened if something went wrong. Nevertheless, RL shows clear advantages in controlling a notoriously unstable system, which is why the researchers remain optimistic and plan to keep improving on their initial successes.
“It’s just really exciting to see where this can go in the future,” said Hansen.
The results of the group’s work were published earlier this month in the journal Nature.