SymFormer: End-to-end symbolic regression using transformer-based architecture

Martin Vastl, Czech Technical University in Prague
Jonáš Kulhánek, Czech Technical University in Prague
Jiří Kubalík, Czech Technical University in Prague
Erik Derner, Czech Technical University in Prague
Robert Babuška, Delft University of Technology


Many real-world problems can be naturally described by mathematical formulas. Recently, neural networks have been applied to the task of finding formulas from observed data. We propose a novel transformer-based method called SymFormer, which we train on a large number of formulas (hundreds of millions). After training, our method is considerably faster than state-of-the-art evolutionary methods. The main novelty of our approach is that SymFormer predicts the formula by outputting the individual symbols and the corresponding constants simultaneously. This leads to better performance in terms of fitting the available data than alternative transformer-based models. In addition, the constants provided by SymFormer serve as a good starting point for subsequent tuning via gradient descent to further improve the performance. We show on a set of benchmarks that SymFormer outperforms two state-of-the-art methods while having faster inference.

[Figure: SymFormer architecture overview]
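To illustrate the idea of tuning predicted constants by gradient descent, here is a minimal sketch. It is not the authors' implementation: the function name `refine_constant`, the learning rate, the step count, the synthetic data, and the single-constant model $c \cdot x \cdot \exp(-x)$ are all assumptions chosen for illustration. The symbolic part of the formula is held fixed and only the constant is updated, starting from a value close to the target.

```python
import numpy as np

def refine_constant(x, y, c_init, lr=2.0, steps=200):
    """Hypothetical helper: gradient descent on the mean squared error
    of the model c * x * exp(-x), updating only the constant c."""
    basis = x * np.exp(-x)              # fixed symbolic part of the formula
    c = float(c_init)                   # start from the predicted constant
    for _ in range(steps):
        residual = c * basis - y
        grad = 2.0 * np.mean(residual * basis)  # d/dc of mean(residual**2)
        c -= lr * grad
    return c

# Synthetic, noise-free data (an assumption made for this sketch).
x = np.linspace(0.1, 5.0, 200)
y = -60.9 * x * np.exp(-x)

# Start from a nearby constant, as a model prediction might provide.
c_refined = refine_constant(x, y, c_init=-61.2)
print(c_refined)
```

Because the model is linear in the constant, this particular problem could also be solved in closed form by least squares. Gradient descent is used here only because it is the tuning procedure named in the text, and it extends to constants that enter the formula nonlinearly.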

Predictions on unseen data

In this section, we present qualitative results on the test dataset. GT denotes the ground-truth formula and Pred the formula predicted by SymFormer.

Example 1: GT: $(1+x^{-2})^{-0.5}$, Pred: $\sin(|\mathrm{atan}(x)|)$
Example 2: GT: $-60.9 \cdot x \cdot \exp(-x)$, Pred: $0.002x^3 - 61.2 \cdot x \cdot \exp(-x)$
Example 3: GT: $x-x^3+y^{-1}\sin{(y)}$, Pred: $x-x^3+y^{-1}\sin{(y)}$
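In Example 1, the prediction looks different from the ground truth but describes the same function: since $\sin(\mathrm{atan}(x)) = x / \sqrt{1+x^2}$, we have $\sin(|\mathrm{atan}(x)|) = |x| / \sqrt{1+x^2} = (1+x^{-2})^{-0.5}$ for $x \neq 0$. The short numerical check below is a sketch of how this can be confirmed. The sampling interval and the exclusion of a small neighborhood around zero are assumptions made for the illustration.

```python
import numpy as np

# Sample both sides of zero; the ground truth is undefined at x = 0.
x = np.concatenate([np.linspace(-10.0, -0.01, 500),
                    np.linspace(0.01, 10.0, 500)])

gt = (1.0 + 1.0 / x**2) ** -0.5          # ground truth: (1 + x^-2)^-0.5
pred = np.sin(np.abs(np.arctan(x)))      # prediction: sin(|atan(x)|)

max_abs_diff = np.max(np.abs(gt - pred))
print(max_abs_diff)                      # difference is at rounding-error level
```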

Comparison to previous approaches

We have evaluated SymFormer on common benchmarks (see the paper) and compared it to current state-of-the-art approaches: NSRS [1] and DSO [2].

Table 1: Comparison of SymFormer with state-of-the-art methods on several benchmarks. We report $R^2$ and the average time in seconds to generate an equation.

Benchmark     SymFormer          NSRS [1]           DSO [2]
              $R^2$    time (s)  $R^2$    time (s)  $R^2$    time (s)
Nguyen        0.99998     47.50  0.96744    169.46  0.99297    140.25
R             0.99986     94.33  1.00000     95.67  0.97488    855.33
Livermore     0.99996     43.00  0.88551    193.09  0.99651    276.32
Koza          1.00000    101.00  0.99999    111.50  1.00000    217.50
Keijzer       0.99904     48.67  0.97392    255.50  0.95302   3929.50
Constant      0.99998     90.88  0.88742    230.38  1.00000   2816.19
Overall avg.  0.99978     52.95  0.92901    199.63  0.99443    326.53
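For readers unfamiliar with the metric, the following sketch computes $R^2$ under the assumption that it is the standard coefficient of determination, $1 - SS_{res}/SS_{tot}$. The page does not spell out the exact definition or the evaluation points used for the benchmarks, so the helper `r2_score` and the sampling range are illustrative assumptions. The formulas are taken from Example 2 above.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)           # residual sum of squares
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

# Ground truth and prediction from Example 2, on an assumed range of x.
x = np.linspace(0.0, 5.0, 100)
y_true = -60.9 * x * np.exp(-x)
y_pred = 0.002 * x**3 - 61.2 * x * np.exp(-x)

score = r2_score(y_true, y_pred)
print(score)
```

A value of 1 means the prediction reproduces the data exactly, and a value of 0 means it does no better than predicting the mean of the data.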


[1] Biggio, L., Bendinelli, T., Neitz, A., Lucchi, A. and Parascandolo, G., 2021, July. Neural Symbolic Regression that Scales. In International Conference on Machine Learning, pages 936-945. PMLR.
[2] Mundhenk, T.N., Landajuela, M., Glatt, R., Santiago, C.P., Faissol, D.M. and Petersen, B.K., 2021. Symbolic Regression via Neural-Guided Genetic Programming Population Seeding. arXiv preprint arXiv:2111.00053.


Please use the following citation:
  title={SymFormer: End-to-end symbolic regression using transformer-based architecture},
  author={Vastl, Martin and Kulh{\'a}nek, Jon{\'a}{\v{s}} and Kubal{\'i}k, Ji{\v{r}}{\'i} and Derner, Erik and Babu{\v{s}}ka, Robert},
  journal={arXiv preprint arXiv:2205.15764},