3260 papers • 126 benchmarks • 313 datasets
Symbolic regression is the task of producing a mathematical expression (symbolic expression) that fits a given tabular dataset.
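As a minimal sketch of the task, assuming a tiny hand-picked set of candidate expressions, each candidate can be scored against the table by mean squared error and the best one kept. The names `TABLE`, `CANDIDATES`, `mse`, and `best_fit` are illustrative and not taken from any library; real systems search a far larger expression space.

```python
import math

# Toy table of (x, y) rows generated from y = x**2 + 1 (an assumed example).
TABLE = [(x, x**2 + 1) for x in [0.0, 0.5, 1.0, 1.5, 2.0, 3.0]]

# Hypothetical candidate expressions: (human-readable form, callable).
CANDIDATES = [
    ("x + 1", lambda x: x + 1),
    ("2*x", lambda x: 2 * x),
    ("x**2", lambda x: x**2),
    ("x**2 + 1", lambda x: x**2 + 1),
    ("exp(x)", lambda x: math.exp(x)),
]

def mse(f, table):
    """Mean squared error of f over the (x, y) rows."""
    return sum((f(x) - y) ** 2 for x, y in table) / len(table)

def best_fit(candidates, table):
    """Return the (expression string, error) pair with the lowest error."""
    scored = [(name, mse(f, table)) for name, f in candidates]
    return min(scored, key=lambda pair: pair[1])

if __name__ == "__main__":
    name, err = best_fit(CANDIDATES, TABLE)
    print(name, err)
```

The output of the search is the expression itself, not only its predictions, which is what distinguishes symbolic regression from ordinary curve fitting.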
This work proposes an exhaustive search and model selection by the minimum description length principle, which allows accuracy and complexity to be directly traded off by measuring each in units of information.
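The trade-off described above can be illustrated with a crude two-part code. The sketch below is not the coding scheme of the work itself. It assumes a hypothetical cost of log2(vocabulary size) bits per expression token for complexity, and a Gaussian negative log-likelihood with an assumed noise scale `sigma`, converted to bits, for accuracy. All names are illustrative.

```python
import math

def model_bits(tokens, vocab_size=16):
    """Complexity cost: log2(vocab_size) bits per token (an assumed, crude code)."""
    return len(tokens) * math.log2(vocab_size)

def data_bits(residuals, sigma=0.1):
    """Accuracy cost: Gaussian negative log-likelihood in bits.

    The normalising constant is omitted because it is identical for
    every candidate and so cannot change the ranking.
    """
    nats = sum(r * r for r in residuals) / (2 * sigma ** 2)
    return nats / math.log(2)

def description_length(tokens, f, table):
    """Total bits = bits for the expression + bits for its misfit."""
    residuals = [y - f(x) for x, y in table]
    return model_bits(tokens) + data_bits(residuals)

# Assumed toy data: y = 2x plus a small alternating perturbation.
TABLE = [(x, 2 * x + 0.05 * (-1) ** x) for x in range(6)]

# Hypothetical candidates: expression string -> (token list, callable).
CANDIDATES = {
    "x": (["x"], lambda x: x),
    "2*x": (["2", "*", "x"], lambda x: 2 * x),
    "2*x + 0.05*cos(pi*x)": (
        ["2", "*", "x", "+", "0.05", "*", "cos", "(", "pi", "*", "x", ")"],
        lambda x: 2 * x + 0.05 * math.cos(math.pi * x),
    ),
}

def select(candidates, table):
    """Pick the candidate with the shortest total description length."""
    return min(candidates, key=lambda k: description_length(*candidates[k], table))
```

Here `select(CANDIDATES, TABLE)` picks `2*x`: the longest expression fits the perturbation almost exactly, but its nine extra tokens cost more bits than the residuals they remove. The outcome depends on the assumed `sigma` and per-token cost, so this only shows how both terms can share one unit.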
It is concluded that the best performing methods for real-world regression combine genetic algorithms with parameter estimation and/or semantic search drivers.
OccamNet, a neural network model that finds interpretable, compact, and sparse symbolic fits to data à la Occam's razor, is introduced; it defines a probability distribution over functions with efficient sampling and function evaluation.
Surprisingly, it is shown that not only does the model more often generate valid outputs, it also learns a more coherent latent space in which nearby points decode to similar discrete outputs.
An ML model of non-trivial proxies of human interpretability can be learned from human feedback and then incorporated within an ML training process to directly optimize for interpretability; the results show that using this model leads to formulas that are significantly more or equally accurate.
This paper tasks a Transformer to directly predict the full mathematical expression, constants included, and presents ablations to show that this end-to-end approach yields better results, sometimes even without the refinement step.
The correct known equations, including force laws and Hamiltonians, can be extracted from the neural network and a new analytic formula is discovered which can predict the concentration of dark matter from the mass distribution of nearby cosmic structures.
A new benchmark, "EmpiricalBench," is introduced to quantify the applicability of symbolic regression algorithms in science; it measures recovery of historical empirical equations from original and synthetic datasets.
This work develops a recursive multidimensional symbolic regression algorithm that combines neural network fitting with a suite of physics-inspired techniques and improves the state-of-the-art success rate.