Feynman

SRToolkit.dataset.feynman

Feynman symbolic regression benchmark.

Feynman

Feynman(dataset_directory: str = os.path.join(user_data_dir('SRToolkit'), 'feynman'), n_samples: int = 10000, seed: Optional[int] = 42, force_generate: bool = False)

Bases: SR_benchmark

The Feynman symbolic regression benchmark.

Contains 100 physics equations with up to 9 variables. Data is downloaded on first use from the SymbolicRegressionToolkit repository; the hosted files hold 10,000 samples per dataset rather than the 1,000,000 used in the original paper. If the download fails, or if force_generate=True, data is instead generated from the stored per-variable samplers using n_samples points and the given seed.
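The download-then-generate fallback described above can be pictured roughly as follows. This is a minimal sketch and not SRToolkit's actual implementation: `load_or_generate`, `failing_download`, `fake_download`, and `fake_generate` are hypothetical names introduced for illustration, and catching `OSError` is an assumption about which failures trigger the fallback.

```python
from typing import Callable, List

Rows = List[List[float]]


def load_or_generate(
    download: Callable[[], Rows],
    generate: Callable[[], Rows],
    force_generate: bool = False,
) -> Rows:
    """Return downloaded data, or generated data if the download fails.

    Hypothetical helper: `download` and `generate` stand in for whatever
    the library does internally.
    """
    if not force_generate:
        try:
            return download()
        except OSError:
            # Assumed failure mode (network or file-system error):
            # fall through to sampler-based generation.
            pass
    return generate()


def failing_download() -> Rows:
    # Simulates an unreachable repository.
    raise OSError("simulated network failure")


def fake_download() -> Rows:
    # Stand-in for pre-generated data fetched from the repository.
    return [[0.0, 0.0, 0.0]]


def fake_generate() -> Rows:
    # Stand-in for data produced by the per-variable samplers.
    return [[1.0, 2.0, 3.0]]


data_downloaded = load_or_generate(fake_download, fake_generate)
data_after_failure = load_or_generate(failing_download, fake_generate)
data_forced = load_or_generate(fake_download, fake_generate, force_generate=True)
```

In this sketch, generation runs in exactly two cases: the download raised an error, or the caller asked to skip it.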

References

Udrescu & Tegmark (2020)

Examples:

>>> benchmark = Feynman()
>>> len(benchmark.list_datasets(verbose=False))
100

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| dataset_directory | str | Directory where dataset files are stored or will be downloaded to. Defaults to the platform-appropriate user data directory (e.g. ~/.local/share/SRToolkit/feynman on Linux). | os.path.join(user_data_dir('SRToolkit'), 'feynman') |
| n_samples | int | Number of samples to generate per dataset when falling back to sampler-based data generation (i.e. when the download fails or force_generate=True). Defaults to 10000. | 10000 |
| seed | Optional[int] | Random seed used for sampler-based data generation. Defaults to 42. | 42 |
| force_generate | bool | If True, skip downloading/loading pre-generated data and always generate fresh data from samplers. Defaults to False. | False |

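To illustrate how n_samples and seed interact during sampler-based generation, here is a small sketch using only the standard library. It assumes each variable is drawn uniformly from a fixed range; the `SAMPLERS` table, its ranges, and `generate_samples` are hypothetical and do not reflect SRToolkit's stored samplers or internal API.

```python
import random
from typing import Dict, List, Optional, Tuple

# Hypothetical per-variable samplers: variable name -> (low, high) range.
# The ranges are made up for illustration.
SAMPLERS: Dict[str, Tuple[float, float]] = {"m": (1.0, 5.0), "v": (1.0, 2.0)}


def generate_samples(
    samplers: Dict[str, Tuple[float, float]],
    n_samples: int,
    seed: Optional[int] = 42,
) -> List[List[float]]:
    """Draw n_samples rows, one uniformly sampled value per variable."""
    # A dedicated generator keeps the result independent of global state.
    # In this sketch, seed=None seeds from system entropy, so the output
    # is not reproducible.
    rng = random.Random(seed)
    return [
        [rng.uniform(low, high) for (low, high) in samplers.values()]
        for _ in range(n_samples)
    ]


first = generate_samples(SAMPLERS, n_samples=5, seed=42)
second = generate_samples(SAMPLERS, n_samples=5, seed=42)
other = generate_samples(SAMPLERS, n_samples=5, seed=7)
```

Under these assumptions, `first` and `second` are identical because they share a seed, while `other` differs; n_samples only controls the number of rows.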
Source code in SRToolkit/dataset/feynman.py
def __init__(
    self,
    dataset_directory: str = os.path.join(user_data_dir("SRToolkit"), "feynman"),
    n_samples: int = 10000,
    seed: Optional[int] = 42,
    force_generate: bool = False,
):
    super().__init__("feynman", dataset_directory)
    self._n_samples = n_samples
    self._seed = seed
    self._force_generate = force_generate
    self._populate()