Benchmarking SR Approaches
Loading a benchmark
SRToolkit ships with three benchmark collections. Data is downloaded or generated automatically on first use.
from SRToolkit.dataset import Feynman, Nguyen, SRSD_Feynman
bm = Feynman() # 100 physics equations, downloads ~10 MB on first use
bm = Nguyen() # 10 polynomial / trig expressions
bm = SRSD_Feynman() # 120 physics equations with per-variable sampling
Listing and creating datasets
# Print a summary table of all datasets
bm.list_datasets()
# Filter by number of variables
names = bm.list_datasets(num_variables=2, verbose=False)
# Create an SR_dataset ready for evaluation
dataset = bm.create_dataset(names[0])
print(dataset.X.shape) # (n_samples, n_variables)
print(dataset.y.shape) # (n_samples,)
print(dataset.original_equation)
Evaluating an approach
Pass any SR_approach to dataset.evaluate_approach():
from SRToolkit.approaches import ProGED
model = ProGED()
# Run 5 independent experiments and keep the top 20 expressions each
results = dataset.evaluate_approach(model, num_experiments=5, top_k=20, initial_seed=0)
results is an SR_results object. Inspect individual runs:
Resampling data
Each benchmark exposes resample() to draw a fresh sample using the same distribution as the original data:
For SRSD_Feynman, this uses the per-variable sampling objects (log-uniform, linear, integer) defined for each equation.
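To make the three sampling styles concrete, here is a minimal, self-contained sketch using only the Python standard library. It is not SRToolkit's implementation. The function names (`sample_log_uniform`, `sample_linear`, `sample_integer`), the variable names, and the ranges are all hypothetical and chosen for illustration only.

```python
import math
import random


def sample_log_uniform(rng, low, high, n):
    """Draw n values whose logarithm is uniform on [log(low), log(high)]."""
    log_low, log_high = math.log(low), math.log(high)
    return [math.exp(rng.uniform(log_low, log_high)) for _ in range(n)]


def sample_linear(rng, low, high, n):
    """Draw n values uniformly on [low, high]."""
    return [rng.uniform(low, high) for _ in range(n)]


def sample_integer(rng, low, high, n):
    """Draw n integers uniformly from low..high (both ends inclusive)."""
    return [rng.randint(low, high) for _ in range(n)]


# Hypothetical per-variable specification for a three-variable equation.
rng = random.Random(0)  # fixed seed so the draw is reproducible
mass = sample_log_uniform(rng, 1e-2, 1e2, 1000)   # spans four orders of magnitude
angle = sample_linear(rng, 0.0, math.pi, 1000)    # evenly spread over the range
count = sample_integer(rng, 1, 5, 1000)           # whole numbers only
```

Log-uniform sampling is the natural choice for quantities that span several orders of magnitude: about half of the `mass` values fall below the geometric mean of the range (1.0 here), whereas linear sampling over the same range would put almost all of them above it.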
Using callbacks
Callbacks attach to the evaluator and respond to events fired during the search. Pass them to evaluate_approach():
from SRToolkit.evaluation.callbacks import (
    EarlyStoppingCallback,
    LoggingCallback,
    ProgressBarCallback,
)
results = dataset.evaluate_approach(
    model,
    num_experiments=5,
    callbacks=[
        ProgressBarCallback(),
        EarlyStoppingCallback(threshold=1e-6),
        LoggingCallback(log_file="logs/{dataset_name}_{seed}.log"),
    ],
)
| Callback | Behaviour |
|---|---|
| ProgressBarCallback | tqdm progress bar updated after each evaluation |
| EarlyStoppingCallback(threshold) | stops the search once error drops below threshold |
| LoggingCallback(log_file) | appends each new best expression to a file or stdout |
Writing a custom callback
Subclass SRCallbacks and override only the events you need. Return False from any handler to stop the search early.
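The pattern can be sketched without the library. The example below is a stand-in, not SRToolkit's interface: `CallbackBase`, the handler names `on_search_start` and `on_expression_evaluated`, and the `run_search` loop are all hypothetical, so check the `SRCallbacks` class for the actual event names and signatures. It only illustrates the idea of overriding one handler and returning False to stop.

```python
class CallbackBase:
    """Hypothetical stand-in base class: every handler defaults to 'keep going'."""

    def on_search_start(self):
        return True

    def on_expression_evaluated(self, expression, error):
        return True


class StopAfterBudget(CallbackBase):
    """Overrides a single event and stops after a fixed number of evaluations."""

    def __init__(self, max_evaluations):
        self.max_evaluations = max_evaluations
        self.seen = 0

    def on_expression_evaluated(self, expression, error):
        self.seen += 1
        # Returning False asks the search loop to stop.
        return self.seen < self.max_evaluations


def run_search(candidates, callbacks):
    """Toy search loop: evaluates candidates until any callback returns False."""
    evaluated = []
    for expression, error in candidates:
        evaluated.append(expression)
        # Notify every callback before deciding, so each one sees the event.
        responses = [cb.on_expression_evaluated(expression, error) for cb in callbacks]
        if any(response is False for response in responses):
            break
    return evaluated


# Made-up (expression, error) pairs standing in for a real search.
candidates = [("x", 0.9), ("x+1", 0.5), ("x*x", 0.1), ("sin(x)", 0.05)]
evaluated = run_search(candidates, [StopAfterBudget(max_evaluations=2)])
```

Because the base class returns True from every handler, a subclass that overrides a single event leaves all other events untouched, which is what makes "override only the events you need" workable.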