Experiments Submodule
SRToolkit.experiments
Job-based experiment runner for multi-dataset, multi-approach symbolic regression experiments.
Exports three public classes:
- ExperimentInfo — seed, result path, and adaptation-state path for a single run.
- ExperimentJob — one atomic experiment (dataset × approach × seed); can be run in-process or dispatched via the CLI.
- ExperimentGrid — full cross-product grid with serialization, HPC command-file generation, progress tracking, and result loading.
ExperimentGrid
ExperimentGrid(datasets: Union[SR_dataset, List[Union[SR_dataset, SR_benchmark]], SR_benchmark], approaches: Union[SR_approach, List[SR_approach]], num_experiments: int, results_dir: str, initial_seed: int = 0, top_k: int = 20, adapted_states: Optional[Dict[str, Dict[str, str]]] = None, callbacks: Optional[Union[SRCallbacks, List[SRCallbacks]]] = None)
Defines and manages a grid of symbolic regression experiments across multiple datasets and approaches.
Each experiment is an independent
ExperimentJob that runs one approach on one
dataset with one seed. Jobs can be executed locally (iterate and call
.run()) or on HPC clusters (generate a commands file with
save_commands).
The grid spec is persisted via save and
reloaded via load. Results are saved
per-experiment to results_dir/{dataset}/{approach}/exp_{seed}.json,
so parallel workers never write to the same file.
Seed scheme: job i (0-indexed) receives seed = initial_seed + i.
Examples:
>>> from SRToolkit.dataset import Nguyen
>>> from SRToolkit.approaches import ProGED
>>> from SRToolkit.experiments import ExperimentGrid
>>> bench = Nguyen()
>>> approach = ProGED()
>>> grid = ExperimentGrid(bench, approach, num_experiments=3,
... results_dir="/tmp/sr_run")
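The seed and result-path conventions can be sketched in plain Python. This is illustrative only; the real grid builds these internally, and `enumerate_runs` below is a hypothetical helper, not part of the API:

```python
import os
from itertools import product

def enumerate_runs(datasets, approaches, num_experiments, results_dir, initial_seed=0):
    """Yield (dataset, approach, seed, result_path) for every job in the grid.

    Mirrors the documented conventions: job i (0-indexed) receives
    seed = initial_seed + i, and results land in
    results_dir/{dataset}/{approach}/exp_{seed}.json.
    """
    combos = product(datasets, approaches, range(num_experiments))
    for i, (ds, ap, _rep) in enumerate(combos):
        seed = initial_seed + i
        path = os.path.join(results_dir, ds, ap, f"exp_{seed}.json")
        yield ds, ap, seed, path

runs = list(enumerate_runs(["NG-1"], ["ProGED"], 3, "/tmp/sr_run"))
```

Because the seed is part of the filename, no two jobs ever share a result path, which is what makes parallel workers safe.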
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| datasets | Union[SR_dataset, List[Union[SR_dataset, SR_benchmark]], SR_benchmark] | One or more datasets to run experiments on. Accepts a single SR_dataset, a list of SR_dataset or SR_benchmark objects, or a single SR_benchmark. | required |
| approaches | Union[SR_approach, List[SR_approach]] | One or more SR approaches. Accepts a single SR_approach or a list. | required |
| num_experiments | int | Number of independent experiments per (dataset, approach) pair. | required |
| results_dir | str | Root directory where all results and grid metadata are stored. | required |
| initial_seed | int | Seed for the first experiment. Subsequent experiments use initial_seed + i for job i. | 0 |
| top_k | int | Number of top expressions to retain per experiment. | 20 |
| adapted_states | Optional[Dict[str, Dict[str, str]]] | Optional mapping of pre-adapted state file paths per (approach, dataset) pair. | None |
| callbacks | Optional[Union[SRCallbacks, List[SRCallbacks]]] | Optional callback or list of callbacks forwarded to every job created by create_jobs. Callbacks are serialised to dicts immediately so that each job reconstructs fresh instances in run (no shared state between jobs). When the grid is saved via save or save_commands, a _callbacks.json file is written to results_dir. | None |
Source code in SRToolkit/experiments/experiment_grid.py
adapt_if_missing
Pre-adapt all adaptation_scope="once" approaches where the state file is absent.
For each (approach, dataset) pair whose state file does not yet exist on disk, this method loads the dataset, calls adapt once, then persists the state via save_adapted_state. Pairs whose state file already exists are skipped.
Approaches whose adaptation_scope is not "once", or that have no entry in
the adapted_states mapping passed at construction, are skipped entirely.
Call this before create_jobs to ensure all states are ready before parallel workers start.
Source code in SRToolkit/experiments/experiment_grid.py
create_jobs
Return the list of ExperimentJob instances for this grid.
Does not trigger adaptation — call
adapt_if_missing first if any
approach has adaptation_scope="once".
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| skip_completed | bool | If True, jobs whose result file already exists on disk are skipped. | True |
Returns:
| Type | Description |
|---|---|
| List[ExperimentJob] | List of jobs, one per (dataset, approach, seed) triple that has not yet completed. |
Source code in SRToolkit/experiments/experiment_grid.py
save_commands
Write a commands file with one CLI line per pending job.
Calls save first to persist the grid.
Also writes per-dataset JSON files, per-approach config JSON files, and per-job
info.json files.
Each line has the form::
python -m SRToolkit.experiments run_job \
--dataset /path/dataset.json \
--approach /path/config.json \
--info /path/exp_N/info.json \
--callbacks /path/_callbacks.json
The --callbacks flag is included only when callbacks are configured.
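The line format can be reproduced with a short sketch; `format_command` is a hypothetical helper that mirrors the documented CLI form, not the library's actual writer:

```python
def format_command(dataset_json, approach_json, info_json,
                   callbacks_json=None, python_executable="python"):
    """Build one commands-file line in the documented format; the
    --callbacks flag is appended only when a callbacks file is set."""
    parts = [python_executable, "-m", "SRToolkit.experiments", "run_job",
             "--dataset", dataset_json,
             "--approach", approach_json,
             "--info", info_json]
    if callbacks_json is not None:
        parts += ["--callbacks", callbacks_json]
    return " ".join(parts)
```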
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | File path to write commands to. | required |
| python_executable | str | Python executable to use in the commands. | 'python' |
| skip_completed | bool | If True, jobs whose result file already exists on disk are skipped. | True |
Source code in SRToolkit/experiments/experiment_grid.py
progress
Print a dataset × approach progress table to stdout.
Each cell shows done/total experiments completed for that pair, based on
the presence of results.json files on disk.
Example output::
Dataset ProGED EDHiE
----------- ------- ------
NG-1 5/5 3/5
NG-2 2/5 0/5
Source code in SRToolkit/experiments/experiment_grid.py
load_results
Load and merge all completed per-experiment results for a (dataset, approach) pair.
Examples:
>>> results = grid.load_results("Nguyen-1", "ProGED")
>>> len(results) # number of completed experiments
5
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset_name | str | Name of the dataset. | required |
| approach_name | str | Name of the approach. | required |
Returns:
| Type | Description |
|---|---|
| SR_results | An SR_results object containing one EvalResult per completed experiment. Returns an empty SR_results if no experiments have completed. |
Source code in SRToolkit/experiments/experiment_grid.py
save
Persist the grid specification and supporting files to results_dir.
Writes the following files (all idempotent — existing files are not overwritten):
- results_dir/grid.json — the grid specification.
- results_dir/_datasets/{name}/{name}.json — one JSON file per dataset.
- results_dir/_approaches/{name}_config.json — one JSON file per approach config.
- results_dir/_callbacks.json — serialised callbacks, written only when callbacks are set.
save_commands calls this
automatically, so a separate save() call is only needed when checkpointing
the grid without generating a commands file.
Source code in SRToolkit/experiments/experiment_grid.py
load
staticmethod
Load an ExperimentGrid from a previously saved
grid.json.
Dataset and approach instances are not created at load time — they are reconstructed lazily when jobs are executed.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Path to the grid.json file. | required |
Returns:
| Type | Description |
|---|---|
| ExperimentGrid | A fully configured ExperimentGrid instance. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the file at path is not a valid grid specification. |
Source code in SRToolkit/experiments/experiment_grid.py
ExperimentInfo
dataclass
ExperimentInfo(seed: int, result_path: str, top_k: int = 20, adapted_state_path: Optional[str] = None)
Metadata for a single experiment run.
Holds all job-specific information not contained in the dataset or approach
config: the random seed, the path where the result should be written, how many top
expressions to keep, and — for approaches with adaptation_scope="once" — where
the pre-adapted state is stored.
ExperimentGrid constructs these automatically when
you call create_jobs. When running jobs
via the CLI, info.json files are written by
save_commands and passed with
--info.
Examples:
>>> info = ExperimentInfo(seed=42, result_path="/results/exp_42.json")
>>> info.seed
42
>>> info.top_k
20
>>> d = info.to_dict()
>>> ExperimentInfo.from_dict(d) == info
True
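The roundtrip contract can be shown with a minimal stand-in dataclass. The field names come from the signature above; `ExperimentInfoSketch` is not the real class, just a self-contained illustration:

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class ExperimentInfoSketch:
    seed: int
    result_path: str
    top_k: int = 20
    adapted_state_path: Optional[str] = None

    def to_dict(self) -> dict:
        return asdict(self)  # flat, JSON-safe dictionary

    @classmethod
    def from_dict(cls, d: dict) -> "ExperimentInfoSketch":
        return cls(**d)  # inverse of to_dict

info = ExperimentInfoSketch(seed=42, result_path="/results/exp_42.json")
assert ExperimentInfoSketch.from_dict(info.to_dict()) == info
```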
Attributes:
| Name | Type | Description |
|---|---|---|
| seed | int | Random seed passed to the evaluator and the approach. |
| result_path | str | File path where the result JSON will be written. If a directory is passed to ExperimentJob, the filename exp_{seed}.json is appended. |
| top_k | int | Number of top-ranked expressions to retain in the result. Defaults to 20. |
| adapted_state_path | Optional[str] | Base path to the pre-adapted state for approaches with adaptation_scope="once". |
to_dict
Serialise to a JSON-safe dictionary.
Returns:
| Type | Description |
|---|---|
| dict | A flat dictionary with keys seed, result_path, top_k, and adapted_state_path. |
Source code in SRToolkit/experiments/experiment_grid.py
from_dict
classmethod
Restore an ExperimentInfo from a dictionary produced by to_dict.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| d | dict | Dictionary with keys seed, result_path, top_k, and adapted_state_path. | required |
Returns:
| Type | Description |
|---|---|
ExperimentInfo
|
The reconstructed ExperimentInfo. |
Source code in SRToolkit/experiments/experiment_grid.py
ExperimentJob
ExperimentJob(dataset: Union[SR_dataset, str, dict], approach: Union[SR_approach, str, dict], info: Union[ExperimentInfo, str, dict], callbacks: Optional[Union[SRCallbacks, List[SRCallbacks], dict, List[dict]]] = None)
A single atomic experiment: one dataset × one approach × one seed.
An ExperimentJob is built from three components:
- dataset: the dataset to evaluate on — an SR_dataset instance, a path to an SR_dataset.to_dict() JSON file, or the dict itself.
- approach: the SR approach — an SR_approach instance, a path to an ApproachConfig.to_dict() JSON file, or the dict itself.
- info: job metadata — an ExperimentInfo instance, a path to an ExperimentInfo.to_dict() JSON file, or the dict itself.
The three-path form makes standalone CLI execution trivial::
python -m SRToolkit.experiments run_job \
--dataset /data/DS1.json \
--approach /configs/proged_config.json \
--info /out/DS1/ProGED/exp_0/info.json
For Python use, pass instances directly::
job = ExperimentJob(my_dataset, my_approach,
ExperimentInfo(seed=0, result_path="/out/"))
job.run()
Attributes:
| Name | Type | Description |
|---|---|---|
| dataset_name | str | Name of the dataset, resolved at construction time. |
| approach_name | str | Name of the approach, resolved at construction time. |
| seed | int | Random seed (from info). |
| result_path | str | File path where the experiment result is saved (from info). |
| info | ExperimentInfo | The ExperimentInfo for this job. |
| is_complete | bool | True if the result file at result_path already exists on disk. |
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset | Union[SR_dataset, str, dict] | The dataset. One of: an SR_dataset instance, a path to an SR_dataset.to_dict() JSON file, or the dict itself. | required |
| approach | Union[SR_approach, str, dict] | The SR approach. One of: an SR_approach instance, a path to an ApproachConfig.to_dict() JSON file, or the dict itself. | required |
| info | Union[ExperimentInfo, str, dict] | Job metadata. One of: an ExperimentInfo instance, a path to an ExperimentInfo.to_dict() JSON file, or the dict itself. | required |
| callbacks | Optional[Union[SRCallbacks, List[SRCallbacks], dict, List[dict]]] | Optional callbacks to attach during run. Accepts a single SRCallbacks instance, a list of instances, a single serialised callback dict, or a list of dicts. Instances are serialised to dicts immediately so that run always reconstructs fresh instances (no shared state between jobs). Defaults to None. | None |
Raises:
| Type | Description |
|---|---|
| ValueError | If dataset, approach, or info cannot be resolved from the given value. |
Source code in SRToolkit/experiments/experiment_grid.py
is_complete
property
True if the result file at result_path already exists on disk.
run
Execute this experiment and save the result to result_path.
Handles adaptation according to SR_approach.adaptation_scope:
- "never": no adaptation.
- "once": loads the pre-adapted state from ExperimentInfo's adapted_state_path if a path is set and the file exists; otherwise adapts (and saves the state if a path is set).
- "experiment": adapts fresh on every run.
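The three cases can be captured as a small decision function. This is a sketch of the documented behaviour, not the real implementation; `resolve_adaptation` and its return labels are hypothetical:

```python
def resolve_adaptation(scope, state_path, state_exists):
    """Return the action run would take for a given adaptation_scope,
    mirroring the three documented cases."""
    if scope == "never":
        return "skip"                    # no adaptation at all
    if scope == "once":
        if state_path is not None and state_exists:
            return "load"                # reuse the pre-adapted state
        # no usable state on disk: adapt, persisting only if a path is set
        return "adapt_and_save" if state_path is not None else "adapt"
    if scope == "experiment":
        return "adapt"                   # fresh adaptation every run
    raise ValueError(f"unknown adaptation_scope: {scope!r}")
```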
The result is saved via
SR_results.save to result_path.