Nguyen
SRToolkit.dataset.nguyen
Nguyen symbolic regression benchmark.
Nguyen
Bases: SR_benchmark
The Nguyen symbolic regression benchmark.
Contains 10 expressions without constant parameters (first 4 are polynomials, first 8 use one variable, last 2 use two variables). The benchmark ships with pre-generated data.
For more information about the Nguyen benchmark, see: https://doi.org/10.1007/s10710-010-9121-2
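To make the structure described above concrete, here is a minimal, standalone sketch of the ten Nguyen expressions as they are commonly cited from the original paper. It does not use SRToolkit: the `NGUYEN` and `ARITY` names are hypothetical, the `NG-<i>` keys assume the naming pattern generalizes from the `NG-1` identifier used in these docs, and the library may encode the expressions differently.

```python
import numpy as np

# Nguyen expressions as commonly cited from the original paper.
# Assumption: keys follow the 'NG-<i>' pattern; SRToolkit's own
# encoding of these expressions may differ.
NGUYEN = {
    "NG-1": lambda x: x**3 + x**2 + x,
    "NG-2": lambda x: x**4 + x**3 + x**2 + x,
    "NG-3": lambda x: x**5 + x**4 + x**3 + x**2 + x,
    "NG-4": lambda x: x**6 + x**5 + x**4 + x**3 + x**2 + x,
    "NG-5": lambda x: np.sin(x**2) * np.cos(x) - 1,
    "NG-6": lambda x: np.sin(x) + np.sin(x + x**2),
    "NG-7": lambda x: np.log(x + 1) + np.log(x**2 + 1),
    "NG-8": lambda x: np.sqrt(x),
    "NG-9": lambda x, y: np.sin(x) + np.sin(y**2),
    "NG-10": lambda x, y: 2 * np.sin(x) * np.cos(y),
}

# Number of input variables per expression, read off each callable's signature.
ARITY = {name: f.__code__.co_argcount for name, f in NGUYEN.items()}
```

In this sketch the first four entries are polynomials, the first eight take one variable, and the last two take two, matching the description above.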
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset_directory | str | Directory where dataset files are stored or will be downloaded to. Defaults to the platform-appropriate user data directory. | join(user_data_dir('SRToolkit'), 'nguyen') |
Source code in SRToolkit/dataset/nguyen.py
resample
Generate fresh data for a dataset by sampling new inputs and evaluating the ground truth.
Variable bounds are taken from _BOUNDS.
Examples:
>>> from SRToolkit.dataset.nguyen import Nguyen
>>> benchmark = Nguyen('data/nguyen/')
>>> X, y = benchmark.resample('NG-1', n=200, seed=42)
>>> X.shape
(200, 1)
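As a rough illustration of the idea behind resampling (draw new inputs within per-variable bounds, then evaluate the ground truth on them), here is a standalone sketch. It is not SRToolkit's implementation: `resample_sketch`, `BOUNDS`, `GROUND_TRUTH`, and `N_VARS` are hypothetical names, the bounds shown are placeholder values, and uniform sampling is an assumption, since this page does not say how inputs are drawn.

```python
from typing import Callable, Dict, Optional, Tuple

import numpy as np

# Hypothetical stand-ins for the library's internals. The real bounds live
# in _BOUNDS, whose layout and values are not shown on this page.
BOUNDS: Dict[str, Tuple[float, float]] = {"NG-1": (-1.0, 1.0)}
N_VARS: Dict[str, int] = {"NG-1": 1}
GROUND_TRUTH: Dict[str, Callable[[np.ndarray], np.ndarray]] = {
    "NG-1": lambda X: X[:, 0] ** 3 + X[:, 0] ** 2 + X[:, 0],
}


def resample_sketch(
    dataset_name: str, n: int, seed: Optional[int] = None
) -> Tuple[np.ndarray, np.ndarray]:
    """Sample n new inputs and evaluate the ground truth on them."""
    if dataset_name not in GROUND_TRUTH:
        # Mirrors the documented failure mode: no ground truth, no resampling.
        raise ValueError(f"No ground truth expression for {dataset_name!r}")

    # A seeded generator makes the draw reproducible; seed=None is random.
    rng = np.random.default_rng(seed)
    low, high = BOUNDS[dataset_name]

    # Assumption: inputs are drawn uniformly within the bounds.
    X = rng.uniform(low, high, size=(n, N_VARS[dataset_name]))
    y = GROUND_TRUTH[dataset_name](X)
    return X, y
```

Calling `resample_sketch('NG-1', n=200, seed=42)` twice returns identical arrays, which is the behavior the `seed` parameter is meant to provide.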
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset_name | str | Name of the dataset to resample. | required |
| n | int | Number of new samples to generate. | required |
| seed | Optional[int] | Random seed for reproducibility. | None |
Returns:
| Type | Description |
|---|---|
| Tuple[ndarray, ndarray] | A tuple |
Raises:
| Type | Description |
|---|---|
| ValueError | If the dataset has no ground truth expression. |