Full-Factorial Sampling

The pysmo.sampling.UniformSampling class carries out uniform (full-factorial) sampling. It can be used in two modes:

- The samples can be selected from a user-provided dataset, or
- The samples can be generated from a set of provided bounds.

Available Methods

class idaes.core.surrogate.pysmo.sampling.UniformSampling(data_input, list_of_samples_per_variable, sampling_type=None, xlabels=None, ylabels=None, edges=None)

A class that performs uniform sampling. Depending on the settings, the algorithm either generates the uniform samples and then returns the points of an input dataset that lie closest to them (selected by Euclidean distance minimization), or returns samples generated from a supplied data range.

Full-factorial samples are obtained by dividing the range of each variable into evenly spaced levels and then generating all possible combinations of those levels.
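As a rough illustration of the idea, the sketch below builds a full-factorial design with NumPy and itertools only. It is not PySMO's implementation: the function name full_factorial and its arguments are hypothetical, and it assumes evenly spaced levels that include both bounds.

```python
import itertools

import numpy as np


def full_factorial(bounds, levels):
    """Return every combination of evenly spaced levels within the bounds.

    bounds: [lower_bounds, upper_bounds], two lists of equal length.
    levels: number of points per variable (each assumed to be >= 2).
    Hypothetical helper for illustration; not part of PySMO.
    """
    lower, upper = (np.asarray(b, dtype=float) for b in bounds)
    # Evenly spaced levels for each variable, endpoints included.
    axes = [np.linspace(lo, hi, n) for lo, hi, n in zip(lower, upper, levels)]
    # Cartesian product of the per-variable levels: one row per sample.
    return np.array(list(itertools.product(*axes)))


# Two variables: x1 in [0, 1] with 3 levels, x2 in [10, 20] with 2 levels.
grid = full_factorial([[0.0, 10.0], [1.0, 20.0]], [3, 2])
print(grid.shape)  # (6, 2): 3 * 2 combinations, 2 variables
```

The number of rows is the product of the entries in the per-variable list, which is why a (10 x 5) grid yields 50 samples.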

The number of points to be sampled per variable needs to be specified in a list.

To use: call the class with its inputs, and then call the sample_points function.

Example:

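# To select 50 samples on a (10 x 5) grid in a 2D space
# (data is an existing NumPy array or Pandas DataFrame):
>>> from idaes.core.surrogate.pysmo import sampling
>>> b = sampling.UniformSampling(data, [10, 5], sampling_type="selection")
>>> samples = b.sample_points()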

__init__(data_input, list_of_samples_per_variable, sampling_type=None, xlabels=None, ylabels=None, edges=None)

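Initialization of the UniformSampling class. Two inputs are always required: data_input and list_of_samples_per_variable. sampling_type must also be set explicitly when selecting from an existing dataset, since it defaults to “creation”.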

Parameters:

data_input (NumPy Array, Pandas Dataframe or list) –
The input data set or range to be sampled.
- When the aim is to select a set of samples from an existing dataset, the dataset must be a NumPy Array or a Pandas Dataframe and sampling_type option must be set to “selection”. A single output variable (y) is assumed to be supplied in the last column if xlabels and ylabels are not supplied.
- When the aim is to generate a set of samples from a data range, the dataset must be a list containing two lists of equal lengths which contain the variable bounds and sampling_type option must be set to “creation”. It is assumed that the range contains no output variable information in this case.
list_of_samples_per_variable (list) – The list containing the number of points to be sampled for each variable. Each dimension (variable) must be represented by an integer greater than 1.
sampling_type (str) – Option which determines whether the algorithm selects samples from an existing dataset (“selection”) or generates samples from a supplied range (“creation”). Default is “creation”.
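The “selection” mode can be pictured as a nearest-neighbour lookup: for each ideal grid point, keep the dataset row whose inputs are closest in Euclidean distance. The sketch below shows only that step with NumPy. The helper select_nearest is hypothetical, the inputs are assumed to be already scaled to the unit cube, and how PySMO itself scales the data or handles duplicate matches is not covered here.

```python
import numpy as np


def select_nearest(x_data, targets):
    """For each target point, return the index of the closest row of x_data.

    Hypothetical helper for illustration; not part of PySMO.
    """
    x_data = np.asarray(x_data, dtype=float)
    targets = np.asarray(targets, dtype=float)
    # Pairwise Euclidean distances, shape (n_targets, n_data).
    dists = np.linalg.norm(targets[:, None, :] - x_data[None, :, :], axis=2)
    return dists.argmin(axis=1)


# Last column is the output y, as assumed when xlabels/ylabels are omitted.
data = np.array([
    [0.1, 0.0, 5.0],
    [0.9, 0.1, 6.0],
    [0.5, 0.9, 7.0],
    [0.0, 1.0, 8.0],
])
# Three ideal grid points in the two input dimensions.
targets = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

idx = select_nearest(data[:, :-1], targets)
samples = data[idx]  # selected rows keep their y values
print(idx)
```

Because whole rows are returned, each selected sample carries its output value along with its inputs.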

Keyword Arguments:

xlabels (list) – List of column names (if data_input is a dataframe) or column numbers (if data_input is an array) for the independent/input variables. Only used in “selection” mode. Default is None.
ylabels (list) – List of column names (if data_input is a dataframe) or column numbers (if data_input is an array) for the dependent/output variables. Only used in “selection” mode. Default is None.
edges (bool) – Boolean variable representing how the points should be selected. A value of True (default) indicates the points should be equally spaced edge to edge; otherwise they will be placed at the centres of the bins filling the unit cube.
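The difference between the two edges settings is easiest to see along one variable of the unit cube. The following is a minimal sketch of one plausible reading of that description; unit_levels is a hypothetical helper, and the exact spacing PySMO uses is an assumption here.

```python
import numpy as np


def unit_levels(n, edges=True):
    """Return n evenly spaced levels in [0, 1].

    Hypothetical helper for illustration; not part of PySMO.
    """
    if edges:
        # Edge to edge: first level at 0, last level at 1.
        return np.linspace(0.0, 1.0, n)
    # Bin centres: split [0, 1] into n equal bins and take each midpoint.
    return (np.arange(n) + 0.5) / n


print(unit_levels(5, edges=True))   # levels at 0, 0.25, 0.5, 0.75, 1
print(unit_levels(5, edges=False))  # levels at 0.1, 0.3, 0.5, 0.7, 0.9
```

With edges=True the extreme values of each variable are always sampled; with bin centres the samples stay strictly inside the bounds.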

Returns:

self object containing the input information

Raises:

ValueError – When data_input is of the wrong type, or list_of_samples_per_variable is of the wrong length or otherwise invalid.
TypeError – When list_of_samples_per_variable is not a list, list_of_samples_per_variable contains elements other than integers, sampling_type is not a string, or the edges entry is not Boolean.
IndexError – When invalid column names are supplied in xlabels or ylabels
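To make the split between TypeError and ValueError concrete, here is a hypothetical validator for list_of_samples_per_variable that mirrors the documented conditions. It is a sketch only: the function name, the error messages, and the order of the checks are assumptions and are not taken from the PySMO source.

```python
def check_samples_per_variable(samples_per_variable, n_variables):
    """Hypothetical validator mirroring the documented error conditions."""
    if not isinstance(samples_per_variable, list):
        raise TypeError("list_of_samples_per_variable must be a list")
    if len(samples_per_variable) != n_variables:
        raise ValueError("one entry is needed per input variable")
    for n in samples_per_variable:
        # bool is a subclass of int, so it is rejected explicitly here.
        if isinstance(n, bool) or not isinstance(n, int):
            raise TypeError("every entry must be an integer")
        if n < 2:
            raise ValueError("every entry must be greater than 1")
    return samples_per_variable


check_samples_per_variable([10, 5], n_variables=2)  # passes silently
```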

sample_points()

sample_points generates or selects full-factorial designs from an input dataset or data range.

Returns: A NumPy array or Pandas dataframe containing the sample points generated or selected by full-factorial sampling.
Return type: NumPy Array or Pandas Dataframe

References

[1] Loeven et al., “A Probabilistic Radial Basis Function Approach for Uncertainty Quantification”, https://pdfs.semanticscholar.org/48a0/d3797e482e37f73e077893594e01e1c667a2.pdf
