Generating Kriging Models with PySMO¶
The pysmo.kriging trains Ordinary Kriging models. Interpolating kriging models assume that the outputs \(\hat{y}\in\mathbb{R}^{m\times1}\) are correlated and may be treated as a normally distributed stochastic process. For a set of input measurements \(X=\left\{ x_{1},x_{2},\ldots,x_{m}\right\} ;x_{i}\in\mathbb{R}^{n}\), the output \(\hat{y}\) is modeled as the sum of a mean \(\left(\mu\right)\) and a Gaussian process error,
Kriging models assume that the errors in the outputs \(\epsilon\) are correlated proportionally to the distance between corresponding points,
The hyperparameters of the Kriging model \(\left(\mu,\sigma^{2},\theta_{1},\ldots,\theta_{n},\tau_{1},\ldots,\tau_{n}\right)\) are selected such that the concentrated log likelihood function is maximized.
Basic Usage¶
To generate a Kriging model with PySMO, the pysmo.kriging class is first initialized, and then the function training is called on the initialized object:
# Required imports
>>> from idaes.surrogate.pysmo import kriging
>>> import pandas as pd
# Load dataset from a csv file
>>> xy_data = pd.read_csv('data.csv', header=None, index_col=0)
# Initialize the KrigingModel class, extract the list of features and train the model
>>> krg_init = kriging.KrigingModel(xy_data, *kwargs)
>>> features = krg_init.get_feature_vector()
>>> krg_init.training()
xy_data is a two-dimensional python data structure containing the input and output training data. The output values MUST be in the last column.
Optional Arguments
numerical_gradients: Whether or not numerical gradients should be used in training. This choice determines the algorithm used to solve the problem.
True: The problem is solved with BFGS using central differencing with \(\Delta=10^{-6}\) to evaluate numerical gradients.
False: The problem is solved with Basinhopping, a stochastic optimization algorithm.
regularization - Boolean option which determines whether or not regularization is considered during Kriging training. Default is True.
When regularization is turned on, the resulting model is a regressing kriging model.
When regularization is turned off, the resulting model is an interpolating kriging model.
pysmo.kriging Output¶
The result of pysmo.kriging is a python object containing information about the optimal Kriging hyperparameters \(\left(\mu,\sigma^{2},\theta_{1},\ldots,\theta_{n}\right)\) and different error and quality-of-fit metrics such as the mean-squared-error (MSE) and the \(R^{2}\) coefficient-of-fit. A Pyomo expression can be generated from the object simply passing a list of variables into the function generate_expression:
# Create a python list from the headers of the dataset supplied for training
>>> list_vars = []
>>> for i in features.keys():
>>> list_vars.append(features[i])
# Pass list to generate_expression function to obtain a Pyomo expression as output
>>> print(krg_init.generate_expression(list_vars))
Similar to the pysmo.polynomial_regression module, the output of the generate_expression function can be passed into an IDAES or Pyomo module as a constraint, objective or expression.
Prediction with pysmo.kriging models¶
Once a Kriging model has been trained, predictions for values at previously unsampled points x_unsampled can be evaluated by calling the predict_output() function on the unsampled points:
# Create a python list from the headers of the dataset supplied for training
>>> y_unsampled = kriging_init.predict_output(x_unsampled)
Further details about pysmo.kriging module may be found by consulting the examples or reading the paper […]
Available Methods¶
- class idaes.surrogate.pysmo.kriging.KrigingModel(XY_data, numerical_gradients=True, regularization=True, fname=None, overwrite=False)[source]¶
The KrigingModel class trains a Kriging model for a training data set.
The class must first be initialized by calling KrigingModel. Model training is then carried out by calling the
training
method.KrigingModel is able to generate either an interpolating or a regressing Kriging model depending on the settings used during initialization..
Example:
# Initialize the class >>> d = KrigingModel(training_data, numerical_gradients=True, regularization=True)) >>> p = d.get_feature_vector() # Train Kriging model and predict output for an test data x_test >>> d.training() >>> predictions = d.predict_output(x_test)
- Parameters
XY_data (NumPy Array or Pandas Dataframe) – The dataset for Kriging training. XY_data is expected to contain both the features (X) and output (Y) information, with the output values (Y) in the last column.
Further details about the optional inputs may be found under the
__init__
method.- __init__(XY_data, numerical_gradients=True, regularization=True, fname=None, overwrite=False)[source]¶
Initialization of KrigingModel class.
- Parameters
XY_data (NumPy Array or Pandas Dataframe) – The dataset for Kriging training. XY_data is expected to contain feature and output data, with the output values (y) in the last column.
- Keyword Arguments
numerical_gradients (bool) –
Whether or not numerical gradients should be used in training. This choice determines the algorithm used to solve the problem.
numerical_gradients = True: The problem is solved with BFGS using central differencing with a step size of \(10^{-6}\) to evaluate numerical gradients.
numerical_gradients = False: The problem is solved with Basinhopping, a stochastic optimization algorithm.
regularization (bool) –
This option determines whether or not regularization is considered during Kriging training. Default is True.
When regularization is turned off, the model generates an interpolating kriging model.
- Returns
self object with the input information and settings.
- Raises
The input dataset is of the wrong type (not a NumPy array or Pandas Dataframe)
numerical_gradients is not boolean
regularization is not boolean
Example:
# Initialize Kriging class with no numerical gradients - solution algorithm will be Basinhopping >>> d = KrigingModel(XY_data, numerical_gradients=False)
- generate_expression(variable_list)[source]¶
The
generate_expression
method returns the Pyomo expression for the Kriging model trained.The expression is constructed based on the supplied list of variables variable_list and the results of the previous Kriging training process.
- Parameters
variable_list (list) – List of input variables to be used in generating expression. This can be the a list generated from the output of
get_feature_vector
. The user can also choose to supply a new list of the appropriate length.- Returns
Pyomo expression of the Kriging model based on the variables provided in variable_list
- Return type
Pyomo Expression
- get_feature_vector()[source]¶
The
get_feature_vector
method generates the list of regression features from the column headers of the input dataset.- Returns
An indexed parameter list of the variables supplied in the original data
- Return type
Pyomo IndexedParam
- predict_output(x_pred)[source]¶
The
predict_output
method generates output predictions for input data x_pred based a previously trained Kriging model.- Parameters
x_pred (NumPy Array) – Array of designs for which the output is to be evaluated/predicted.
- Returns
Output variable predictions based on the Kriging model.
- Return type
NumPy Array
- static r2_calculation(y_true, y_predicted)[source]¶
r2_calculation
returns the \(R^{2}\) as a measure-of-fit between the true and predicted values of the output variable.- Parameters
y_true (NumPy Array) – Vector of actual values of the output variable
y_predicted (NumPy Array) – Vector of predictions for the output variable based on the surrogate
- Returns
\(R^{2}\) measure-of-fit between actual and predicted data
- Return type
- training()[source]¶
Main function for Kriging training.
- To train the Kriging model:
The Kriging exponent \(\tau_{i}\) is fixed at 2.
The optimal Kriging hyperparameters \(\left(\mu,\sigma^{2},\theta_{1},\ldots,\theta_{n}\right)\) are evaluated by calling the
optimal_parameter_evaluation
method using either BFGS or Basinhopping.The training predictions, prediction errors and r-square coefficient of fit are evaluated by calling the functions
error_calculation
andself.r2_calculation
A results object is generated by calling the
ResultsReport
class.
- Returns
- self object (results) containing the all information about the best Kriging model obtained, including:
the Kriging model hyperparameters (results.optimal_weights),
when relevant, the optimal regularization parameter found \(\lambda\) (results.regularization_parameter),
the Kriging mean (results.optimal_mean),
the Kriging variance (results.optimal_variance),
the Kriging model regularized co-variance matrix (results.optimal_covariance_matrix),
the inverse of the co-variance matrix (results.covariance_matrix_inverse),
the RBF predictions for the training data (results.output_predictions),
the RMSE of the training output predictions (results.training_rmse), and
the \(R^{2}\) value on the training data (results.R2)
- Return type
References:¶
[1] Forrester et al.’s book “Engineering Design via Surrogate Modelling: A Practical Guide”, https://onlinelibrary.wiley.com/doi/pdf/10.1002/9780470770801
[2] D. R. Jones, A taxonomy of global optimization methods based on response surfaces, Journal of Global Optimization, https://link.springer.com/article/10.1023%2FA%3A1012771025575