Generating Kriging Models with PySMO¶
The pysmo.kriging trains Ordinary Kriging models. Interpolating kriging models assume that the outputs \(\hat{y}\in\mathbb{R}^{m\times1}\) are correlated and may be treated as a normally distributed stochastic process. For a set of input measurements \(X=\left\{ x_{1},x_{2},\ldots,x_{m}\right\} ;x_{i}\in\mathbb{R}^{n}\), the output \(\hat{y}\) is modeled as the sum of a mean \(\left(\mu\right)\) and a Gaussian process error,
Kriging models assume that the errors in the outputs \(\epsilon\) are correlated proportionally to the distance between corresponding points,
The hyperparameters of the Kriging model \(\left(\mu,\sigma^{2},\theta_{1},\ldots,\theta_{n},\tau_{1},\ldots,\tau_{n}\right)\) are selected such that the concentrated log likelihood function is maximized.
Basic Usage¶
To generate a Kriging model with PySMO, the pysmo.kriging class is first initialized, and then the function kriging_training is called on the initialized object:
# Required imports
>>> from idaes.surrogates.pysmo import kriging
>>> import pandas as pd
# Load dataset from a csv file
>>> xy_data = pd.read_csv('data.csv', header=None, index_col=0)
# Initialize the KrigingModel class, extract the list of features and train the model
>>> krg_init = kriging.KrigingModel(xy_data, *kwargs)
>>> features = krg_init.get_feature_vector()
>>> krg_fit = krg_init.kriging_training()
 xy_data is a twodimensional python data structure containing the input and output training data. The output values MUST be in the last column.
Optional Arguments
numerical_gradients: Whether or not numerical gradients should be used in training. This choice determines the algorithm used to solve the problem.
 True: The problem is solved with BFGS using central differencing with \(\Delta=10^{6}\) to evaluate numerical gradients.
 False: The problem is solved with Basinhopping, a stochastic optimization algorithm.
regularization  Boolean option which determines whether or not regularization is considered during Kriging training. Default is True.
 When regularization is turned on, the resulting model is a regressing kriging model.
 When regularization is turned off, the resulting model is an interpolating kriging model.
pysmo.kriging Output¶
The result of pysmo.kriging (krg_fit in above example) is a python object containing information about the optimal Kriging hyperparameters \(\left(\mu,\sigma^{2},\theta_{1},\ldots,\theta_{n}\right)\) and different error and qualityoffit metrics such as the meansquarederror (MSE) and the \(R^{2}\) coefficientoffit. A Pyomo expression can be generated from the object simply passing a list of variables into the function kriging_generate_expression:
# Create a python list from the headers of the dataset supplied for training
>>> list_vars = []
>>> for i in features.keys():
>>> list_vars.append(features[i])
# Pass list to generate_expression function to obtain a Pyomo expression as output
>>> print(krg_fit.kriging_generate_expression(list_vars))
Similar to the pysmo.polynomial_regression module, the output of the kriging_generate_expression function can be passed into an IDAES or Pyomo module as a constraint, objective or expression.
Prediction with pysmo.kriging models¶
Once a Kriging model has been trained, predictions for values at previously unsampled points x_unsampled can be evaluated by calling the kriging_predict_output() function on the resulting Python object and the unsampled points:
# Create a python list from the headers of the dataset supplied for training
>>> y_unsampled = kriging_init.kriging_predict_output(krg_fit, x_unsampled)
Further details about pysmo.kriging module may be found by consulting the examples or reading the paper […]
Available Methods¶

class
idaes.surrogate.pysmo.kriging.
KrigingModel
(XY_data, numerical_gradients=True, regularization=True, fname=None, overwrite=False)[source]¶ The KrigingModel class trains a Kriging model for a training data set.
The class must first be initialized by calling KrigingModel. Model training is then carried out by calling the
kriging_training
method.KrigingModel is able to generate either an interpolating or a regressing Kriging model depending on the settings used during initialization..
Example:
# Initialize the class >>> d = KrigingModel(training_data, numerical_gradients=True, regularization=True)) >>> p = d.get_feature_vector() # Train Kriging model and predict output for an test data x_test >>> results = d.kriging_training() >>> predictions = d.kriging_predict_output(results, x_test)
Parameters: XY_data (NumPy Array or Pandas Dataframe) – The dataset for Kriging training. XY_data is expected to contain both the features (X) and output (Y) information, with the output values (Y) in the last column. Further details about the optional inputs may be found under the
__init__
method.
__init__
(XY_data, numerical_gradients=True, regularization=True, fname=None, overwrite=False)[source]¶ Initialization of KrigingModel class.
Parameters: XY_data (NumPy Array or Pandas Dataframe) – The dataset for Kriging training. XY_data is expected to contain feature and output data, with the output values (y) in the last column.
Keyword Arguments:  numerical_gradients (bool) –
Whether or not numerical gradients should be used in training. This choice determines the algorithm used to solve the problem.
 numerical_gradients = True: The problem is solved with BFGS using central differencing with a step size of \(10^{6}\) to evaluate numerical gradients.
 numerical_gradients = False: The problem is solved with Basinhopping, a stochastic optimization algorithm.
 regularization (bool) –
This option determines whether or not regularization is considered during Kriging training. Default is True.
 When regularization is turned off, the model generates an interpolating kriging model.
Returns: self object with the input information and settings.
Raises: ValueError
–  The input dataset is of the wrong type (not a NumPy array or Pandas Dataframe)Exception
–  numerical_gradients is not booleanException
–  regularization is not boolean
Example:
# Initialize Kriging class with no numerical gradients  solution algorithm will be Basinhopping >>> d = KrigingModel(XY_data, numerical_gradients=False)
 numerical_gradients (bool) –

get_feature_vector
()[source]¶ The
get_feature_vector
method generates the list of regression features from the column headers of the input dataset.Returns: An indexed parameter list of the variables supplied in the original data Return type: Pyomo IndexedParam

kriging_predict_output
(kriging_params, x_pred)[source]¶ The
kriging_predict_output
method generates output predictions for input data x_pred based a previously trained Kriging model.Parameters:  kriging_params (tuple) – Results of Kriging training generated by calling the
kriging_training
method.  x_pred (NumPy Array) – Array of designs for which the output is to be evaluated/predicted.
Returns: Output variable predictions based on the Kriging model.
Return type: NumPy Array
 kriging_params (tuple) – Results of Kriging training generated by calling the

kriging_training
()[source]¶ Main function for Kriging training.
 To train the Kriging model:
 The Kriging exponent \(\tau_{i}\) is fixed at 2.
 The optimal Kriging hyperparameters \(\left(\mu,\sigma^{2},\theta_{1},\ldots,\theta_{n}\right)\) are evaluated by calling the
optimal_parameter_evaluation
method using either BFGS or Basinhopping.  The training predictions, prediction errors and rsquare coefficient of fit are evaluated by calling the functions
error_calculation
andself.r2_calculation
 A results object is generated by calling the
ResultsReport
class.
Returns:  Python object (results) containing the all information about the best Kriging model obtained, including:
 the Kriging model hyperparameters (results.optimal_weights),
 when relevant, the optimal regularization parameter found \(\lambda\) (results.regularization_parameter),
 the Kriging mean (results.optimal_mean),
 the Kriging variance (results.optimal_variance),
 the Kriging model regularized covariance matrix (results.optimal_covariance_matrix),
 the inverse of the covariance matrix (results.covariance_matrix_inverse),
 the RBF predictions for the training data (results.output_predictions),
 the RMSE of the training output predictions (results.training_rmse), and
 the \(R^{2}\) value on the training data (results.R2)
Return type: tuple

static
r2_calculation
(y_true, y_predicted)[source]¶ r2_calculation
returns the \(R^{2}\) as a measureoffit between the true and predicted values of the output variable.Parameters:  y_true (NumPy Array) – Vector of actual values of the output variable
 y_predicted (NumPy Array) – Vector of predictions for the output variable based on the surrogate
Returns: \(R^{2}\) measureoffit between actual and predicted data
Return type:


class
idaes.surrogate.pysmo.kriging.
ResultReport
(theta, reg_param, mean, variance, cov_mat, cov_inv, ymu, y_training, r2_training, rmse_error, p, x_data, x_data_scaled, x_data_min, x_data_max)[source]¶ 
kriging_generate_expression
(variable_list)[source]¶ The
kriging_generate_expression
method returns the Pyomo expression for the Kriging model trained.The expression is constructed based on the supplied list of variables variable_list and the results of the previous Kriging training process.
Parameters: variable_list (list) – List of input variables to be used in generating expression. This can be the a list generated from the output of get_feature_vector
. The user can also choose to supply a new list of the appropriate length.Returns: Pyomo expression of the Kriging model based on the variables provided in variable_list Return type: Pyomo Expression

References:¶
[1] Forrester et al.’s book “Engineering Design via Surrogate Modelling: A Practical Guide”, https://onlinelibrary.wiley.com/doi/pdf/10.1002/9780470770801
[2] D. R. Jones, A taxonomy of global optimization methods based on response surfaces, Journal of Global Optimization, https://link.springer.com/article/10.1023%2FA%3A1012771025575