ALAMOPY.ALAMO Options¶
This page lists the ALAMOPY options in more detail and describes the relationship between ALAMO and ALAMOPY.
Installing ALAMO¶
ALAMO (Automatic Learning of Algebraic MOdels) is an optional dependency developed and licensed by The Optimization Firm: https://www.minlp.com/alamo-modeling-tool. The provided link includes further information on obtaining a license, installing the tool, obtaining the BARON (Branch-And-Reduce Optimization Navigator) solver which ALAMO leverages, and specific examples through a user manual and installation guide. Alternatively, users may directly access the user guide here: https://minlp.com/downloads/docs/alamo%20manual.pdf.
During installation, it is recommended that Windows 10 users check that the ALAMO path is set. Additionally, users must place the ALAMO license file in the folder where ALAMO is installed.
More details on ALAMO options may be found in the user guide documentation linked above. If users encounter specific error codes while running the ALAMOPY tool in IDAES, the user guide contains detailed descriptions of each termination condition and error message.
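Before training, it can be useful to confirm that the ALAMO executable is discoverable. A minimal sketch, assuming the executable is named `alamo` (adjust the name to match your local installation; on Windows it may be `alamo.exe`):

```python
import shutil

# Check whether the ALAMO executable is discoverable on the system PATH.
# The executable name "alamo" is an assumption; adjust it to match
# your local installation if necessary.
alamo_exe = shutil.which("alamo")
if alamo_exe is None:
    print("ALAMO not found on PATH; consider setting alamo_path explicitly.")
else:
    print(f"ALAMO found at: {alamo_exe}")
```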
Basic ALAMOPY.ALAMO options¶
Data Arguments¶
The following arguments are required when creating an AlamoTrainer object:
input_labels: user-specified labels given to the inputs
output_labels: user-specified labels given to the outputs
training_dataframe: dataframe (Pandas) object containing training dataset
# after reading or generating a DataFrame object called `data_training`
trainer = AlamoTrainer(input_labels=['x1', 'x2'], output_labels=['z1', 'z2'], training_dataframe=data_training)
trainer.config.[Alamo Option] = [Valid Option Choice] # see below for more details
success, alm_surr, msg = trainer.train_surrogate()
The following arguments are required when creating an AlamoSurrogate object:
surrogate_expressions: Pyomo expression object(s) generated by training the surrogate model(s)
input_labels: user-specified labels given to the inputs
output_labels: user-specified labels given to the outputs
input_bounds: minimum/maximum bounds for each input variable to constrain the training search space
surrogate_expressions = trainer._results['Model']
input_labels = trainer._input_labels
output_labels = trainer._output_labels
xmin, xmax = [0.1, 0.8], [0.8, 1.2]
input_bounds = {input_labels[i]: (xmin[i], xmax[i]) for i in range(len(input_labels))}
alm_surr = AlamoSurrogate(surrogate_expressions, input_labels, output_labels, input_bounds)
Available Basis Functions¶
The following basis functions are allowed during regression:
constant, linfcns, expfcns, logfcns, sinfcns, cosfcns, grbfcns: 0-1 option to include constant, linear, exponential, logarithmic, sine, cosine, and Gaussian radial basis functions. For example,
trainer.config.constant = 1, trainer.config.linfcns = 1, trainer.config.expfcns = 1, trainer.config.logfcns = 1, trainer.config.sinfcns = 1, trainer.config.cosfcns = 1, trainer.config.grbfcns = 1
This results in basis functions = k, x1, exp(x1), log(x1), sin(x1), cos(x1), exp(-(\(\epsilon\) ||x1||)^2)
rbfparam: multiplicative constant \(\epsilon\) used in the Gaussian radial basis functions
monomialpower, multi2power, multi3power: list of monomial, binomial, and trinomial powers. For example,
trainer.config.monomialpower = [2,3,4], trainer.config.multi2power = [1,2,3], trainer.config.multi3power = [1,2,3]
This results in the following basis functions:
Monomial functions = x^2, x^3, x^4
Binomial functions = x1*x2, (x1*x2)^2, (x1*x2)^3
Trinomial functions = (x1*x2*x3), (x1*x2*x3)^2, (x1*x2*x3)^3
ratiopower: list of ratio powers. For example,
trainer.config.ratiopower = [1,2,3]
This results in basis functions = (x1/x2), (x1/x2)^2, (x1/x2)^3
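The power-based terms above can be expanded mechanically. The following sketch (pure Python string generation, not ALAMO's internal code) enumerates the monomial, binomial, trinomial, and ratio terms that the example settings above would produce for inputs x1, x2, x3:

```python
from itertools import combinations

inputs = ["x1", "x2", "x3"]

def basis_terms(inputs, monomialpower, multi2power, multi3power, ratiopower):
    """Enumerate (as strings) the power-based basis terms ALAMO would consider.

    Illustrative sketch of the term pool only, not ALAMO's internal code.
    """
    terms = []
    for p in monomialpower:                       # x^p for each input
        terms += [f"{x}^{p}" for x in inputs]
    for p in multi2power:                         # (xi*xj)^p for input pairs
        terms += [f"({a}*{b})^{p}" for a, b in combinations(inputs, 2)]
    for p in multi3power:                         # (xi*xj*xk)^p for triples
        terms += [f"({a}*{b}*{c})^{p}" for a, b, c in combinations(inputs, 3)]
    for p in ratiopower:                          # (xi/xj)^p for ordered pairs
        terms += [f"({a}/{b})^{p}" for a in inputs for b in inputs if a != b]
    return terms

terms = basis_terms(inputs, [2, 3, 4], [1, 2, 3], [1, 2, 3], [1, 2, 3])
print(terms[:3])  # first few monomial terms
```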
ALAMO Regression Options¶
modeler: fitness metric to be used for model building (1-8)
BIC: Bayesian information criterion
MallowsCp: Mallow’s Cp
AICc: the corrected Akaike’s information criterion
HQC: the Hannan-Quinn information criterion
MSE: mean square error
SSEp: sum of square error plus a penalty proportional to the model size (Note: convpen is the weight of the penalty)
RIC: the risk information criterion
MADp: the maximum absolute deviation plus a penalty proportional to model size (Note: convpen is the weight of the penalty)
screener: regularization method used to reduce the number of potential basis functions pre-optimization (0-2)
none: don’t use a regularization method
lasso: use the LASSO (Least Absolute Shrinkage and Selection Operator) regularization method
SIS: use the SIS (Sure Independence Screening) regularization method
maxterms: maximum number of terms to be fit in the model, surrogates will use fewer if possible
minterms: minimum number of terms to be fit in the model, a value of 0 means no limit is imposed
convpen: when MODELER is set to 6 or 8, the size of the model is weighted by CONVPEN
sismult: non-negative number of basis functions retained by the SIS screener
simulator: a python function to be used as a simulator for ALAMO, a variable that is a python function (not a string)
maxiter: maximum number of ALAMO iterations
maxtime: max length of total execution time in seconds
datalimitterms: limit model terms to number of measurements (True/False)
numlimitbasis: eliminate infeasible basis functions (True/False)
exclude: list of inputs to exclude during building
ignore: list of outputs to ignore during building
xisint: list of inputs that should be treated as integers
zisint: list of outputs that should be treated as integers
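For orientation, several of the modeler metrics above have simple closed forms. The sketch below computes SSE, MSE, RMSE, and a common BIC variant from residuals; these are illustrative formulas only, not ALAMO's internal implementation, and the BIC form shown (n·ln(SSE/n) + k·ln(n)) may differ from ALAMO's by a constant:

```python
import math

def fit_metrics(z_true, z_pred, n_terms):
    """Illustrative fitness metrics; not ALAMO's internal implementation.

    n_terms is the number of basis-function terms in the model (model size).
    """
    n = len(z_true)
    residuals = [zt - zp for zt, zp in zip(z_true, z_pred)]
    sse = sum(r * r for r in residuals)                  # sum of squared errors
    mse = sse / n                                        # mean squared error
    rmse = math.sqrt(mse)
    bic = n * math.log(sse / n) + n_terms * math.log(n)  # common BIC form
    return {"SSE": sse, "MSE": mse, "RMSE": rmse, "BIC": bic}

metrics = fit_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8], n_terms=2)
```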
Scaling and Metrics Options¶
xfactor: list of scaling factors for input variables
xscaling: sets XFACTORS equal to the range of each input (True/False)
scalez: scale output variables (True/False)
ncvf: number of folds for cross validation
tolrelmetric: relative tolerance for outputs
tolabsmetric: absolute tolerance for outputs
tolmeanerror: convergence tolerance for mean errors in outputs
tolsse: absolute tolerance on SSE (sum of squared errors)
mipoptca: absolute tolerance for MIP
mipoptcr: relative tolerance for MIP
linearerror: use a linear objective instead of squared error (True/False)
GAMS: complete path to GAMS executable, or name if GAMS is in the user path
solvemip: solve MIP with an optimizer (True/False)
GAMSSOLVER: name of preferred GAMS solver to solve ALAMO mip quadratic subproblems
builder: use a greedy heuristic (True/False)
backstepper: use a greedy heuristic to build down a model by starting from the least squares model and removing one variable at a time (True/False)
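As noted above, setting xscaling to True sets XFACTORS equal to the range of each input. Equivalent factors can be computed by hand and passed via xfactor; a sketch with illustrative data values:

```python
# Compute per-input scaling factors equal to each input's range,
# mirroring what xscaling=True does automatically.
x_data = [
    [0.10, 1.00],
    [0.45, 1.10],
    [0.80, 1.20],
]  # rows = samples, columns = inputs (illustrative values)

columns = list(zip(*x_data))
xfactor = [max(col) - min(col) for col in columns]
print(xfactor)  # one scaling factor per input
# trainer.config.xfactor = xfactor   # then pass to the trainer
```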
File Options¶
print_to_screen: send ALAMO output to stdout (True/False)
alamo_path: path to ALAMO executable (if not in path)
filename: file name to use for ALAMO files, must be the full path of a .alm file
working_directory: full path to working directory for ALAMO to use
overwrite_files: overwrite (delete) existing files when re-generating (True/False)
ALAMOPY results dictionary¶
The results from alamopy.alamo are returned as a python dictionary. The data can be accessed by using the dictionary keys listed below. For example,
# once the trainer object `trainer` has been defined, configured and trained
regression_results = trainer._results
surrogate_expressions = trainer._results['Model']
Fitness metrics¶
trainer._results['ModelSize']: number of terms chosen in the regression
trainer._results['R2']: R2 value of the regression
Objective value metrics: trainer._results['SSE'], trainer._results['RMSE'], trainer._results['MADp']
Regression description¶
trainer._results['AlamoVersion']: version of ALAMO
trainer._results['xlabels'], trainer._results['zlabels']: the labels used for the inputs/outputs
trainer._results['xdata'], trainer._results['zdata']: array of xdata/zdata
trainer._results['ninputs'], trainer._results['nbas']: number of inputs/basis functions
Performance specs¶
There are three types of regression problems used: ordinary linear regression (olr), classic linear regression (clr), and mixed-integer programming (mip). Performance metrics include the number of problems of each type and the time spent on each type of problem, as well as the time spent on other operations and the total time.
trainer._results['numOLRs'], trainer._results['OLRtime'], trainer._results['numCLRs'], trainer._results['CLRtime'], trainer._results['numMIPs'], trainer._results['MIPtime']: number of each type of regression problem solved and the time spent on each
trainer._results['OtherTime']: time spent on other operations
trainer._results['TotalTime']: total time spent on the regression
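The timing entries can be combined into a quick summary. A sketch using a hypothetical results dictionary whose keys match those listed above (the values are invented purely for illustration):

```python
# Hypothetical results dictionary; keys match those documented above,
# values are invented purely for illustration.
results = {
    "numOLRs": 120, "OLRtime": 0.8,
    "numCLRs": 0,   "CLRtime": 0.0,
    "numMIPs": 5,   "MIPtime": 2.1,
    "OtherTime": 0.4,
}

solver_time = results["OLRtime"] + results["CLRtime"] + results["MIPtime"]
total_time = solver_time + results["OtherTime"]
print(f"Solved {results['numOLRs']} OLRs, {results['numCLRs']} CLRs, "
      f"{results['numMIPs']} MIPs in {solver_time:.1f}s "
      f"({total_time:.1f}s total)")
```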
Custom Basis Functions¶
Custom basis functions can be added to the built-in functions to expand the functional forms available. In ALAMO, this is done with the following syntax:
NCUSTOMBAS #
BEGIN_CUSTOMBAS
x1^2 * x2^2
END_CUSTOMBAS
To use this advanced capability in ALAMOPY, the following configuration option is set. Note that it is necessary to use the xlabels assigned to the input parameters.
trainer.config.custom_basis_functions = ["x1^2 * x2^2", "...", "..." ...]
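Because custom basis functions must be written in terms of the assigned input labels, the strings can be generated programmatically rather than typed by hand. A minimal sketch:

```python
from itertools import combinations

input_labels = ["x1", "x2"]

# Build squared cross-term custom basis functions for every input pair,
# e.g. "x1^2 * x2^2" for the labels above.
custom_basis = [f"{a}^2 * {b}^2" for a, b in combinations(input_labels, 2)]
print(custom_basis)
# trainer.config.custom_basis_functions = custom_basis
```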