idaes.dmf package¶

IDAES Data Management Framework (DMF)

The DMF lets you save, search, and retrieve provenance related to your models.

Subpackages¶

idaes.dmf.ui package

Submodules¶

idaes.dmf.cli module¶

Command Line Interface for IDAES DMF.

Uses “Click” to handle command-line parsing and dispatch.

class idaes.dmf.cli.AliasedGroup(aliases=None, **attrs)[source]¶

Improved click.Group that will accept unique prefixes for the commands, as well as a set of aliases.

For example, the following code will create mycommand as a group, and alias the subcommand “info” to invoke the subcommand “status”. Any unique prefix of “info” (not conflicting with other subcommands or aliases) or “status” will work, e.g. “inf” or “stat”:

@click.group(cls=AliasedGroup, aliases={"info": "status"})
def mycommand():
    pass

get_command(ctx, cmd_name)[source]¶: Given a context and a command name, this returns a Command object if it exists or returns None.
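
The prefix and alias lookup described above can be sketched without Click. This is a minimal illustration, not the code in AliasedGroup: the function name `resolve_command` and the rule that exact matches win before prefixes are assumptions made for the example.

```python
def resolve_command(name, commands, aliases=None):
    """Return the command that `name` refers to, or None if unknown or ambiguous."""
    aliases = aliases or {}
    # Exact matches win outright (assumed tie-breaking rule).
    if name in commands:
        return name
    if name in aliases:
        return aliases[name]
    # Otherwise collect every command or alias starting with the prefix,
    # mapping each alias to the command it stands for.
    targets = {c for c in commands if c.startswith(name)}
    targets |= {target for alias, target in aliases.items() if alias.startswith(name)}
    return targets.pop() if len(targets) == 1 else None


commands = ["status", "init", "ls"]
aliases = {"info": "status"}
print(resolve_command("stat", commands, aliases))  # status
print(resolve_command("inf", commands, aliases))   # status
print(resolve_command("i", commands, aliases))     # None ("init" and "info" both match)
```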

class idaes.dmf.cli.Code[source]¶: Return codes from the CLI.

class idaes.dmf.cli.URLType[source]¶

Click type for URLs.

convert(value, param, ctx)[source]¶: Converts the value. This is not invoked for values that are None (the missing value).

idaes.dmf.codesearch module¶

Search through the code and index static information in the DMF.

class idaes.dmf.codesearch.ModuleClassWalker(from_path=None, from_pkg=None, class_expr=None, parent_class=None, suppress_warnings=False, exclude_testdirs=True, exclude_tests=True, exclude_init=True, exclude_setup=True, exclude_dirs=None)[source]¶

Walk modules from a given root (e.g. ‘idaes’), and visit all classes in those modules whose name matches a given pattern.

Example usage:

walker = ModuleClassWalker(from_pkg=idaes,
                           class_expr='_PropertyParameter.*')

walker.walk(PrintMetadataVisitor())  # see below

walk(visitor)[source]¶

Interface for walkers.

Parameters:	visitor (Visitor) – Class whose visit method will be called for each item.
Returns:	None

class idaes.dmf.codesearch.PrintPropertyMetadataVisitor[source]¶

visit_metadata(obj, meta)[source]¶: Print the module and class of the object, and then the metadata dict, to standard output.

class idaes.dmf.codesearch.PropertyMetadataVisitor[source]¶

Visit something implementing HasPropertyClassMetadata and pass that metadata, as a dict, to the visit_metadata() method, which should be implemented by the subclass.

visit(obj)[source]¶

Visit one object.

Parameters:	obj (idaes.core.property_base.HasPropertyClassMetadata) – The object
Returns:	True if visit succeeded, else False

visit_metadata(obj, meta)[source]¶

Do something with the metadata.

Parameters:	obj (object) – Object from which metadata was pulled, for context. meta (idaes.core.property_base.PropertyClassMetadata) – The metadata
Returns:	None

class idaes.dmf.codesearch.Visitor[source]¶

Interface for the ‘visitor’ class passed to Walker subclasses’ walk() method.

visit(obj)[source]¶

Visit one object.

Parameters:	obj (object) – Some object to operate on.
Returns:	True if visit succeeded, else False
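
The walker/visitor split above can be illustrated with the standard library alone. The sketch below is not the ModuleClassWalker implementation: `walk_classes`, `NameCollector`, and the `demo` module are hypothetical names, and only one module is inspected rather than a whole package tree.

```python
import inspect
import re
import types


class Visitor:
    """Interface: visit() is called once per matching object."""

    def visit(self, obj):
        raise NotImplementedError


class NameCollector(Visitor):
    """Example visitor that records the name of each class it is shown."""

    def __init__(self):
        self.names = []

    def visit(self, obj):
        self.names.append(obj.__name__)
        return True


def walk_classes(module, class_expr, visitor):
    """Call visitor.visit() for each class in `module` whose name matches class_expr."""
    pattern = re.compile(class_expr)
    for name, cls in inspect.getmembers(module, inspect.isclass):
        if pattern.match(name):
            visitor.visit(cls)


# Hypothetical stand-in for a package to search.
demo = types.ModuleType("demo")
demo._PropertyParameterWater = type("_PropertyParameterWater", (), {})
demo.Unrelated = type("Unrelated", (), {})

collector = NameCollector()
walk_classes(demo, r"_PropertyParameter.*", collector)
print(collector.names)
```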

idaes.dmf.commands module¶

Perform all logic, input, output of commands that is particular to the CLI.

Call functions defined in ‘api’ module to handle logic that is common to the API and CLI.

idaes.dmf.commands.init_conf(workspace)[source]¶: Initialize the workspace.

idaes.dmf.commands.list_resources(path, long_format=None, relations=False)[source]¶

List resources in a given DMF workspace.

Parameters:	path (str) – Path to the workspace long_format (bool) – List in long format flag relations (bool) – Show relationships, in long format
Returns:	None

idaes.dmf.commands.list_workspaces(root, stream=None)[source]¶

List workspaces found from a given root path.

Parameters:	root – root path stream – Output stream (must have .write() method)

idaes.dmf.commands.workspace_import(path, patterns, exit_on_error)[source]¶

Import files into workspace.

Parameters:	path (str) – Target workspace directory patterns (list) – List of Unix-style glob for files to import. Files are expected to be resource JSON or a Jupyter Notebook. exit_on_error (bool) – If False, continue trying to import resources even if one or more fail.
Returns:	Number of things imported
Return type:	int
Raises:	BadResourceError, if there is a problem

idaes.dmf.commands.workspace_init(dirname, metadata)[source]¶: Initialize from root at dirname, set environment variable for other commands, and parse config file.

idaes.dmf.dmfbase module¶

Data Management Framework

class idaes.dmf.dmfbase.DMF(path='', name=None, desc=None, create=False, save_path=False, **ws_kwargs)[source]¶

Data Management Framework (DMF).

Expected usage is to instantiate this class, once, and then use it for storing, searching, and retrieving resources that are required for the given analysis.

For details on the configuration files used by the DMF, see documentation for DMFConfig (global configuration) and idaes.dmf.workspace.Workspace.

add(rsrc)[source]¶

Add a resource and associated files.

If the resource has ‘datafiles’, there are some special values that cause those files to be copied and possibly the original removed at this point. There are attributes do_copy and is_tmp on the resource, and also potentially keys of the same name in the datafiles themselves. If present, the datafile key/value pairs will override the attributes in the resource. If do_copy is True, the original file will be copied into the DMF workspace. If is_tmp is also True, the original file will be removed after the copy is made.

Parameters:	rsrc (resource.Resource) – The resource
Returns:	(str) Resource ID
Raises:	DMFError, DuplicateResourceError
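
The do_copy / is_tmp rules for datafiles can be sketched as follows. This is only an illustration of the override logic described above, not the DMF code: the function `store_datafiles` and the `"path"` key of each datafile entry are assumptions.

```python
import shutil
import tempfile
from pathlib import Path


def store_datafiles(datafiles, workspace_dir, do_copy=True, is_tmp=False):
    """Copy datafiles into workspace_dir, honoring per-file overrides."""
    stored = []
    for entry in datafiles:
        # Keys on the datafile override the resource-level attributes.
        copy_it = entry.get("do_copy", do_copy)
        tmp = entry.get("is_tmp", is_tmp)
        src = Path(entry["path"])  # "path" key is hypothetical
        if copy_it:
            dst = Path(workspace_dir) / src.name
            shutil.copy2(src, dst)
            if tmp:
                src.unlink()  # original removed only after the copy is made
            stored.append(dst)
        else:
            stored.append(src)
    return stored


with tempfile.TemporaryDirectory() as tmp_dir:
    source = Path(tmp_dir, "scratch.csv")
    source.write_text("a,b\n1,2\n")
    workspace = Path(tmp_dir, "workspace")
    workspace.mkdir()
    result = store_datafiles([{"path": str(source), "is_tmp": True}], workspace)
    copied_ok = result[0].exists() and not source.exists()
print(copied_ok)
```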

fetch_one(rid, id_only=False)[source]¶

Fetch one resource, from its identifier.

Parameters:	rid (str) – Resource identifier id_only (bool) – If true, return only the identifier of each resource; otherwise a Resource object is returned.
Returns:	(resource.Resource) The found resource, or None if no match

find(filter_dict=None, name=None, id_only=False, re_flags=0)[source]¶

Find and return resources matching the filter.

The filter syntax is a subset of the MongoDB filter syntax. This means that it is represented as a dictionary, where each key is an attribute or nested attribute name, and each value is the value against which to match. There are six possible types of values:

scalar string or number (int, float): Match resources that have this exact value for the given attribute.
special scalars “@<value>”:
- “@true”/”@false”: boolean (bare True/False will test existence)
date, as datetime.datetime or pendulum.Pendulum instance: Match resources that have this exact date for the given attribute.
list: Match resources that have a list value for this attribute, and for which any of the values in the provided list are in the resource’s corresponding value. If a ‘!’ is appended to the key name, then this will be interpreted as a directive to only match resources for which all values in the provided list are present.
dict: This is an inequality, with one or more key/value pairs. The key is the type of inequality and the value is the numeric value for that range. All keys begin with ‘$’. The possible inequalities are:
- “$lt”: Less than (<)
- “$le”: Less than or equal (<=)
- “$gt”: Greater than (>)
- “$ge”: Greater than or equal (>=)
- “$ne”: Not equal to (!=)
Boolean True means does the field exist, and False means does it not exist.
Regular expression, string “~<expr>” and re_flags for flags (understood: re.IGNORECASE)

Parameters:	filter_dict (dict) – Search filter. name (str) – If present, add {‘aliases’: [<name>]} to filter_dict. This is syntactic sugar for a common case. id_only (bool) – If true, return only the identifier of each resource; otherwise a Resource object is returned. re_flags (int) – Flags for regex filters
Returns:	(list of int|Resource) Depending on the value of id_only.
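
To make the filter rules concrete, here is a small stand-alone matcher that applies them to plain dictionaries. It is a sketch of the semantics described above, not the DMF's search code: `matches` is a hypothetical helper, resources are flat dicts, and nested attribute names are not handled.

```python
import operator
import re

_INEQUALITIES = {
    "$lt": operator.lt, "$le": operator.le,
    "$gt": operator.gt, "$ge": operator.ge, "$ne": operator.ne,
}


def matches(resource, filter_dict, re_flags=0):
    """Return True if the flat dict `resource` satisfies every filter entry."""
    for key, want in filter_dict.items():
        require_all = key.endswith("!")  # 'key!' means all list values must be present
        name = key[:-1] if require_all else key
        present = name in resource
        if isinstance(want, bool):  # bare True/False tests existence
            if present != want:
                return False
            continue
        if not present:
            return False
        have = resource[name]
        if isinstance(want, list):
            if not isinstance(have, list):
                return False
            hits = [v in have for v in want]
            ok = all(hits) if require_all else any(hits)
        elif isinstance(want, dict):  # one or more inequalities
            ok = all(_INEQUALITIES[op](have, bound) for op, bound in want.items())
        elif isinstance(want, str) and want.startswith("~"):  # regular expression
            ok = isinstance(have, str) and re.search(want[1:], have, re_flags) is not None
        elif want in ("@true", "@false"):  # boolean value test
            ok = have is (want == "@true")
        else:  # scalar or date: exact match
            ok = have == want
        if not ok:
            return False
    return True


resource = {"type": "data", "tags": ["a", "b"], "version": 3, "shared": True}
print(matches(resource, {"tags": ["a", "z"]}))   # True: any value matches
print(matches(resource, {"tags!": ["a", "z"]}))  # False: not all values present
```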

find_by_id(identifier: str, id_only=False) → Generator[T_co, T_contra, V_co][source]¶: Find resources by their identifier or identifier prefix.

find_related(rsrc, filter_dict=None, maxdepth=0, meta=None, outgoing=True)[source]¶

Find related resources.

Parameters:

rsrc (resource.Resource) – Resource starting point
filter_dict (dict) – See parameter of same name in `find()`.
maxdepth (int) – Maximum depth of search (starts at 1)
meta (List[str]) – Metadata fields to extract for meta part
outgoing (bool) – If True, look at outgoing relations. Otherwise look at incoming relations. e.g. if A ‘uses’ B and if True, would find B starting from A. If False, would find A starting from B.

Returns:	Generates triples (depth, Triple, meta), where the depth is an integer (starting at 1), the Triple is a simple namedtuple wrapping (subject, predicate, object), and meta is a dict of metadata for the endpoint of the relation (the object if outgoing=True, the subject if outgoing=False) for the fields provided in the meta parameter.
Raises:	`NoSuchResourceError` – if the starting resource is not found
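
The traversal can be pictured as a breadth-first walk over relation triples. The sketch below is a simplified stand-in for find_related(): it works on an in-memory list of triples, omits filtering and the meta part, and assumes that maxdepth=0 means "no limit", which the text above does not state.

```python
from collections import deque, namedtuple

Triple = namedtuple("Triple", "subject predicate object")


def related(triples, start, maxdepth=0, outgoing=True):
    """Yield (depth, Triple) pairs reachable from `start`, breadth-first."""
    seen = {start}
    queue = deque([(start, 1)])
    while queue:
        node, depth = queue.popleft()
        if maxdepth and depth > maxdepth:  # assumption: 0 means unlimited
            continue
        for t in triples:
            # Follow the relation forwards (outgoing) or backwards (incoming).
            here, there = (t.subject, t.object) if outgoing else (t.object, t.subject)
            if here != node:
                continue
            yield depth, t
            if there not in seen:
                seen.add(there)
                queue.append((there, depth + 1))


triples = [Triple("A", "uses", "B"), Triple("B", "uses", "C")]
for depth, t in related(triples, "A"):
    print(depth, t.subject, t.predicate, t.object)
```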

remove(identifier=None, filter_dict=None, update_relations=True)[source]¶

Remove one or more resources, selected by identifier or by a filter. Unless told otherwise, this method will scan the DB and remove all relations that involve this resource.

Parameters:	identifier (str) – Identifier for a resource. filter_dict (dict) – Filter to use instead of identifier update_relations (bool) – If True (the default), scan the DB and remove all relations that involve this identifier.

update(rsrc, sync_relations=False, upsert=False)[source]¶

Update/insert stored resource.

Parameters:	rsrc (resource.Resource) – Resource instance sync_relations (bool) – If True, and if resource exists in the DB, then the “relations” attribute of the provided resource will be changed to the stored value. upsert (bool) – If true, and the resource is not in the DMF, then insert it. If false, and the resource is not in the DMF, then do nothing.
Returns:	True if the resource was updated or added, False if nothing was done.
Return type:	bool
Raises:	`errors.DMFError` – If the input resource was invalid.

class idaes.dmf.dmfbase.DMFConfig(defaults=None)[source]¶

Global DMF configuration.

Every time you create an instance of the DMF or run a dmf command on the command-line, the library opens the global DMF configuration file to figure out the default workspace (and, eventually, other values).

The default location for this configuration file is “~/.dmf”, i.e. the file named “.dmf” in the user’s home directory. This can be modified programmatically by changing the “filename” attribute of this class.

The contents of the configuration are formatted as YAML with the following keys defined:

workspace

Path to the default workspace directory.

idaes.dmf.errors module¶

Exception classes.

exception idaes.dmf.errors.AlamoDisabledError[source]¶

exception idaes.dmf.errors.AlamoError(msg)[source]¶

exception idaes.dmf.errors.BadResourceError[source]¶

exception idaes.dmf.errors.CommandError(command, operation, details)[source]¶

exception idaes.dmf.errors.DMFError(detailed_error='No details')[source]¶

exception idaes.dmf.errors.DataFormatError(dtype, err)[source]¶

exception idaes.dmf.errors.DmfError[source]¶

exception idaes.dmf.errors.DuplicateResourceError(op, id_)[source]¶

exception idaes.dmf.errors.FileError[source]¶

exception idaes.dmf.errors.InvalidRelationError(subj, pred, obj)[source]¶

exception idaes.dmf.errors.ModuleFormatError(module_name, type_, what)[source]¶

exception idaes.dmf.errors.NoSuchResourceError(name=None, id_=None)[source]¶

exception idaes.dmf.errors.ParseError[source]¶

exception idaes.dmf.errors.ResourceError[source]¶

exception idaes.dmf.errors.SearchError(spec, problem)[source]¶

exception idaes.dmf.errors.WorkspaceCannotCreateError(path)[source]¶

exception idaes.dmf.errors.WorkspaceConfMissingField(path, name, desc)[source]¶

exception idaes.dmf.errors.WorkspaceConfNotFoundError(path)[source]¶

exception idaes.dmf.errors.WorkspaceError(detailed_error='No details')[source]¶

exception idaes.dmf.errors.WorkspaceNotFoundError(from_dir)[source]¶

idaes.dmf.experiment module¶

The ‘experiment’ is a root container for a coherent set of ‘resources’.

class idaes.dmf.experiment.Experiment(dmf, **kwargs)[source]¶

An experiment is a way of grouping resources in a way that makes sense to the user.

It is also a useful unit for passing as an argument to functions, since it has a standard ‘slot’ for the DMF instance that created it.

add(rsrc)[source]¶

Add a resource to an experiment.

This does two things:

Establishes an “experiment” type of relationship between the new resource and the experiment.
Adds the resource to the DMF

Parameters:	rsrc (resource.Resource) – The resource to add.
Returns:	Added (input) resource, for chaining calls.
Return type:	resource.Resource

copy(new_id=True, **kwargs)[source]¶

Get a copy of this experiment. The returned object will have been added to the DMF.

Parameters:

new_id (bool) – If True, generate a new unique ID for the copy.
kwargs – Values to set in new instance after copying.

Returns:

A (mostly deep) copy.

Note that the DMF instance is just a reference to the same object as in the original, and they will share state.

Return type:

Experiment

link(subj, predicate='contains', obj=None)[source]¶

Add and update relation triple in DMF.

Parameters:	subj (resource.Resource) – Subject predicate (str) – Predicate obj (resource.Resource) – Object
Returns:	None

remove()[source]¶: Remove this experiment from the associated DMF instance.

update()[source]¶: Update experiment to current values.

idaes.dmf.help module¶

Find documentation for modules and classes in the generated Sphinx documentation and return its location.

idaes.dmf.help.find_html_docs(dmf, obj=None, obj_name=None, **kw)[source]¶: Get one or more files with HTML documentation for the given object, in paths referred to by the dmf instance.

idaes.dmf.magics module¶

Jupyter magics for the DMF.

exception idaes.dmf.magics.DMFMagicError(errmsg, usermsg=None)[source]¶

class idaes.dmf.magics.DmfMagics(shell)[source]¶

Implement “magic” commands in Jupyter/IPython for interacting with the DMF and IDAES more generally.

In order to allow easier testing, the functionality is broken into two classes. This class has the decorated method(s) for invoking the ‘magics’, and DmfMagicsImpl has the state and functionality.

dmf(line)[source]¶

DMF outer command.

Example:

%dmf <subcommand> [subcommand args..]

class idaes.dmf.magics.DmfMagicsImpl(shell)[source]¶

State and implementation called by DmfMagics.

On failure of any method, a DMFMagicError is raised, that should be handled by the line or cell magic that invoked it.

dmf(line)[source]¶: DMF outer command

dmf_help(*names)[source]¶: Provide help on IDAES objects and classes. Invoking with no arguments gives general help. Invoking with one argument looks for help in the docs on the given object or class. Arguments: [name].

dmf_info(*topics)[source]¶

Provide information about DMF current state. Arguments: none

Parameters:	topics (List[str]) – List of topics
Returns:	None

dmf_init(path, *extra)[source]¶

Initialize DMF (do this before most other commands). Arguments: path [“create”]

Parameters:	path (str) – Full path to DMF home extra (str) – Extra tokens. If ‘create’, then try to create the path if it is not found.

dmf_list()[source]¶: List resources in the current workspace. Arguments: none.

dmf_workspaces(*paths)[source]¶

List DMF workspaces. Optionally takes one or more paths to use as a starting point. By default, start from current directory. Arguments: [paths..]

Parameters:	paths (List[str]) – Paths to search, use “.” by default

idaes.dmf.magics.register()[source]¶: Register with IPython on import (once).

idaes.dmf.model_data module¶

This module contains functions to read and manage data for use in parameter estimation, data reconciliation, and validation.

idaes.dmf.model_data.read_data(csv_file, csv_file_metadata, model=None, rename_mapper=None, unit_system=None, ambient_pressure=1.0, ambient_pressure_unit='atm')[source]¶

Read CSV data into a Pandas DataFrame.

The data should be in a form where the first row contains column headings where each column is labeled with a data tag, and the first column contains data point labels or time stamps. The metadata should be in a csv file where the first column is the tag name, the second column is the model reference (which can be empty), the third column is the tag description, and the fourth column is the unit of measure string. Any additional information can be added to columns after the fourth column and will be ignored. The units of measure should be something that is recognized by pint, or in the aliases defined in this file. Any tags not listed in the metadata will be dropped.

Parameters:

csv_file (str) – Path of file to read
csv_file_metadata (str) – Path of csv file to read column metadata from
model (ConcreteModel) – Optional model to map tags to
rename_mapper (function) – Optional function to rename tags
unit_system (str) – Optional system of units to attempt to convert to
ambient_pressure (float, numpy.array, pandas.series, str) – Optional pressure to use to convert gauge pressure to absolute. If a string is supplied, the corresponding data tag is assumed to be ambient pressure.
ambient_pressure_unit (str) – Optional ambient pressure unit, should be a unit recognized by pint.

Returns:

A Pandas data frame with tags in columns and rows indexed by time.
(dict): Column metadata, units of measure, description, and model mapping information.

Return type:

(DataFrame)
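
The two-file layout expected by read_data() can be illustrated with the standard csv module. This sketch is not read_data() itself: it does no unit conversion or model mapping, returns plain dicts instead of a DataFrame, and assumes the metadata file has no header row. The tag names and values are made up for the example.

```python
import csv
import io

# Data file: first row holds tags, first column holds time stamps.
DATA = (
    "time,T1,P1,junk\n"
    "2020-01-01 00:00,300.0,1.5,9\n"
    "2020-01-01 00:01,301.0,1.6,9\n"
)
# Metadata file: tag, model reference (may be empty), description, unit.
META = (
    "T1,fs.unit.temperature,Inlet temperature,K\n"
    "P1,,Inlet pressure,bar\n"
)


def read_tagged(data_text, meta_text):
    """Return (table, meta); tags missing from the metadata are dropped."""
    meta = {}
    for row in csv.reader(io.StringIO(meta_text)):
        tag, reference, description, units = row[:4]  # extra columns ignored
        meta[tag] = {
            "reference": reference or None,
            "description": description,
            "units": units,
        }
    rows = list(csv.reader(io.StringIO(data_text)))
    header = rows[0]
    keep = [i for i, tag in enumerate(header) if i > 0 and tag in meta]
    table = {r[0]: {header[i]: float(r[i]) for i in keep} for r in rows[1:]}
    return table, meta


table, meta = read_tagged(DATA, META)
print(table["2020-01-01 00:00"])  # {'T1': 300.0, 'P1': 1.5}
```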

idaes.dmf.model_data.unit_convert(x, frm, to=None, system=None, unit_string_map={}, ignore_units=[], gauge_pressures={}, ambient_pressure=1.0, ambient_pressure_unit='atm')[source]¶

Convert the quantity x to a different set of units. x can be a numpy array or pandas series. The from unit is translated into a string that pint can recognize by first looking in unit_string_map, then looking in known aliases defined in this file. If it is in neither place, it will be given to pint as-is. This translation of the unit is done so that data can be read in with the original provided units.

Parameters:

x (float, numpy.array, pandas.series) – quantity to convert
frm (str) – original unit string
to (str) – new unit string, or specify “system”
system (str) – unit system to convert to, or specify “to”
unit_string_map (dict) – keys are unit strings and values are corresponding strings that pint can recognize. This only applies to the from string.
ignore_units (list, or tuple) – units to not convert
gauge_pressures (dict) – keys are units strings to be considered gauge pressures and the values are corresponding absolute pressure units
ambient_pressure (float, numpy.array, pandas.series) – pressure to add to gauge pressure to convert it to absolute pressure. The default is 1. The unit is atm by default, but can be changed with the ambient_pressure_unit argument.
ambient_pressure_unit (str) – Unit for ambient pressure, default is atm, and should be a unit recognized by pint

Returns:	quantity and unit string
Return type:	(tuple)
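
The gauge-pressure handling amounts to adding the ambient pressure, expressed in the target unit, to the gauge reading. The sketch below shows only that arithmetic; it does not use pint, and the helper name `gauge_to_absolute` and the small conversion table are assumptions for the example.

```python
# Value of 1 atm expressed in each supported unit.
ATM_IN = {"atm": 1.0, "Pa": 101325.0, "psi": 14.695949}


def gauge_to_absolute(gauge, unit, ambient_pressure=1.0, ambient_pressure_unit="atm"):
    """Return absolute pressure in `unit` for a gauge reading in the same unit."""
    ambient_atm = ambient_pressure / ATM_IN[ambient_pressure_unit]
    return gauge + ambient_atm * ATM_IN[unit]


print(gauge_to_absolute(0.0, "Pa"))    # 101325.0
print(gauge_to_absolute(10.0, "psi"))  # about 24.7
```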

idaes.dmf.propdata module¶

Property data types.

Ability to import, etc. from text files is part of the methods in the type.

Import property database from textfile(s):

- See PropertyData.from_csv() for the expected format for data.
- See PropertyMetadata() for the expected format for metadata.

exception idaes.dmf.propdata.AddedCSVColumnError(names, how_bad, column_type='')[source]¶: Error for PropertyData.add_csv()

class idaes.dmf.propdata.Fields[source]¶: Constants for fields.

class idaes.dmf.propdata.PropertyColumn(name, data)[source]¶: Data column for a property.

class idaes.dmf.propdata.PropertyData(data)[source]¶

Class representing property data that knows how to construct itself from a CSV file.

You can build objects from multiple CSV files as well. See the property database section of the API docs for details, or read the code in add_csv() and the tests in idaes_dmf.propdb.tests.test_mergecsv.

add_csv(file_or_path, strict=False)[source]¶

Add to existing object from a new CSV file.

Depending on the value of the strict argument (see below), the new file may or may not have the same properties as the object – but it always needs to have the same number of state columns, and in the same order.

Note

Data that is “missing” because of property columns in one CSV and not the other will be filled with float(nan) values.

Parameters:	file_or_path (file or str) – Input file. This should be in exactly the same format as expected by from_csv(). strict (bool) – If true, require that the columns in the input CSV match columns in this object. Otherwise, only require that state columns in input CSV match columns in this object. New property columns are added, and matches to existing property columns will append the data.
Raises:	`AddedCSVColumnError` – If the new CSV column headers are not the same as the ones in this object.
Returns:	(int) Number of added rows

as_arr(states=True)[source]¶

Export property data as arrays.

Parameters:	states (bool) – If False, exclude “state” data, e.g. the ambient temperature, and only include measured property values.
Returns:	(values[M,N], errors[M,N]) Two arrays of floats, each with M columns having N values.
Raises:	ValueError if the columns are not all the same length

errors_dataframe(states=False)[source]¶

Get errors as a dataframe.

Parameters:	states (bool) – If False, exclude state data. This is the default, because states do not normally have associated error information.
Returns:	Pandas dataframe for values.
Return type:	pd.DataFrame
Raises:	`ImportError` – If pandas or numpy were never successfully imported.

static from_csv(file_or_path, nstates=0)[source]¶

Import the CSV data.

Expected format of the files is a header plus data rows.

Header: Index-column, Column-name(1), Error-column(1), Column-name(2), Error-column(2), ..
Data: <index>, <val>, <errval>, <val>, <errval>, ..

Column-name is in the format “Name (units)”

Error-column is in the format “<type> Error”, where “<type>” is the error type.

Parameters:	file_or_path (file-like or str) – Input file nstates (int) – Number of state columns, appearing first before property columns.
Returns:	New properties instance
Return type:	PropertyData
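
A header in this layout can be parsed as sketched below. This is an illustration of the format, not the from_csv() implementation: `parse_columns` is a hypothetical helper, and it assumes that state columns carry no error column of their own. The sample values are illustrative.

```python
import csv
import io
import re

SAMPLE = (
    "Data No.,r (),Viscosity Value (mPa-s),Absolute Error\n"
    "1,0.2,2.6,0.06\n"
    "2,0.4,6.2,0.004\n"
)

# Column-name format: "Name (units)"
_NAME = re.compile(r"\s*(.*?)\s*\((.*)\)\s*$")


def parse_columns(text, nstates=0):
    """Split CSV text into state and property columns (with errors)."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    columns = []
    i = 1  # column 0 is the index column
    while i < len(header):
        name, units = _NAME.match(header[i]).groups()
        col = {"name": name, "units": units, "values": [float(r[i]) for r in body]}
        if len(columns) < nstates:
            col["type"] = "state"  # assumption: states have no error column
            i += 1
        else:
            col["type"] = "property"
            # Error-column format: "<type> Error"
            col["error_type"] = header[i + 1].strip().rsplit(" ", 1)[0].lower()
            col["errors"] = [float(r[i + 1]) for r in body]
            i += 2
        columns.append(col)
    return columns


columns = parse_columns(SAMPLE, nstates=1)
print([(c["name"], c["type"]) for c in columns])
```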

is_property_column(index)[source]¶: Whether given column is a property. See is_state_column().

is_state_column(index)[source]¶

Whether given column is state.

Parameters:	index (int) – Index of column
Returns:	(bool) State or property and the column number.
Raises:	`IndexError` – No column at that index.

names(states=True, properties=True)[source]¶

Get column names.

Parameters:	states (bool) – If False, exclude “state” data, e.g. the ambient temperature, and only include measured property values. properties (bool) – If False, exclude property data
Returns:	List of column names.
Return type:	list[str]

values_dataframe(states=True)[source]¶

Get values as a dataframe.

Parameters:	states (bool) – see `names()`.
Returns:	(pd.DataFrame) Pandas dataframe for values.
Raises:	`ImportError` – If pandas or numpy were never successfully imported.

class idaes.dmf.propdata.PropertyMetadata(values=None)[source]¶: Class to import property metadata.

class idaes.dmf.propdata.PropertyTable(data=None, **kwargs)[source]¶

Property data and metadata together (at last!)

classmethod load(file_or_path, validate=True)[source]¶

Create PropertyTable from JSON input.

Parameters:	file_or_path (file or str) – Filename or file object from which to read the JSON-formatted data. validate (bool) – If true, apply validation to input JSON data.

Example input:

{
    "meta": [
        {"datatype": "MEA",
         "info": "J. Chem. Eng. Data, 2009, Vol 54, pg. 306-310",
         "notes": "r is MEA weight fraction in aqueous soln.",
         "authors": "Amundsen, T.G., Lars, E.O., Eimer, D.A.",
         "title": "Density and Viscosity of ..."}
    ],
    "data": [
        {"name": "Viscosity Value",
         "units": "mPa-s",
         "values": [2.6, 6.2],
         "error_type": "absolute",
         "errors": [0.06, 0.004],
         "type": "property"},
        {"name": "r",
         "units": "",
         "values": [0.2, 1000],
         "type": "state"}
    ]
}

class idaes.dmf.propdata.StateColumn(name, data)[source]¶: Data column for a state.

idaes.dmf.propindex module¶

Index Property metadata

class idaes.dmf.propindex.DMFVisitor(dmf, default_version=None)[source]¶

INDEXED_PROPERTY_TAG = 'indexed-property'¶: Added to resource ‘tags’, so easier to find later

visit_metadata(obj, meta)[source]¶

Called for each property class encountered during the “walk” initiated by index_property_metadata().

Parameters:	obj (property_base.PropertyParameterBase) – Property class instance meta (property_base.PropertyClassMetadata) – Associated metadata
Returns:	None
Raises:	`AttributeError` – if

idaes.dmf.propindex.index_property_metadata(dmf, pkg=<module 'idaes' from '/home/docs/checkouts/readthedocs.org/user_builds/idaes-pse/checkouts/1.4.0/idaes/__init__.py'>, expr='_PropertyMetadata.*', default_version='0.0.1', **kwargs)[source]¶

Index all the PropertyMetadata classes in this package.

Usually the defaults will be correct, but you can modify the package explored and set of classes indexed.

When you re-index the same class (in the same module), whether or not that is a “duplicate” will depend on the version found in the containing module. If there is no version in the containing module, the default version is used (so it is always the same). If it is a duplicate, nothing is done; this is not considered an error. If a new version is added, it will be explicitly connected to the highest version of the same module/code. So, for example,

Starting with (a.module.ClassName version=0.1.2)
If you then find a new version (a.module.ClassName version=1.2.3), there will be 2 resources, and you will have the relation:
```
a.module.ClassName/1.2.3 --version---> a.module.ClassName/0.1.2
```

If you add another version (a.module.ClassName version=1.2.4), you will have two relations:

a.module.ClassName/1.2.3 --version---> a.module.ClassName/0.1.2
a.module.ClassName/1.2.4 --version---> a.module.ClassName/1.2.3
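
The versioning rule above (skip duplicates, link each new version to the highest one already indexed) can be sketched with plain data structures. `add_version` is a hypothetical helper, not part of the DMF, and it assumes purely numeric dotted versions.

```python
def add_version(index, relations, name, version):
    """Record `version` of `name`; return False if it is a duplicate."""
    versions = index.setdefault(name, [])
    new = tuple(int(part) for part in version.split("."))
    if new in versions:
        return False  # duplicate: nothing is done, not an error
    if versions:
        highest = ".".join(str(n) for n in max(versions))
        relations.append((f"{name}/{version}", "version", f"{name}/{highest}"))
    versions.append(new)
    return True


index, relations = {}, []
for v in ("0.1.2", "1.2.3", "1.2.4", "1.2.3"):
    add_version(index, relations, "a.module.ClassName", v)
for subj, pred, obj in relations:
    print(f"{subj} --{pred}--> {obj}")
```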

Parameters:

dmf (idaes.dmf.DMF) – Data Management Framework instance in which to record the found metadata.
pkg (module) – Root module (i.e. package root) from which to find the classes containing metadata.
expr (str) – Regular expression pattern for the names of the classes in which to look for metadata.
default_version (str) – Default version to use for modules with no explicit version.
kwargs – Other keyword arguments passed to `codesearch.ModuleClassWalker`.

Returns:	Class that walked through the modules. You can call .get_indexed_classes() to see the list of classes walked, or .walk() to walk the modules again.
Return type:	codesearch.ModuleClassWalker
Raises:	This instantiates a DMFVisitor and calls its walk() method to walk/visit each found class, so any exception raised by the constructor or DMFVisitor.visit_metadata() may propagate.

idaes.dmf.resource module¶

Resource representations.

class idaes.dmf.resource.CodeImporter(path, language, **kwargs)[source]¶

class idaes.dmf.resource.Dict(*args, **kwargs)[source]¶: Subclass of dict that has a ‘dirty’ bit.

class idaes.dmf.resource.FileImporter(path: pathlib.Path, do_copy: bool = None)[source]¶

class idaes.dmf.resource.JsonFileImporter(path: pathlib.Path, do_copy: bool = None)[source]¶

class idaes.dmf.resource.JupyterNotebookImporter(path: pathlib.Path, do_copy: bool = None)[source]¶

idaes.dmf.resource.PR_DERIVED = 'derived'¶: Constants for relation predicates

class idaes.dmf.resource.ProgLangExt[source]¶: Helper class to map from file extensions to names of the programming language.

idaes.dmf.resource.RESOURCE_TYPES = {'code', 'data', 'experiment', 'flowsheet', 'json', 'notebook', 'other', 'propertydb', 'resource_json', 'surrogate_model', 'tabular_data'}¶: Constants for resource ‘types’

class idaes.dmf.resource.Resource(value: dict = None, type_: str = None)[source]¶

Core object for the Data Management Framework.

ID_FIELD = 'id_'¶: Identifier field name constant

ID_LENGTH = 32¶: Full-length of identifier

exception InferResourceTypeError[source]¶

exception LoadResourceError(inferred_type, msg)[source]¶

TYPE_FIELD = 'type'¶: Resource type field name constant

data¶: Get JSON data for this resource.

classmethod from_file(path: str, as_type: str = None, strict: bool = True, do_copy: bool = True) → idaes.dmf.resource.Resource[source]¶

Import resource from a file.

Parameters:	path – File path as_type – Resource type. If None/empty, then inferred from path. strict – If True, fail when file extension and contents don’t match. If False, always fall through to generic resource. do_copy – If True (the default), copy the files; else do not
Raises:	`InferResourceTypeError` – if resource type does not match inferred/specified `LoadResourceError` – if resource import failed

get_datafiles(mode='r')[source]¶

Generate readable file objects for ‘datafiles’ in resource.

Parameters:	mode (str) – Mode for `open()`
Returns:	Generates `file` objects.
Return type:	generator

id¶: Get resource identifier.

name¶: Get resource name (first alias).

type¶: Get resource type.

class idaes.dmf.resource.ResourceImporter(path: pathlib.Path, do_copy: bool = None)[source]¶

Base class for Resource importers.

create() → idaes.dmf.resource.Resource[source]¶: Factory method.

class idaes.dmf.resource.SerializedResourceImporter(path, parsed, **kwargs)[source]¶

idaes.dmf.resource.TY_CODE = 'code'¶: Resource type for source code

idaes.dmf.resource.TY_DATA = 'data'¶: Resource type for generic data

idaes.dmf.resource.TY_EXPERIMENT = 'experiment'¶: Resource type for experiments

idaes.dmf.resource.TY_FLOWSHEET = 'flowsheet'¶: Resource type for a process flowsheet

idaes.dmf.resource.TY_JSON = 'json'¶: Resource type for JSON data

idaes.dmf.resource.TY_NOTEBOOK = 'notebook'¶: Resource type for a Jupyter Notebook

idaes.dmf.resource.TY_OTHER = 'other'¶: Resource type for unspecified type of resource

idaes.dmf.resource.TY_PROPERTY = 'propertydb'¶: Resource type for property data

idaes.dmf.resource.TY_RESOURCE_JSON = 'resource_json'¶: Resource type for a JSON serialized resource

idaes.dmf.resource.TY_SURRMOD = 'surrogate_model'¶: Resource type for a surrogate model

idaes.dmf.resource.TY_TABULAR = 'tabular_data'¶: Resource type for tabular data

class idaes.dmf.resource.TidyUnitData(data: dict = None, variables: List[T] = None, units: List[T] = None, observations: List[T] = None)[source]¶

Handle “tidy data” with per-column units.

This can be used to convert from a simple dictionary/json representation like this:

{
  "variables": ["compound", "pressure"],
  "units": [null|None, "Pa"],
  "observations": [
    ["benzene", 4890000.0],
    ...etc..
  ]
}

into a pandas DataFrame. A convenience method is provided for returning the data in a format easily dealt with when creating unit block parameters. Note that the keys in the preceding dictionary match the names of the parameters in the constructor (so you can pass this directly in as ‘**arg’).

units¶

Units for each column, None where no units are defined

Type:	list

table¶

The observation data

Type:	pandas.DataFrame

param_data¶

Data in a form easily consumed by unit block params.

The dictionary returned is like { (key1, key2, ..): value }, where the keys are values from all columns except the last, and the value is the last column.
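
The shape of param_data can be illustrated without pandas. The helper below is a hypothetical stand-in, not the TidyUnitData property: it always uses a tuple key, even for a single key column, which is an assumption. The second observation row is made up for the example.

```python
data = {
    "variables": ["compound", "pressure"],
    "units": [None, "Pa"],
    "observations": [
        ["benzene", 4890000.0],
        ["example_compound", 1.0],  # made-up row for illustration
    ],
}


def param_data(observations):
    """Map all columns except the last (as a tuple key) to the last column."""
    return {tuple(row[:-1]): row[-1] for row in observations}


params = param_data(data["observations"])
print(params[("benzene",)])  # 4890000.0
```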

class idaes.dmf.resource.Triple(subject, predicate, object)¶

Provide attribute access to an RDF subject, predicate, object triple

object¶: Alias for field number 2

predicate¶: Alias for field number 1

subject¶: Alias for field number 0

idaes.dmf.resource.create_relation(rel)[source]¶

Create a relationship between two Resource instances.

Relations are stored in both the subject and object resources, in the following way:

If R = (subject)S, (predicate)P, and (object)O
then store the following:
  In S.relations: {predicate: P, identifier:O.id, role:subject}
  In O.relations: {predicate: P, identifier:S.id, role:object}

Parameters:	rel (Triple) – Relation triple. The ‘subject’ and ‘object’ parts should be `Resource`, and the ‘predicate’ should be a simple string.
Returns:	None
Raises:	`ValueError` – if this relation already exists in the subject or object resource, or the predicate is not in the list of valid ones in RELATION_PREDICATES
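The storage rule above can be sketched with plain dictionaries standing in for `Resource` objects. Everything here is illustrative: `make_resource`, `create_relation_sketch`, and the contents of `VALID_PREDICATES` are hypothetical stand-ins, not the library's code or its actual predicate list.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Hypothetical stand-in for RELATION_PREDICATES; the real values are defined
# in idaes.dmf.resource and are not listed in this section.
VALID_PREDICATES = {"uses", "contains"}


def make_resource(id_):
    """Stand-in for a Resource: just an id and a list of stored relations."""
    return {"id": id_, "relations": []}


def create_relation_sketch(rel):
    """Record the triple in both the subject and the object resource."""
    if rel.predicate not in VALID_PREDICATES:
        raise ValueError(f"Bad predicate: {rel.predicate!r}")
    in_subject = {"predicate": rel.predicate,
                  "identifier": rel.object["id"], "role": "subject"}
    in_object = {"predicate": rel.predicate,
                 "identifier": rel.subject["id"], "role": "object"}
    if (in_subject in rel.subject["relations"]
            or in_object in rel.object["relations"]):
        raise ValueError("Relation already exists")
    rel.subject["relations"].append(in_subject)
    rel.object["relations"].append(in_object)


flowsheet = make_resource("a" * 32)
dataset = make_resource("b" * 32)
create_relation_sketch(Triple(flowsheet, "uses", dataset))
```

Storing the entry on both sides means either resource can list its relations without consulting the other.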

idaes.dmf.resource.create_relation_args(*args)[source]¶: Syntactic sugar to take 3 args instead of a Triple.

idaes.dmf.resource.date_float(value)[source]¶: Convert a date to a floating point seconds since the UNIX epoch.

idaes.dmf.resource.identifier_str(value=None, allow_prefix=False)[source]¶

Generate or validate a unique identifier.

If generating, you will get a UUID in hex format

>>> identifier_str()  
'...'

If validating, anything that is not 32 lowercase letters or digits will fail.

>>> identifier_str('A' * 32)   
Traceback (most recent call last):
ValueError: Bad format for identifier "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA":
must match regular expression "[0-9a-f]{32}"

Parameters:	value (str) – If given, validate that it is a 32-byte str. If not given or None, generate a new random value.
Raises:	`ValueError` – if a value is given and it is invalid.
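The generate-or-validate behavior can be approximated with the standard library. This is a sketch under assumptions: the name `identifier_sketch` is hypothetical, the `allow_prefix` option is not modeled, and only the regular expression quoted in the error message above is taken from the documentation.

```python
import re
import uuid

_ID_RE = re.compile(r"[0-9a-f]{32}")


def identifier_sketch(value=None):
    """Return a new hex UUID, or validate and return the given value."""
    if value is None:
        return uuid.uuid4().hex  # 32 lowercase hex characters
    if not _ID_RE.fullmatch(value):
        raise ValueError(
            f'Bad format for identifier "{value}": '
            f'must match regular expression "{_ID_RE.pattern}"'
        )
    return value


new_id = identifier_sketch()
```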

idaes.dmf.resource.schema_as_yaml()[source]¶: Export resource schema as YAML suitable for embedding into, e.g., an OpenAPI spec.

idaes.dmf.resource.triple_from_resource_relations(id_, rrel)[source]¶

Create a Triple from one entry in resource[‘relations’].

Parameters:

id_ (str) – Identifier of the containing resource.
rrel (dict) – Stored relation with three keys, see create_relation().

Returns:	A triple
Return type:	Triple
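Reading a stored entry back into a triple is the inverse of the rule given under create_relation(). The sketch below assumes the triple carries identifiers rather than `Resource` objects; the function name and the predicate value in the usage are hypothetical.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])


def triple_from_entry_sketch(id_, rrel):
    """Rebuild (subject, predicate, object) from one stored relation entry."""
    if rrel["role"] == "subject":
        # The containing resource is the subject; the entry points at the object.
        return Triple(id_, rrel["predicate"], rrel["identifier"])
    # Otherwise the containing resource is the object of the relation.
    return Triple(rrel["identifier"], rrel["predicate"], id_)


entry = {"predicate": "uses", "identifier": "b" * 32, "role": "subject"}
triple = triple_from_entry_sketch("a" * 32, entry)
```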

idaes.dmf.resource.version_list(value)[source]¶

Semantic version.

Three numeric identifiers, separated by a dot. Trailing non-numeric characters allowed.

Inputs, string or tuple, may have fewer than three numeric identifiers; missing ones are padded with zeros so that the result always has length four (three numbers followed by the trailing string).

A leading dash or underscore in the trailing non-numeric characters is removed.

Some examples of valid inputs and how they translate to 4-part versions:

>>> version_list('1')
[1, 0, 0, '']
>>> version_list('1.1')
[1, 1, 0, '']
>>> version_list('1a')
[1, 0, 0, 'a']
>>> version_list('1.12.1')
[1, 12, 1, '']
>>> version_list('1.12.13-1')
[1, 12, 13, '1']

Some examples of invalid inputs:

>>> for bad_input in ('rc3',      # too short
...                   '1.a.1.',   # non-number in middle
...                   '1.12.13.x' # too long
...     ):
...     try:
...         version_list(bad_input)
...     except ValueError:
...         print(f"failed: {bad_input}")
...
failed: rc3
failed: 1.a.1.
failed: 1.12.13.x

Returns:	[major:int, minor:int, debug:int, release-type:str]
Return type:	list
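One way to reproduce the string examples above is a single regular expression. This is a sketch rather than the library's parser: `version_list_sketch` is a hypothetical name, tuple inputs are not handled, and the pattern was chosen only to agree with the valid and invalid examples listed here.

```python
import re

# major[.minor[.patch]], an optional leading dash/underscore that is dropped,
# then trailing characters that may not contain a dot.
_VERSION_RE = re.compile(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?[-_]?([^.]*)")


def version_list_sketch(value):
    """Parse a version string into [major, minor, patch, release-type]."""
    match = _VERSION_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"Bad version: {value!r}")
    major, minor, patch, extra = match.groups()
    # Missing numeric parts are padded with zeros.
    return [int(major), int(minor or 0), int(patch or 0), extra]


parsed = version_list_sketch("1.12.13-1")
```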

idaes.dmf.resourcedb module¶

Resource database.

class idaes.dmf.resourcedb.ResourceDB(dbfile=None, connection=None)[source]¶

A database interface to all the resources within a given DMF workspace.

delete(id_=None, idlist=None, filter_dict=None, internal_ids=False)[source]¶

Delete one or more resources with given identifiers.

Parameters:

id_ (Union[str,int]) – If given, delete this id.
idlist (list) – If given, delete ids in this list.
filter_dict (dict) – If given, perform a search and delete ids it finds.
internal_ids (bool) – If True, treat identifiers as numeric (internal) identifiers. Otherwise treat them as resource (string) identifiers.

Returns:	(list[str]) Identifiers

find(filter_dict, id_only=False, flags=0)[source]¶

Find and return records based on the provided filter.

Parameters:

filter_dict (dict) – Search filter. For syntax, see docs in `dmf.DMF.find()`.
id_only (bool) – If true, return only the identifier of each resource; otherwise a Resource object is returned.
flags (int) – Flag values for, e.g., regex searches

Returns:	generator of int|Resource, depending on the value of id_only

find_one(*args, **kwargs)[source]¶: Same as find(), but returning only first value or None.

find_related(id_, filter_dict=None, outgoing=True, maxdepth=0, meta=None)[source]¶

Find all resources connected to the identified one.

Parameters:

id_ (str) – Unique ID of target resource.
filter_dict (dict) – Filter to these resources
outgoing –
maxdepth –
meta (List[str]) – Metadata fields to extract

Returns:	Generator of (depth, relation, metadata)
Raises:	KeyError if the resource is not found.

get(identifier)[source]¶

Get a resource by identifier.

Parameters:	identifier – Internal identifier
Returns:	(Resource) A resource or None

put(resource)[source]¶

Put this resource into the database.

Parameters:	resource (Resource) – The resource to add
Returns:	None
Raises:	`errors.DuplicateResourceError` – If there is already a resource in the database with the same “id”.

update(id_, new_dict)[source]¶

Update the identified resource with new values.

Parameters:

id_ (int) – Identifier of resource to update
new_dict (dict) – New dictionary of resource values

Returns:	None

Raises:

`ValueError` – If new resource is of wrong type
`KeyError` – If old resource is not found

idaes.dmf.surrmod module¶

Surrogate modeling helper classes and functions. This is used to run ALAMO on property data.

class idaes.dmf.surrmod.SurrogateModel(experiment, **kwargs)[source]¶

Run ALAMO to generate surrogate models.

Automatically track the objects in the DMF.

Example:

model = SurrogateModel(dmf, simulator='linsim.py')
rsrc = dmf.fetch_one(1) # get resource ID 1
data = rsrc.property_table.data
model.set_input_data(data, ['temp'], 'density')
results = model.run()

PARAM_DATA_KEY = 'parameters'¶: Key in resource ‘data’ for params

run(**kwargs)[source]¶

Run ALAMO.

Parameters:	**kwargs – Additional arguments merged with those passed to the class constructor. Any duplicate values will override the earlier ones.
Returns:	The dictionary returned from `alamopy.doalamo()`
Return type:	dict
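The merge rule for keyword arguments (later values override earlier ones) can be sketched with dictionary unpacking. The class below is a hypothetical stand-in, not `SurrogateModel`, and the `verbose` option is invented for the example; only `simulator='linsim.py'` appears in the class example above.

```python
class RunnerSketch:
    """Hypothetical stand-in showing how constructor and run() kwargs combine."""

    def __init__(self, **kwargs):
        self._kwargs = kwargs

    def run(self, **kwargs):
        # Later keys win: values passed to run() override constructor values.
        return {**self._kwargs, **kwargs}


runner = RunnerSketch(simulator="linsim.py", verbose=False)
merged = runner.run(verbose=True)
```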

set_input_data(data, x_colnames, z_colname)[source]¶

Set input from provided dataframe or property data.

Parameters:

data (PropertyData|pandas.DataFrame) – Input data
x_colnames (List[str]|str) – One or more column names for parameters
z_colname (str) – Column for response variable

Returns:	None
Raises:	`KeyError` – if columns are not found in data

set_input_data_np(x, z, xlabels=None, zlabel='z')[source]¶

Set input data from numpy arrays.

Parameters:

x (arr) – Numpy array with parameters
z (arr) – Numpy array with response variables
xlabels (List[str]) – List of labels for x
zlabel (str) – Label for z

Returns:	None

set_validation_data(data, x_colnames, z_colname)[source]¶

Set validation data from provided data.

Parameters:

data (PropertyData|pandas.DataFrame) – Input data
x_colnames (List[str]|str) – One or more column names for parameters
z_colname (str) – Column for response variable

Returns:	None
Raises:	`KeyError` – if columns are not found in data

set_validation_data_np(x, z, xlabels=None, zlabel='z')[source]¶

Set validation data from numpy arrays.

Parameters:

x (arr) – Numpy array with parameters
z (arr) – Numpy array with response variables
xlabels (List[str]) – List of labels for x
zlabel (str) – Label for z

Returns:	None

idaes.dmf.tabular module¶

Tabular data handling

class idaes.dmf.tabular.Column(name, data)[source]¶: Generic, abstract column

class idaes.dmf.tabular.Fields[source]¶

Constants for field names.

DATA_NAME = 'name'¶: Keys for data mapping

class idaes.dmf.tabular.Metadata(values=None)[source]¶

Class to import metadata.

author¶: Publication author(s).

date¶: Publication date

static from_csv(file_or_path)[source]¶

Import metadata from simple text format.

Example input:

Source,Han, J., Jin, J., Eimer, D.A., Melaaen, M.C.,"Density of Water(1) + Monoethanolamine(2) + CO2(3) from (298.15 to 413.15) K and Surface Tension of Water(1) + Monethanolamine(2) from (303.15 to 333.15)K", J. Chem. Eng. Data, 2012, Vol. 57, pg. 1095-1103"
Retrieval,"J. Morgan, date unknown"
Notes,r is MEA weight fraction in aqueous soln. (CO2-free basis)

Parameters:	file_or_path (str or file) – Input file
Returns:	(PropertyMetadata) New instance

info¶: Publication venue, etc.

source¶: Full publication info.

title¶: Publication title.

class idaes.dmf.tabular.Table(data=None, metadata=None)[source]¶

Tabular data and metadata together (at last!)

as_dict()[source]¶

Represent as a Python dictionary.

Returns:	(dict) Dictionary representation

dump(fp, **kwargs)[source]¶

Dump to file as JSON. Convenience method, equivalent to converting to a dict and calling json.dump().

Parameters:

fp (file) – Write output to this file
**kwargs – Keywords passed to json.dump()

Returns:	see json.dump()

dumps(**kwargs)[source]¶

Dump to string as JSON. Convenience method, equivalent to converting to a dict and calling json.dumps().

Parameters:	**kwargs – Keywords passed to json.dumps()
Returns:	(str) JSON-formatted data

classmethod load(file_or_path, validate=True)[source]¶

Create from JSON input.

Parameters:

file_or_path (file or str) – Filename or file object from which to read the JSON-formatted data.
validate (bool) – If true, apply validation to input JSON data.

Example input:

{
    "meta": [{
        "datatype": "MEA",
        "info": "J. Chem. Eng. Data, 2009, Vol 54, pg. 3096-30100",
        "notes": "r is MEA weight fraction in aqueous soln.",
        "authors": "Amundsen, T.G., Lars, E.O., Eimer, D.A.",
        "title": "Density and Viscosity of Monoethanolamine + etc."
    }],
    "data": [
        {
            "name": "Viscosity Value",
            "units": "mPa-s",
            "values": [2.6, 6.2],
            "error_type": "absolute",
            "errors": [0.06, 0.004],
            "type": "property"
        }
    ]
}
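The shape of this input can be explored with the standard `json` module. The sketch below is not `Table.load`: the function name is hypothetical, and the two checks it performs (required top-level keys, equal lengths of "values" and "errors") are assumed examples of validation, not the library's actual schema.

```python
import json

EXAMPLE_INPUT = """
{
    "meta": [{"datatype": "MEA",
              "notes": "r is MEA weight fraction in aqueous soln."}],
    "data": [{"name": "Viscosity Value", "units": "mPa-s",
              "values": [2.6, 6.2], "error_type": "absolute",
              "errors": [0.06, 0.004], "type": "property"}]
}
"""


def load_table_sketch(text, validate=True):
    """Parse table JSON and optionally apply a few illustrative checks."""
    doc = json.loads(text)
    if validate:
        for key in ("meta", "data"):
            if key not in doc:
                raise ValueError(f"missing top-level key: {key}")
        for column in doc["data"]:
            if "errors" in column and len(column["errors"]) != len(column["values"]):
                raise ValueError(f"column {column['name']!r}: errors/values length mismatch")
    return doc


table = load_table_sketch(EXAMPLE_INPUT)
first_column = table["data"][0]
```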

class idaes.dmf.tabular.TabularData(data, error_column=False)[source]¶

Class representing tabular data that knows how to construct itself from a CSV file.

You can build objects from multiple CSV files as well. See the property database section of the API docs for details, or read the code in add_csv() and the tests in idaes_dmf.propdb.tests.test_mergecsv.

as_arr()[source]¶

Export property data as arrays.

Returns:	(values[M,N], errors[M,N]) Two arrays of floats, each with M columns having N values.
Raises:	ValueError if the columns are not all the same length

as_list()[source]¶

Export the data as a list.

Output will be in same form as data passed to constructor.

Returns:	(list) List of dicts

errors_dataframe()[source]¶

Get errors as a dataframe.

Returns:	Pandas dataframe for errors.
Return type:	pd.DataFrame
Raises:	`ImportError` – If pandas or numpy were never successfully imported.

static from_csv(file_or_path, error_column=False)[source]¶

Import the CSV data.

Expected format of the files is a header plus data rows.

Header: Index-column, Column-name(1), Error-column(1), Column-name(2), Error-column(2), ..
Data: <index>, <val>, <errval>, <val>, <errval>, ..

Column-name is in the format “Name (units)”

Error-column is in the format “<type> Error”, where “<type>” is the error type.

Parameters:

file_or_path (file-like or str) – Input file
error_column (bool) – If True, look for an error column after each value column. Otherwise, all columns are assumed to be values.

Returns:	New table of data
Return type:	TabularData
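The header conventions above ("Name (units)" and "<type> Error") can be parsed with the standard `csv` and `re` modules. This is a sketch under assumptions, not `TabularData.from_csv`: the function name is hypothetical, the output keys are borrowed from the JSON example shown for `Table.load`, and lowercasing the error type is a guess.

```python
import csv
import io
import re

_NAME_UNITS = re.compile(r"\s*(?P<name>.*?)\s*\((?P<units>[^)]*)\)\s*$")
_ERROR_COL = re.compile(r"\s*(?P<type>\S+)\s+Error\s*$", re.IGNORECASE)


def parse_csv_sketch(text, error_column=False):
    """Split a header-plus-rows CSV into per-column dictionaries."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    header, body = rows[0], rows[1:]
    step = 2 if error_column else 1
    columns = []
    for i in range(1, len(header), step):  # column 0 is the index column
        name_match = _NAME_UNITS.match(header[i])
        if name_match is None:
            raise ValueError(f"bad column name: {header[i]!r}")
        column = {
            "name": name_match.group("name"),
            "units": name_match.group("units"),
            "values": [float(row[i]) for row in body],
        }
        if error_column:
            if i + 1 >= len(header):
                raise ValueError(f"no error column after {header[i]!r}")
            error_match = _ERROR_COL.match(header[i + 1])
            if error_match is None:
                raise ValueError(f"bad error column: {header[i + 1]!r}")
            column["error_type"] = error_match.group("type").lower()
            column["errors"] = [float(row[i + 1]) for row in body]
        columns.append(column)
    return columns


csv_text = "Num,Viscosity Value (mPa-s),Absolute Error\n1,2.6,0.06\n2,6.2,0.004\n"
columns = parse_csv_sketch(csv_text, error_column=True)
```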

get_column(key)[source]¶

Get an object for the given named column.

Parameters:	key (str) – Name of column
Returns:	(TabularColumn) Column object.
Raises:	`KeyError` – No column by that name.

get_column_index(key)[source]¶

Get an index for the given named column.

Parameters:	key (str) – Name of column
Returns:	(int) Column number.
Raises:	`KeyError` – No column by that name.

names()[source]¶

Get column names.

Returns:	List of column names.
Return type:	list[str]

num_columns¶

Number of columns in this table.

A “column” is defined as data + error. So if there are two columns of data, each with an associated error column, then num_columns is 2 (not 4).

Returns:	Number of columns.
Return type:	int

num_rows¶

Number of rows in this table.

obj.num_rows is a synonym for len(obj)

Returns:	Number of rows.
Return type:	int

values_dataframe()[source]¶

Get values as a dataframe.

Returns:	(pd.DataFrame) Pandas dataframe for values.
Raises:	`ImportError` – If pandas or numpy were never successfully imported.

class idaes.dmf.tabular.TabularObject[source]¶

Abstract Property data class.

as_dict()[source]¶: Return Python dict representation.

idaes.dmf.userapi module¶

Data Management Framework high-level functions.

idaes.dmf.userapi.find_property_packages(dmf, properties=None)[source]¶

Find all property packages matching provided criteria.

Return the matching packages as a generator.

Parameters:

dmf (DMF) – Data Management Framework instance
properties (List[str]) – Names of properties that must be present in the returned packages.

Returns:

Each returned object holds the property data (properties and default units) in its .data attribute.

Return type:

Generator[idaes.dmf.resource.Resource]

idaes.dmf.userapi.get_workspace(path='', name=None, desc=None, create=False, errs=None, **kwargs)[source]¶

Create or load a DMF workspace.

If the DMF constructor throws an exception, this catches it and prints the error to the provided stream (or stdout).

See DMF for details on arguments.

Parameters:

path (str) – Path to workspace.
name (str) – Name to be used for workspace.
desc (str) – Longer description of workspace.
create (bool) – If the path to the workspace does not exist, this controls whether to create it.
errs (object) – Stream for errors, stdout is used if None

Returns:	New instance, or None if it failed.
Return type:	DMF

idaes.dmf.util module¶

Utility functions.

class idaes.dmf.util.ColorTerm(enabled=True)[source]¶

For colorized printing, a very simple wrapper that allows colorama objects, or nothing, to be used.

class EmptyStr[source]¶: Return an empty string on any attribute requested.

class idaes.dmf.util.TempDir(*args)[source]¶: Simple context manager for mkdtemp().

idaes.dmf.util.datetime_timestamp(v)[source]¶

Get numeric timestamp. This will work under both Python 2 and 3.

Parameters:	v (datetime.datetime) – Date/time value
Returns:	(float) Floating point timestamp

idaes.dmf.util.get_file(file_or_path, mode='r')[source]¶: Open a file for reading, or simply return the file object.

idaes.dmf.util.get_module_author(mod)[source]¶

Find and return the module author.

Parameters:	mod (module) – Python module
Returns:	(str) Author string or None if not found
Raises:	`nothing`

idaes.dmf.util.get_module_version(mod)[source]¶

Find and return the module version.

Version must look like a semantic version with <a>.<b>.<c> parts; there can be arbitrary extra stuff after the <c>. For example:

0.12
3.6
2.3-alpha-rel0

Parameters:	mod (module) – Python module
Returns:	(str) Version string or None if not found
Raises:	ValueError if version is found but not valid

idaes.dmf.util.is_jupyter_notebook(filename, check_contents=True)[source]¶: See if this is a Jupyter notebook.

idaes.dmf.util.is_python(filename)[source]¶: See if this is a Python file. Do not import the source code.

idaes.dmf.util.is_resource_json(filename, max_bytes=1000000.0)[source]¶

Is this file a JSON Resource?

Parameters:

filename (str) – Full path to file
max_bytes (int) – Max. allowable size. Since we try to parse the file, this saves potential DoS issues. Large files are a bad idea anyway, since this is metadata and may be stored somewhere with a record size limit (like MongoDB).

Returns:	(bool) Whether it’s a resource JSON file.

idaes.dmf.util.mkdir_p(path, *args)[source]¶

Try to create all non-existent components of a path.

Parameters:

path (str) – Path to create
args – Other arguments for os.mkdir().

Returns:	None
Raises:	`os.error` – Raised from os.mkdir()

idaes.dmf.util.uuid_prefix_len(uuids, step=4, maxlen=32)[source]¶

Get smallest multiple of step len prefix that gives unique values.

The algorithm is not fancy, but good enough: build sets of the ids at increasing prefix lengths until the set has all ids (no duplicates). Experimentally this takes ~0.1 ms for 1000 duplicate ids (the worst case).
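The set-building approach can be sketched in a few lines. This is an illustration of the described algorithm, not the library's code: the function name is hypothetical, and falling back to `maxlen` when no shorter prefix is unique is an assumption.

```python
def uuid_prefix_len_sketch(uuids, step=4, maxlen=32):
    """Return the smallest multiple of `step` whose prefixes are all distinct."""
    distinct = len(set(uuids))
    for length in range(step, maxlen, step):
        # If truncating loses no ids, this prefix length is enough.
        if len({u[:length] for u in uuids}) == distinct:
            return length
    return maxlen


ids = ["aaaa" + "1" * 28, "aaaa" + "2" * 28, "bbbb" + "3" * 28]
prefix_len = uuid_prefix_len_sketch(ids)
```

Here the first two ids share their first four characters, so the sketch moves on to the next multiple of the step.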

idaes.dmf.workspace module¶

Workspace classes and functions.

class idaes.dmf.workspace.Fields[source]¶: Workspace configuration fields.

class idaes.dmf.workspace.Workspace(path, create=False, add_defaults=False, html_paths=None)[source]¶

DMF Workspace.

In essence, a workspace is some information at the root of a directory tree, a database (currently file-based, so also in the directory tree) of Resources, and a set of files associated with these resources.

Workspace Configuration

When the DMF is initialized, the workspace is given as a path to a directory. That directory contains a special file named config.yaml, which holds metadata about the workspace. The existence of a file by that name is taken by the DMF code as an indication that the containing directory is a DMF workspace:

/path/to/dmf: Root DMF directory
 |
 +- config.yaml: Configuration file
 +- resourcedb.json: Resource metadata "database" (uses TinyDB)
 +- files: Data files for all resources

The configuration file is a YAML-formatted file that defines the following key/value pairs:

_id

Unique identifier for the workspace. This is auto-generated by the library.

name

Short name for the workspace.

description

Possibly longer text describing the workspace.

created

Date at which the workspace was created, as a string in ISO 8601 format.

modified

Date at which the workspace was last modified, as a string in ISO 8601 format.

htmldocs

Full path to the location of the built (not source) Sphinx HTML documentation for the idaes_dmf package. See DMF Help Configuration for more details.

There are many different possible “styles” of formatting a list of values in YAML, but we prefer the simple block-indented style, where the key is on its own line and the values are each indented with a dash:

_id: fe5372a7e51d498fb377da49704874eb
created: '2018-07-16 11:10:44'
description: A bottomless trashcan
modified: '2018-07-16 11:10:44'
name: Oscar the Grouch's Home
htmldocs:
- '{dmf_root}/doc/build/html/dmf'
- '{dmf_root}/doc/build/html/models'

Any paths in the workspace configuration, e.g., for the “htmldocs”, can use two special variables that will take on values relative to the workspace location. This avoids hardcoded paths and makes the workspace more portable across environments. {ws_root} will be replaced with the path to the workspace directory, and {dmf_root} will be replaced with the path to the (installed) DMF package.
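The variable substitution can be pictured as a plain string replacement. The sketch below is illustrative only: the function name and both root paths are hypothetical, and the library may perform the expansion differently.

```python
def expand_path_sketch(path, ws_root, dmf_root):
    """Replace the two special variables with concrete directories."""
    return path.replace("{ws_root}", ws_root).replace("{dmf_root}", dmf_root)


htmldocs = [
    "{dmf_root}/doc/build/html/dmf",
    "{ws_root}/docs",  # hypothetical workspace-relative entry
]
expanded = [
    expand_path_sketch(p, ws_root="/home/user/my-workspace", dmf_root="/opt/idaes/dmf")
    for p in htmldocs
]
```

Using `str.replace` rather than `str.format` leaves any other braces in a path untouched.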

The config.yaml file will allow keys and values it does not know about. These will be accessible, loaded into a Python dictionary, via the meta attribute on the Workspace instance. This may be useful for passing additional user-defined information into the DMF at startup.

CONF_CREATED = 'created'¶: Configuration field for created date

CONF_DESC = 'description'¶: Configuration field for description

CONF_MODIFIED = 'modified'¶: Configuration field for modified date

CONF_NAME = 'name'¶: Configuration field for name

ID_FIELD = '_id'¶: Name of ID field

WORKSPACE_CONFIG = 'config.yaml'¶: Name of configuration file placed in WORKSPACE_DIR

configuration_file¶: Configuration file path.

get_doc_paths()[source]¶

Get paths to generated HTML Sphinx docs.

Returns:	(list) Paths or empty list if not found.

meta¶

Get metadata.

This reads and parses the configuration. Therefore, one way to force a config refresh is to simply refer to this property, e.g.:

dmf = DMF(path='my-workspace')
#  ... do stuff that alters the config ...
dmf.meta  # re-read/parse the config

Returns:	(dict) Metadata for this workspace.

root¶: Root path for this workspace. This is the path containing the configuration file.

set_doc_paths(paths: List[str], replace: bool = False)[source]¶

Set paths to generated HTML Sphinx docs.

Parameters:

paths – New paths to add.
replace – If True, replace any existing paths. Otherwise merge new paths with existing ones.

set_meta(values, remove=None)[source]¶

Update metadata with new values.

Parameters:

values (dict) – Values to add or change
remove (list) – Keys of values to remove.

wsid¶

Get workspace identifier (from config file).

Returns:	Unique identifier.
Return type:	str

idaes.dmf.workspace.find_workspaces(root)[source]¶

Find workspaces at or below ‘root’.

Parameters:	root (str) – Path to start at
Returns:	paths, which are all workspace roots.
Return type:	List[str]
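Since a directory counts as a workspace when it contains config.yaml, the search can be sketched with `os.walk`. This is an assumed approach, not the library's implementation; the function name is hypothetical, and the temporary directory tree exists only to make the example runnable.

```python
import os
import tempfile

WORKSPACE_CONFIG = "config.yaml"


def find_workspaces_sketch(root):
    """Return every directory at or below `root` that holds a config file."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if WORKSPACE_CONFIG in filenames:
            found.append(dirpath)
    return found


with tempfile.TemporaryDirectory() as top:
    workspace = os.path.join(top, "project", "ws1")
    os.makedirs(workspace)
    # An empty config file is enough to mark the directory as a workspace here.
    open(os.path.join(workspace, WORKSPACE_CONFIG), "w").close()
    os.makedirs(os.path.join(top, "not_a_workspace"))
    found = find_workspaces_sketch(top)
    expected = [workspace]
```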