DMF Tables

Overview

Table handling for DMF.

The main class defined here is Table. It provides methods for reading tables from Excel and CSV files. A convention for indicating units in column headers lets this code split the units from the column name. Other methods add tables to, and extract tables from, DMF idaes.core.dmf.resource.Resource objects.

In the simplest case, you would create a new DMF resource for a CSV table like this:

from idaes.core.dmf.resource import Resource
resource = Resource()
resource.add_table("my_file.csv")
# you can now save this resource in the DMF

Then you could retrieve and use that table like this:

# retrieve resource from the DMF
table = resource.tables["my_file.csv"]
dataframe = table.data    # Pandas dataframe
units = table.units       # Units extracted from header row (strings)

See also, on the DMF Resource class:

  • idaes.core.dmf.resource.Resource.add_table()

  • idaes.core.dmf.resource.Resource.tables

Table class

class idaes.core.dmf.tables.Table

Represent a table stored in the DMF.

Tables are expected to have a header row with optional units, which if present are encoded in [square brackets]. Whitespace is ignored between the column name and the units. For example:

T [C], P [bar], G0/RT H2O, G0/RT NaCl [-], A phi [(kg/mol)^0.5]
0, 1, -23.4638, -13.836, 0.3767

UNITS_REGEX = '\n        (?P<name>[^[]+) # column name\n        (?:\\s*\\[        # start of [units] section\n        (?P<units>.*?)  # column units\n        \\])?            # end of [units] section, which is optional\n        '

Regular expression for extracting units from column names. In plain English, the following forms are expected for a column name: “Name”, “Name[Units]”, “Longer Name With $% Chars [ Units ]”. For both the Name and the Units, any sequence of characters valid in the current encoding is acceptable (except, of course, a “[” in the name, which marks the start of the units).
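The header convention can be tried out on its own with the standard re module. The following is a minimal sketch, not the library's implementation: the pattern is copied from the UNITS_REGEX attribute shown above, while compiling it in verbose mode, stripping surrounding whitespace, returning an empty string for missing units, and the helper name split_header are all assumptions made for illustration.

```python
import re

# Pattern copied from the UNITS_REGEX attribute shown above; compiling it in
# verbose mode (re.X) is an assumption about how the library uses it.
UNITS_REGEX = r"""
    (?P<name>[^[]+) # column name
    (?:\s*\[        # start of [units] section
    (?P<units>.*?)  # column units
    \])?            # end of [units] section, which is optional
"""
_pattern = re.compile(UNITS_REGEX, re.X)


def split_header(header):
    """Hypothetical helper: split one column header into (name, units)."""
    match = _pattern.match(header.strip())
    if match is None:
        raise ValueError(f"Cannot parse column header: {header!r}")
    name = match.group("name").strip()
    # A header without a [units] section yields None; map it to "".
    units = (match.group("units") or "").strip()
    return name, units


headers = ["T [C]", "P [bar]", "G0/RT H2O", "G0/RT NaCl [-]"]
parsed = [split_header(h) for h in headers]
```

Because the name group is greedy and accepts whitespace, the captured name keeps any trailing spaces before the “[”, which is why the sketch strips both parts.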

add_to_resource(rsrc)

Add the current table, inline, to the given resource.

Parameters

rsrc (Resource) – A DMF Resource instance

Returns

None

as_dict(values=True)

Get the representation of this table as a dict.

Parameters

values – If True, include the values in the dict. Otherwise only include the units for each column.

Returns

Dictionary with the structure accepted by from_dict(). If the “values” argument is False, that key will be missing from the dict for each column.

Return type

Dict
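To make the effect of the values argument concrete, here is a small sketch of the two dictionary shapes described above. It works on plain dicts and does not call the library; the helper name without_values and the sample numbers are illustrative assumptions.

```python
def without_values(table_dict):
    """Hypothetical helper: drop the 'values' key from every column."""
    return {
        name: {key: val for key, val in column.items() if key != "values"}
        for name, column in table_dict.items()
    }


# Shape described for values=True (sample numbers are made up).
full = {
    "T": {"units": "C", "values": [0, 25]},
    "P": {"units": "bar", "values": [1, 1]},
}

# Shape described for values=False: only the units remain per column.
units_only = without_values(full)
```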

property data: DataFrame

Pandas dataframe for data.

classmethod from_dict(data)

Create a new Table object from a dictionary of data and units.

Parameters

data (Dict) –

Dictionary with the following structure:

{
    'column-name-1': {
        'units': 'unit',
        'values': [ value, value, .. ]
    },
    'column-name-2': {
        'units': 'unit',
        'values': [ value, value, .. ]
    },
    ...etc...
}

Returns

Table object

Return type

Table
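As a rough illustration of the structure that from_dict() accepts, the sketch below builds such a dictionary from CSV text using only the standard library. The helper names, the simplified bracket splitting (used here instead of UNITS_REGEX), and the conversion of every value to float are assumptions for this example, not behavior of the Table class.

```python
import csv
import io


def split_header(header):
    """Split 'Name [units]' into (name, units); units default to ''."""
    name, _, rest = header.partition("[")
    units = rest.rpartition("]")[0] if rest else ""
    return name.strip(), units.strip()


def csv_text_to_table_dict(text):
    """Hypothetical helper: build the from_dict()-style structure."""
    rows = list(csv.reader(io.StringIO(text), skipinitialspace=True))
    header, body = rows[0], rows[1:]
    table = {}
    for index, column in enumerate(header):
        name, units = split_header(column)
        table[name] = {
            "units": units,
            # Assumes every cell is numeric, which real tables may not be.
            "values": [float(row[index]) for row in body],
        }
    return table


text = "T [C], P [bar], G0/RT H2O\n0, 1, -23.4638\n"
table_dict = csv_text_to_table_dict(text)
```

The resulting table_dict has one entry per column, each with a 'units' string and a 'values' list, matching the layout shown above.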

classmethod from_resource(rsrc)

Get the tables stored in the given resource, as instances of this class.

Parameters

rsrc (Resource) – A DMF Resource instance

Returns

Dictionary of the tables in the resource. If there is only one inline table, the dictionary has length one and its only key is “” (the empty string). If there are multiple tables referenced by file, the dictionary keys are the (relative) file names.

Raises

KeyError – if there are no tables in this resource

Return type

Dict[str, Table]
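Since the returned dictionary is keyed either by the empty string or by relative file names, calling code usually has to decide which entry it wants. One possible way to handle both cases is sketched below; select_table is a hypothetical helper, and plain strings stand in for Table objects so the example runs without the DMF.

```python
def select_table(tables, name=None):
    """Hypothetical helper: pick one entry from a {name: table} dict."""
    if name is not None:
        # Raises KeyError if the file name is not in the dictionary.
        return tables[name]
    if len(tables) == 1:
        # Covers the single inline table, stored under the key "".
        return next(iter(tables.values()))
    raise ValueError(f"Several tables present, choose one of: {sorted(tables)}")


# Stand-ins for the two documented shapes of the return value.
inline_only = {"": "inline-table"}
by_file = {"data/a.csv": "table-a", "data/b.csv": "table-b"}

first = select_table(inline_only)
second = select_table(by_file, "data/b.csv")
```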

read_csv(filepath, **kwargs)

Read the table from a CSV file using pandas’ read_csv(). See Pandas read_csv docs for details.

Existing table will be replaced.

Parameters
  • filepath – Any valid first argument to pandas read_csv

  • **kwargs – Keyword arguments passed to pandas read_csv

Returns

None

Return type

None

read_excel(filepath, **kwargs)

Read the table from an Excel file using pandas’ read_excel(). See Pandas read_excel docs for details.

Existing table will be replaced.

Parameters
  • filepath – Any valid first argument to pandas read_excel

  • **kwargs – Keyword arguments passed to pandas read_excel

Returns

None

Raises
  • ValueError – if more than one Excel sheet is returned

  • DataFormatError – if the input data or header is invalid

Return type

None

static read_table(filepath, inline, file_format)

Determine the input file type, then construct a new Table object by calling one of Table.read_csv() or Table.read_excel().

Parameters
  • filepath – Any valid first argument to pandas read_csv or read_excel, depending on the file format

  • inline (bool) – If True, read the whole table in; otherwise just get the column names and units from the header row.

  • file_format (str) – One of ‘infer’, ‘csv’, or ‘excel’. For ‘infer’, use the file extension (and only the extension) to determine if it’s a CSV or Excel file.

Returns

Constructed Table object

Raises

IOError – If the input cannot be read or parsed

Return type

Table
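The ‘infer’ option depends only on the file extension, which can be pictured with the sketch below. The extension-to-format mapping, the helper name resolve_format, and the choice of exceptions are assumptions for illustration; the extensions that the library actually recognizes may differ.

```python
from pathlib import Path

# Assumed extension map; the extensions the library recognizes may differ.
_EXTENSIONS = {".csv": "csv", ".xls": "excel", ".xlsx": "excel"}


def resolve_format(filepath, file_format="infer"):
    """Hypothetical helper: return 'csv' or 'excel' for a file path."""
    fmt = file_format.lower()
    if fmt in ("csv", "excel"):
        # An explicit format wins; the extension is not consulted.
        return fmt
    if fmt != "infer":
        raise ValueError(f"Unknown file_format: {file_format!r}")
    # Only the extension is inspected, never the file contents.
    suffix = Path(filepath).suffix.lower()
    try:
        return _EXTENSIONS[suffix]
    except KeyError:
        raise IOError(f"Cannot infer format from extension {suffix!r}") from None
```

For example, resolve_format("my_file.csv") selects the CSV reader, while a CSV file saved with an unrecognized extension needs file_format set to 'csv' explicitly.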

property units: List[str]

Shorthand for getting the list of units.

property units_dict: Dict[str, str]

Units as a dict keyed by table column name.

property units_list: List[str]

Units in order of table columns.