DMF Tables
Overview
Table handling for DMF.
The main class defined here is Table. It provides constructor methods
for reading from Excel and CSV files. There is a convention defined for
indicating units in column headers so that this code can split the unit from
the column name. Other methods are defined for adding tables to, and
extracting tables from, DMF idaes.core.dmf.resource.Resource objects.
In the simplest case, you would create a new DMF resource for a CSV table like this:
from idaes.core.dmf.resource import Resource
resource = Resource()
resource.add_table("my_file.csv")
# you can now save this resource in the DMF
Then you could retrieve and use that table like this:
# retrieve resource from the DMF
table = resource.tables["my_file.csv"]
dataframe = table.data # Pandas dataframe
units = table.units # Units extracted from header row (strings)
See also, on the DMF Resource class:
idaes.core.dmf.resource.Resource.add_table()
idaes.core.dmf.resource.Resource.tables
Table class
- class idaes.core.dmf.tables.Table
Represent a table stored in the DMF.
Tables are expected to have a header row with optional units, which, if present, are enclosed in [square brackets]. Whitespace is ignored between the column name and the units. For example:
T [C], P [bar], G0/RT H2O, G0/RT NaCl [-], A phi [(kg/mol^0.5]
0, 1, -23.4638, -13.836, 0.3767
- UNITS_REGEX = '\n (?P<name>[^[]+) # column name\n (?:\\s*\\[ # start of [units] section\n (?P<units>.*?) # column units\n \\])? # end of [units] section, which is optional\n '
Regular expression for extracting units from column names. In plain English, the following forms are expected for a column name: “Name”, “Name[Units]”, “Longer Name With $% Chars [ Units ]”. For both the Name and the Units, any sequence of characters valid in the current encoding is acceptable (except, of course, a “[” in the name, which means start-of-units).
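As a rough illustration of the header convention, the sketch below applies a verbose-mode copy of the documented pattern to a few column headers. The helper name `split_header` is hypothetical and not part of the DMF API, and stripping whitespace from the captured groups is an assumption about how a caller might clean them up, not a statement about what Table does internally.

```python
import re

# Verbose-mode copy of the pattern documented above.
UNITS_REGEX = r"""
    (?P<name>[^[]+)   # column name
    (?:\s*\[          # start of [units] section
    (?P<units>.*?)    # column units
    \])?              # end of [units] section, which is optional
"""
_units_re = re.compile(UNITS_REGEX, re.VERBOSE)


def split_header(header):
    """Return (name, units) for one column header; units is '' if absent.

    Hypothetical helper for illustration only.
    """
    match = _units_re.match(header)
    if match is None:
        # e.g. an empty header, or one that starts with "["
        raise ValueError(f"cannot parse column header: {header!r}")
    # The name group is greedy and keeps trailing spaces, so strip them here.
    name = match.group("name").strip()
    units = (match.group("units") or "").strip()
    return name, units


headers = ["T [C]", "P [bar]", "G0/RT H2O", "G0/RT NaCl [-]"]
parsed = [split_header(h) for h in headers]
```

Columns without a bracketed section, such as "G0/RT H2O", come back with an empty units string in this sketch.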
- add_to_resource(rsrc)
Add the current table, inline, to the given resource.
- Parameters
rsrc (Resource) – A DMF Resource instance
- Returns
None
- as_dict(values=True)
Get the representation of this table as a dict.
- Parameters
values – If True, include the values in the dict. Otherwise only include the units for each column.
- Returns
Dictionary with the structure accepted by from_dict(). If the “values” argument is False, that key will be missing from the dict for each column.
- property data: DataFrame
Pandas dataframe for data.
- classmethod from_dict(data)
Create a new Table object from a dictionary of data and units.
- Parameters
data (Dict) –
Dictionary with the following structure:
{
    'column-name-1': {
        'units': 'unit',
        'values': [ value, value, .. ]
    },
    'column-name-2': {
        'units': 'unit',
        'values': [ value, value, .. ]
    },
    ...etc...
}
- Returns
Table object
- Return type
Table
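To make the dictionary layout concrete, here is a small sketch that builds such a structure from plain Python lists. The function `build_table_dict` is a hypothetical helper, not part of the DMF API; it only mirrors the documented structure, including the documented behaviour that the 'values' key is omitted when values are not requested.

```python
def build_table_dict(columns, units, rows, values=True):
    """Build {'name': {'units': ..., 'values': [...]}} from row-wise data.

    Hypothetical helper for illustration only.
    """
    if len(columns) != len(units):
        raise ValueError("need exactly one units string per column")
    table = {}
    for i, (name, unit) in enumerate(zip(columns, units)):
        entry = {"units": unit}
        if values:
            # Collect the i-th item of every row into this column's values.
            entry["values"] = [row[i] for row in rows]
        table[name] = entry
    return table


columns = ["T", "P"]
units = ["C", "bar"]
rows = [[0, 1], [25, 1]]

full = build_table_dict(columns, units, rows)
units_only = build_table_dict(columns, units, rows, values=False)
```

A dictionary shaped like `full` matches the structure that from_dict() is documented to accept, while `units_only` matches what as_dict(values=False) is described as returning.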
- classmethod from_resource(rsrc)
Get an instance of this class from data in the given resource.
- Parameters
rsrc (Resource) – A DMF Resource instance
- Returns
Dictionary of tables in resource. If there is only one inline table, the dictionary has length one, with the empty string (“”) as its only key. If there are multiple tables referenced by file, the dictionary keys are the (relative) file names.
- Raises
KeyError – if there are no tables in this resource
- read_csv(filepath, **kwargs)
Read the table from a CSV file using pandas’ read_csv(). See Pandas read_csv docs for details.
Existing table will be replaced.
- Parameters
filepath – Any valid first argument to pandas read_csv
**kwargs – Keyword arguments passed to pandas read_csv
- Returns
None
- Return type
None
- read_excel(filepath, **kwargs)
Read the table from an Excel file using pandas’ read_excel(). See Pandas read_excel docs for details.
Existing table will be replaced.
- Parameters
filepath – Any valid first argument to pandas read_excel
**kwargs – Keyword arguments passed to pandas read_excel
- Returns
None
- Raises
ValueError – if more than one Excel sheet is returned
DataFormatError – if the input data or header is invalid
- Return type
None
- static read_table(filepath, inline, file_format)
Determine the input file type, then construct a new Table object by calling one of Table.read_csv() or Table.read_excel().
- Parameters
filepath – Any valid first argument to pandas read_csv
inline (bool) – If True, read the whole table in; otherwise just get the column names and units from the header row.
file_format (str) – One of ‘infer’, ‘csv’, or ‘excel’. For ‘infer’, use the file extension (and only the extension) to determine if it’s a CSV or Excel file.
- Returns
Constructed Table object
- Raises
IOError – If the input cannot be read or parsed
- Return type
Table
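The 'infer' behaviour described for file_format can be sketched as a lookup on the file extension alone. The function name `resolve_file_format` and the extension mapping below are assumptions made for illustration; the extensions that read_table() actually recognizes, and the exception it raises for an unknown one, may differ.

```python
from pathlib import Path

# Assumed extension-to-format mapping; not taken from the DMF source.
_EXTENSION_FORMATS = {".csv": "csv", ".xls": "excel", ".xlsx": "excel"}


def resolve_file_format(filepath, file_format="infer"):
    """Return 'csv' or 'excel' for the given path.

    Hypothetical helper for illustration only.
    """
    if file_format in ("csv", "excel"):
        # An explicit format wins; the extension is not consulted.
        return file_format
    if file_format != "infer":
        raise ValueError(f"unknown file_format: {file_format!r}")
    # Only the extension is inspected, never the file contents.
    suffix = Path(filepath).suffix.lower()
    try:
        return _EXTENSION_FORMATS[suffix]
    except KeyError:
        raise IOError(f"cannot infer table format from extension {suffix!r}") from None


csv_format = resolve_file_format("data/props.CSV")
excel_format = resolve_file_format("sheet.xlsx")
forced_format = resolve_file_format("notes.txt", file_format="csv")
```

Because only the extension is examined, a mislabeled file would be routed to the wrong reader, which is why an explicit 'csv' or 'excel' value is available.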