vivarium.artifact.hdf

HDF Interface

A convenience wrapper around the tables and pandas HDF interfaces.

Public Interface

The public interface consists of five functions:

HDF Public Interface

Function      Description
touch()       Creates an HDF file, wiping an existing file if necessary.
write()       Stores data at a key in an HDF file.
load()        Loads (potentially filtered) data from a key in an HDF file.
remove()      Clears data from a key in an HDF file.
get_keys()    Gets all available HDF keys from an HDF file.

Contracts

  • All functions in the public interface accept both pathlib.Path and normal Python str objects for paths.

  • All functions in the public interface accept only str objects as representations of the keys in the HDF file. Each key must be formatted as "type.name.measure" or "type.measure".
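The two contracts above can be illustrated with a small sketch. The helper looks_like_entity_key is hypothetical and not part of this module; it only checks the dotted shape described above, and the library's own validation may be stricter. The file name and the example keys are also assumptions made for illustration.

```python
from pathlib import Path


def looks_like_entity_key(key):
    """Hypothetical check: True for 'type.measure' or 'type.name.measure'."""
    if not isinstance(key, str):
        return False
    parts = key.split(".")
    # Two or three dot-separated parts, none of them empty.
    return len(parts) in (2, 3) and all(parts)


# A path may be given as a str or as a pathlib.Path; both name the same file.
path_as_str = "artifact.hdf"
path_as_path = Path("artifact.hdf")

print(looks_like_entity_key("population.structure"))           # True
print(looks_like_entity_key("cause.all_causes.restrictions"))  # True
print(looks_like_entity_key("population"))                     # False
print(Path(path_as_str) == path_as_path)                       # True
```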

vivarium.artifact.hdf.touch(path)

Creates an HDF file, wiping an existing file if necessary.

If the given path is a valid location for an HDF file, a new HDF file is created there.

Return type:

None

Parameters:

path (Path | str) – The path to the HDF file.

Raises:

ValueError – If the given path is not valid for creating an HDF file.

vivarium.artifact.hdf.write(path, entity_key, data)

Writes data to the given key in the HDF file at the given path.

Return type:

None

Parameters:
  • path (Path | str) – The path to the HDF file to write to.

  • entity_key (str) – A string representation of the internal HDF path where we want to write the data. The key must be formatted as "type.name.measure" or "type.measure".

  • data (Any) – The data to write. If it is a pandas object, it will be written using a pandas.HDFStore or pandas.DataFrame.to_hdf(). If it is some other kind of python object, it will first be encoded as json with json.dumps() and then written to the provided key.

Raises:

ValueError – If the path or entity_key is improperly formatted.
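Because non-pandas data is encoded with json.dumps() before it is stored, it helps to know what JSON encoding does to ordinary Python objects. The sketch below uses only the standard library and does not call this module; the metadata dictionary is made up for illustration, and decoding with json.loads() is an assumption about how the stored text is read back.

```python
import json

# Made-up example data: a dict holding a tuple and an int.
metadata = {"locations": ("Kenya", "India"), "draws": 1000}

encoded = json.dumps(metadata)  # the encoding step named in the description
decoded = json.loads(encoded)   # assumed decoding step on the way back

# JSON has no tuple type, so the tuple comes back as a list.
print(decoded["locations"])

# Objects without a JSON representation, such as sets, are rejected.
try:
    json.dumps({"sexes": {"Male", "Female"}})
except TypeError as error:
    print("not encodable:", error)
```

In practice this means tuples, and dictionary keys that are not strings, will not round-trip unchanged, so it is safest to store lists and string-keyed dictionaries.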

vivarium.artifact.hdf.load(path, entity_key, filter_terms, column_filters)

Loads data from an HDF file.

Return type:

Any

Parameters:
  • path (Path | str) – The path to the HDF file to load the data from.

  • entity_key (str) – A representation of the internal HDF path where the data is located.

  • filter_terms (list[str] | None) – An optional list of terms used to filter the rows in the data. The terms must be formatted in a way that is suitable for use with the where argument of pandas.read_hdf(). Only filters applying to existing columns in the data are used.

  • column_filters (list[str] | None) – An optional list of columns to load from the data.

Returns:

The data stored at the given key in the HDF file.

Raises:

ValueError – If the path or entity_key is improperly formatted.
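The statement that only filters applying to existing columns are used can be pictured with the sketch below. The helper keep_applicable_terms is hypothetical and is not the library's implementation: it assumes each term begins with the column name it constrains, which holds for simple where-style strings such as "year_start >= 2010". The column names and terms are made up for illustration.

```python
import re


def keep_applicable_terms(filter_terms, columns):
    """Hypothetical helper: keep terms whose leading name is an existing column."""
    if not filter_terms:
        return []
    kept = []
    for term in filter_terms:
        # Take the identifier at the start of the term,
        # e.g. "year_start" in "year_start >= 2010".
        match = re.match(r"\s*([A-Za-z_]\w*)", term)
        if match and match.group(1) in columns:
            kept.append(term)
    return kept


terms = ["year_start >= 2010", 'sex == "Female"', "draw < 10"]
columns = ["year_start", "year_end", "sex", "value"]

# "draw < 10" is dropped because the data has no "draw" column.
print(keep_applicable_terms(terms, columns))
```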

vivarium.artifact.hdf.remove(path, entity_key)

Removes a piece of data from an HDF file.

Return type:

None

Parameters:
  • path (Path | str) – The path to the HDF file to remove the data from.

  • entity_key (str) – A representation of the internal HDF path where the data is located.

Raises:

ValueError – If the path or entity_key is improperly formatted.

vivarium.artifact.hdf.get_keys(path)

Gets the key representations of all internal paths in an HDF file.

Return type:

list[str]

Parameters:

path (Path | str) – The path to the HDF file.

Returns:

A list of key representations of the internal paths in the HDF file.
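Taken together, the five functions might be combined roughly as follows. This is a sketch rather than a tested recipe: the import path is taken from the module name at the top of this page, while the file name "artifact.hdf", the key "metadata.locations", and the function name roundtrip_demo are assumptions chosen for illustration. The argument order follows the signatures listed above, with None passed for both filters.

```python
from pathlib import Path


def roundtrip_demo(directory):
    """Create a file, store a value, read it back, list keys, then remove it."""
    # Imported inside the function so the sketch can be loaded
    # even where vivarium is not installed.
    from vivarium.artifact import hdf

    path = Path(directory) / "artifact.hdf"   # assumed file name
    key = "metadata.locations"                # assumed key in "type.measure" form

    hdf.touch(path)                           # create the file, wiping any old one
    hdf.write(path, key, ["Kenya", "India"])  # non-pandas data is JSON-encoded
    loaded = hdf.load(path, key, None, None)  # no row filters, no column filters
    keys = hdf.get_keys(path)                 # key strings for the stored data
    hdf.remove(path, key)                     # clear the data again
    return loaded, keys
```

Calling roundtrip_demo(tempfile.mkdtemp()) in an environment with vivarium installed would exercise every function in the public interface once.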