API

Top-level methods and classes:

configure([path])

Configure reporting globally.

Computer(**kwargs)

Class for describing and executing computations.

Key(name[, dims, tag])

A hashable key for a quantity that includes its dimensionality.

Quantity

Convert arguments to the internal Quantity data format.

Others:

genno.configure(path=None, **config)

Configure reporting globally.

Modifies global variables that affect the behaviour of all Computers and computations, namely RENAME_DIMS and REPLACE_UNITS.

Valid configuration keys—passed as config keyword arguments—include:

Other Parameters
  • units (mapping) – Configuration for handling of units. Valid sub-keys include:

  • rename_dims (mapping of str -> str) – Update RENAME_DIMS.

Warns

UserWarning – If config contains unrecognized keys.

class genno.Computer(**kwargs)

Class for describing and executing computations.

Parameters

kwargs – Passed to configure().

A Computer is used to postprocess data from one or more ixmp.Scenario objects. The get() method can be used to:

  • Retrieve individual quantities. A quantity has zero or more dimensions and optional units. Quantities include the ‘parameters’, ‘variables’, ‘equations’, and ‘scalars’ available in an ixmp.Scenario.

  • Generate an entire report composed of multiple quantities. A report may:

    • Read in non-model or exogenous data,

    • Trigger output to file(s) or a database, or

    • Execute user-defined methods.

Every report and quantity (including the results of intermediate steps) is identified by a Key; all the keys in a Computer can be listed with keys().

Computer uses a graph data structure to keep track of computations, the atomic steps in postprocessing: for example, a single calculation that multiplies two quantities to create a third. The graph allows get() to perform only the requested computations. Advanced users may manipulate the graph directly; but common reporting tasks can be handled by using Computer methods:

add(data, *args, **kwargs)

General-purpose method to add computations.

add_file(path[, key])

Add exogenous quantities from path.

add_product(key, *quantities[, sums])

Add a computation that takes the product of quantities.

add_queue(queue[, max_tries, fail])

Add tasks from a list or queue.

add_single(key, *computation[, strict, index])

Add a single computation at key.

aggregate(qty, tag, dims_or_groups[, …])

Add a computation that aggregates qty.

apply(generator, *keys, **kwargs)

Add computations by applying generator to keys.

check_keys(*keys)

Check that keys are in the Computer.

configure([path])

Configure the Computer.

describe([key, quiet])

Return a string describing the computations that produce key.

disaggregate(qty, new_dim[, method, args])

Add a computation that disaggregates qty using method.

full_key(name_or_key)

Return the full-dimensionality key for name_or_key.

get([key])

Execute and return the result of the computation key.

keys()

Return the keys of graph.

visualize(filename, **kwargs)

Generate an image describing the reporting structure.

write(key, path)

Write the report key to the file path.

graph: Dict[str, Union[str, dict]] = {'config': {}}

A dask-format graph.
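
As a rough illustration of what a dask-format graph is, the sketch below evaluates a plain dict using only the standard library. The function names evaluate and _resolve and the keys x, y, z, and unused are hypothetical names chosen for this example. This is not how genno or dask implement get(); it only shows the same idea in miniature: keys map to literals or to task tuples, and only the requested key and its dependencies are computed.

```python
from operator import mul


def evaluate(graph, key):
    """Resolve `key` in a dask-style graph, computing only what it needs."""
    return _resolve(graph, graph[key])


def _resolve(graph, computation):
    # A task: a tuple whose first element is callable, followed by arguments
    if isinstance(computation, tuple) and computation and callable(computation[0]):
        func, *args = computation
        return func(*(_resolve(graph, arg) for arg in args))
    # A list of computations is resolved element by element
    if isinstance(computation, list):
        return [_resolve(graph, item) for item in computation]
    # A string that matches a key is a reference to that key's result
    if isinstance(computation, str) and computation in graph:
        return _resolve(graph, graph[computation])
    # Anything else is a literal value
    return computation


graph = {
    "x": 3,
    "y": 4,
    "z": (mul, "x", "y"),
    # Never evaluated below, so the division by zero is never triggered
    "unused": (lambda: 1 / 0,),
}
result = evaluate(graph, "z")
```

Because a string argument that matches a key is treated as a reference, a literal string with the same spelling as a key cannot be passed through unchanged in this sketch.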

add(data, *args, **kwargs)

General-purpose method to add computations.

add() can be called in several ways; its behaviour depends on data; see below. It chains to methods such as add_single(), add_queue(), and apply(), which can also be called directly.

Parameters
  • data (various) –

  • args (various) –

Other Parameters

sums (bool, optional) – If True, all partial sums of the key data are also added to the Computer.

Returns

Some or all of the keys added to the Computer.

Return type

list of Key-like

Raises

KeyError – If a target key is already in the Computer; any key referred to by a computation does not exist; or sums=True and the key for one of the partial sums of key is already in the Computer.

add() may be called with:

  • list : data is a list of computations like [(list(args1), dict(kwargs1)), (list(args2), dict(kwargs2)), ...] that are added one-by-one.

  • the name of a function in computations (e.g. ‘select’): A computation is added with key args[0], applying the named function to args[1:] and kwargs.

  • str, the name of a Computer method (e.g. ‘apply’): the corresponding method (e.g. apply()) is called with the args and kwargs.

  • Any other str or Key: the arguments are passed to add_single().

add() may also be used to:

  • Provide an alias from one key to another:

    >>> from genno import Computer
    >>> rep = Computer()  # Create a new Computer object
    >>> rep.add('aliased name', 'original name')
    
  • Define an arbitrarily complex computation in a Python function that operates directly on the ixmp.Scenario:

    >>> def my_report(scenario):
    ...     # many lines of code
    ...     return 'foo'
    >>> rep.add('my report', (my_report, 'scenario'))
    >>> rep.finalize(scenario)
    >>> rep.get('my report')
    'foo'
    

Note

Use care when adding literal str() values as a computation argument for add(); these may conflict with keys that identify the results of other computations.

apply(generator, *keys, **kwargs)

Add computations by applying generator to keys.

Parameters
  • generator (callable) – Function to apply to keys.

  • keys (hashable) – The starting key(s).

  • kwargs – Keyword arguments to generator.

The generator may have a type annotation for Computer on its first positional argument. In this case, a reference to the Computer is supplied, and generator may use the Computer methods to add computations:

from genno import Computer

def gen0(r: Computer, **kwargs):
    # The annotation on the first argument makes apply() pass the Computer itself
    r.add_file('file0.txt', **kwargs)
    r.add_file('file1.txt', **kwargs)

# Use the generator to add several computations
rep.apply(gen0, units='kg')

Or, generator may yield a sequence (0 or more) of (key, computation), which are added to the graph:

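from functools import partial
from genno import computations

def gen1(**kwargs):
    op = partial(computations.load_file, **kwargs)
    # Each item is (key, computation); the computation is a task tuple
    yield from ((f'file:{i}', (op, f'file{i}.txt')) for i in range(2))

rep.apply(gen1, units='kg')
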
add_file(path, key=None, **kwargs)

Add exogenous quantities from path.

Reporting the key or using it in other computations causes path to be loaded and converted to Quantity.

Parameters
  • path (os.PathLike) – Path to the file, e.g. ‘/path/to/foo.ext’.

  • key (str or Key, optional) – Key for the quantity read from the file.

Other Parameters
  • dims (dict or list or set) – Either a collection of names for dimensions of the quantity, or a mapping from names appearing in the input to dimensions.

  • units (str or pint.Unit) – Units to apply to the loaded Quantity.

Returns

Either key (if given) or e.g. file:foo.ext based on the path name, without directory components.

Return type

Key

add_product(key, *quantities, sums=True)

Add a computation that takes the product of quantities.

Parameters
  • key (str or Key) – Key of the new quantity. If a Key, any dimensions are ignored; the dimensions of the product are the union of the dimensions of quantities.

  • sums (bool, optional) – If True, all partial sums of the new quantity are also added.

Returns

The full key of the new quantity.

Return type

Key

add_queue(queue, max_tries=1, fail='raise')

Add tasks from a list or queue.

Parameters
  • queue (list of 2-tuple) – The members of each tuple are the arguments (i.e. a list or tuple) and keyword arguments (i.e. a dict) to add().

  • max_tries (int, optional) – Retry adding elements up to this many times.

  • fail ('raise' or log level, optional) – Action to take when a computation from queue cannot be added after max_tries.
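
The retry behaviour controlled by max_tries can be pictured with the simplified sketch below. It assumes a plain dict as the graph and queue items of the form (key, task) instead of the (args, kwargs) tuples that add_queue() accepts; the name add_all is hypothetical, and failures are returned as a list, not raised or logged. An item whose inputs are not yet in the graph is moved to the back of the queue and tried again.

```python
from collections import deque
from operator import add


def add_all(graph, queue, max_tries=1):
    """Add (key, task) items to `graph`, retrying items whose inputs are missing."""
    pending = deque((item, 1) for item in queue)
    failed = []
    while pending:
        (key, task), tries = pending.popleft()
        # Inputs are the string arguments that follow the callable
        missing = [a for a in task[1:] if isinstance(a, str) and a not in graph]
        if not missing:
            graph[key] = task
        elif tries < max_tries:
            # Move to the back of the queue and try again later
            pending.append(((key, task), tries + 1))
        else:
            failed.append(key)
    return failed


# "d" depends on "c", which appears later in the queue
queue = [("d", (add, "c", "a")), ("c", (add, "a", "b"))]

failed_once = add_all({"a": 1, "b": 2}, queue, max_tries=1)
failed_twice = add_all({"a": 1, "b": 2}, queue, max_tries=2)
```

With a single try, "d" cannot be added because "c" is not yet present; with two tries, "d" succeeds on its second attempt.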

add_single(key, *computation, strict=False, index=False)

Add a single computation at key.

Parameters
  • key (str or Key or hashable) – A string, Key, or other value identifying the output of computation.

  • computation (object) –

    Any dask computation, i.e. one of:

    1. any existing key in the Computer.

    2. any other literal value or constant.

    3. a task, i.e. a tuple with a callable followed by one or more computations.

    4. A list containing one or more of #1, #2, and/or #3.

  • strict (bool, optional) – If True, key must not already exist in the Computer, and any keys referred to by computation must exist.

  • index (bool, optional) – If True, key is added to the index as a full-resolution key, so it can be later retrieved with full_key().

aggregate(qty, tag, dims_or_groups, weights=None, keep=True, sums=False)

Add a computation that aggregates qty.

Parameters
  • qty (Key or str) – Key of the quantity to be aggregated.

  • tag (str) – Additional string to add to the end of the key for the aggregated quantity.

  • dims_or_groups (str or iterable of str or dict) – Name(s) of the dimension(s) to sum over, or nested dict.

  • weights (xarray.DataArray, optional) – Weights for weighted aggregation.

  • keep (bool, optional) – Passed to computations.aggregate.

  • sums (bool, optional) – Passed to add().

Returns

The key of the newly-added node.

Return type

Key

check_keys(*keys)

Check that keys are in the Computer.

If any of keys is not in the Computer, KeyError is raised. Otherwise, a list is returned with either the key from keys, or the corresponding full_key().

configure(path=None, **config)

Configure the Computer.

Accepts a path to a configuration file and/or keyword arguments. Configuration keys loaded from file are replaced by keyword arguments.

Valid configuration keys include:

  • default: the default reporting key; sets default_key.

  • filters: a dict, passed to set_filters().

  • files: a list where every element is a dict of keyword arguments to add_file().

  • alias: a dict mapping aliases to original keys.

Warns

UserWarning – If config contains unrecognized keys.
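
The precedence rule (keyword arguments replace values loaded from file) and the warning for unrecognized keys can be sketched as follows. The function merge_config and the constant KNOWN_KEYS are hypothetical; the sketch assumes a shallow, top-level merge and treats only the four keys listed above as recognized, which may not match the actual behaviour of configure().

```python
import warnings

# Hypothetical set of recognized keys, taken from the list above
KNOWN_KEYS = {"default", "filters", "files", "alias"}


def merge_config(file_config, **config):
    """Combine file contents with keyword arguments; keywords take priority."""
    merged = {**file_config, **config}
    unknown = sorted(set(merged) - KNOWN_KEYS)
    if unknown:
        warnings.warn(f"Unrecognized configuration keys: {unknown}", UserWarning)
    return merged


# Stand-in for the contents of a configuration file
from_file = {"default": "foo:a-b", "alias": {"bar": "foo:a-b"}}
merged = merge_config(from_file, default="foo:a")
```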

default_key = None

The default reporting key.

describe(key=None, quiet=True)

Return a string describing the computations that produce key.

If key is not provided, all keys in the Computer are described.

If quiet is False, the string is also printed to the console.

disaggregate(qty, new_dim, method='shares', args=[])

Add a computation that disaggregates qty using method.

Parameters
  • qty (hashable) – Key of the quantity to be disaggregated.

  • new_dim (str) – Name of the new dimension of the disaggregated variable.

  • method (callable or str) – Disaggregation method. If a callable, then it is applied to qty with any extra args. If a str, then a method named ‘disaggregate_{method}’ is used.

  • args (list, optional) – Additional arguments to the method. The first element should be the key for a quantity giving shares for disaggregation.

Returns

The key of the newly-added node.

Return type

Key

full_key(name_or_key)

Return the full-dimensionality key for name_or_key.

A quantity ‘foo’ with dimensions (a, c, n, q, x) is available in the Computer as 'foo:a-c-n-q-x'. This Key can be retrieved with:

rep.full_key("foo")
rep.full_key("foo:c")
# etc.

get(key=None)

Execute and return the result of the computation key.

Only key and its dependencies are computed.

Parameters

key (str, optional) – If not provided, default_key is used.

Raises

ValueError – If key and default_key are both None.

keys()

Return the keys of graph.

property unit_registry

The pint.UnitRegistry() used by the Computer.

visualize(filename, **kwargs)

Generate an image describing the reporting structure.

This is a shorthand for dask.visualize(). Requires graphviz.

write(key, path)

Write the report key to the file path.

class genno.Key(name, dims=[], tag=None)

A hashable key for a quantity that includes its dimensionality.

Quantities in a Scenario can be indexed by one or more dimensions. A Key refers to a quantity using three components:

  1. a string name,

  2. zero or more ordered dims, and

  3. an optional tag.

For example, an ixmp parameter with three dimensions can be initialized with:

>>> scenario.init_par('foo', ['a', 'b', 'c'], ['apple', 'bird', 'car'])

Key allows a specific, explicit reference to various forms of “foo”:

  • in its full resolution, i.e. indexed by a, b, and c:

    >>> k1 = Key('foo', ['a', 'b', 'c'])
    >>> k1 == 'foo:a-b-c'
    True
    

    Notice that a Key has the same hash, and compares equal (==) to its str().

  • in a partial sum over one dimension, e.g. summed along c with dimensions a and b:

    >>> k2 = k1.drop('c')
    >>> k2 == 'foo:a-b'
    True
    
  • in a partial sum over multiple dimensions, etc.:

    >>> k1.drop('a', 'c') == k2.drop('a') == 'foo:b'
    True
    

Note

Some remarks:

  • repr(key) prints the Key in angle brackets (‘<>’) to signify it is a Key object.

    >>> repr(k1)
    <foo:a-b-c>
    
  • Keys are immutable: the properties name, dims, and tag are read-only, and the methods append(), drop(), and add_tag() return new Key objects.

  • Keys may be generated concisely by defining a convenience method:

    >>> def foo(dims):
    ...     return Key('foo', dims.split())
    >>> foo('a b c')
    <foo:a-b-c>
    
add_tag(tag)

Return a new Key with tag appended.

append(*dims)

Return a new Key with additional dimensions dims.

property dims

Dimensions of the quantity, tuple of str.

drop(*dims)

Return a new Key with dims dropped.

classmethod from_str_or_key(value, drop=[], append=[], tag=None)

Return a new Key from value.

Parameters
  • value (str or Key) – Value to use to generate a new Key.

  • drop (list of str or True, optional) – Existing dimensions of value to drop. See drop().

  • append (list of str, optional) – New dimensions to append to the returned Key. See append().

  • tag (str, optional) – Tag for returned Key. If value has a tag, the two are joined using a ‘+’ character. See add_tag().

Returns

Return type

Key

iter_sums()

Generate (key, task) for all possible partial sums of the Key.
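
To see what “all possible partial sums” covers, the sketch below enumerates them for a three-dimensional quantity using plain strings. The function partial_sum_keys is hypothetical; it yields the string form of each reduced key together with the dimensions that are summed over, not the (key, task) pairs that iter_sums() generates. The string form 'foo:' for the zero-dimensional case is an assumption of this sketch.

```python
from itertools import combinations


def partial_sum_keys(name, dims):
    """Yield (key string, dropped dims) for every partial sum of name:dims."""
    # Start with sums over one dimension, then two, and so on
    for n_keep in range(len(dims) - 1, -1, -1):
        for kept in combinations(dims, n_keep):
            dropped = tuple(d for d in dims if d not in kept)
            yield f"{name}:{'-'.join(kept)}", dropped


sums = list(partial_sum_keys("foo", ("a", "b", "c")))
```

A quantity with n dimensions has 2**n - 1 partial sums, so here the list has 7 entries, from ('foo:a-b', ('c',)) to the total over all three dimensions.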

property name

Name of the quantity, str.

classmethod product(new_name, *keys, tag=None)

Return a new Key that has the union of dimensions on keys.

Dimensions are ordered by their first appearance:

  1. First, the dimensions of the first of the keys.

  2. Next, any additional dimensions in the second of the keys that were not already added in step 1.

  3. etc.

Parameters

new_name (str) – Name for the new Key. The names of keys are discarded.
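
The ordering rule can be sketched with the standard library alone. The helper union_dims is hypothetical and works on tuples of dimension names, not on Key objects; it relies on dict preserving insertion order.

```python
def union_dims(*dim_lists):
    """Return the union of dimensions, ordered by first appearance."""
    seen = {}
    for dims in dim_lists:
        for dim in dims:
            # Only the first appearance fixes a dimension's position
            seen.setdefault(dim, None)
    return tuple(seen)


dims = union_dims(("a", "b"), ("b", "c"), ("d", "a"))
```

Here dims is ('a', 'b', 'c', 'd'): 'b' and 'a' keep the positions from their first appearance.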

property tag

Quantity tag, str.

genno.Quantity(data, *args, **kwargs)

Convert arguments to the internal Quantity data format.

Parameters
Other Parameters
  • name (str, optional) – Quantity name.

  • units (str, optional) – Quantity units.

  • attrs (dict, optional) – Dictionary of attributes; similar to xarray.DataArray.attrs.

The Quantity constructor converts its arguments to an internal, xarray.DataArray-like data format:

# Existing data
data = pd.Series(...)

# Convert to a Quantity for use in reporting calculations
qty = Quantity(data, name="Quantity name", units="kg")
rep.add("new_qty", qty)

Common genno usage, e.g. in message_ix, creates large, sparse data frames (billions of possible elements, but <1% populated); DataArray’s default, ‘dense’ storage format would be too large for available memory.

  • Currently, Quantity is AttrSeries, a wrapped pandas.Series that behaves like a DataArray.

  • In the future, genno will use SparseDataArray, and eventually DataArray backed by sparse data, directly.

The goal is that reporting code, including built-in and user computations, can treat quantity arguments as if they were DataArray.

Computations

Elementary computations for reporting.

Unless otherwise specified, these methods accept and return Quantity objects for data arguments/return values.

Calculations:

add(*quantities[, fill_value])

Sum across multiple quantities.

aggregate(quantity, groups, keep)

Aggregate quantity by groups.

apply_units(qty, units[, quiet])

Simply apply units to qty.

disaggregate_shares(quantity, shares)

Disaggregate quantity by shares.

product(*quantities)

Return the product of any number of quantities.

ratio(numerator, denominator)

Return the ratio numerator / denominator.

select(qty, indexers[, inverse])

Select from qty based on indexers.

sum(quantity[, weights, dimensions])

Sum quantity over dimensions, with optional weights.

Input and output:

load_file(path[, dims, units])

Read the file at path and return its contents as a Quantity.

write_report(quantity, path)

Write a quantity to a file.

Data manipulation:

concat(*objs, **kwargs)

Concatenate Quantity objs.

genno.computations.aggregate(quantity, groups, keep)

Aggregate quantity by groups.

Parameters
  • quantity (Quantity) –

  • groups (dict of dict) – Top-level keys are the names of dimensions in quantity. Second-level keys are group names; second-level values are lists of labels along the dimension to sum into a group.

  • keep (bool) – If True, the members that are aggregated into a group are returned with the group sums. If False, they are discarded.

Returns

Same dimensionality as quantity.

Return type

Quantity
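
The structure of groups and the effect of keep are easier to see on a small example. The sketch below stands in for a Quantity with a plain dict that maps label tuples to values; aggregate_sketch and the labels used are hypothetical, and the real computation operates on Quantity objects.

```python
from collections import defaultdict


def aggregate_sketch(data, dims, groups, keep=True):
    """Sum labels into groups; `data` maps label tuples to values."""
    for dim, dim_groups in groups.items():
        axis = dims.index(dim)
        sums = defaultdict(float)
        for group, members in dim_groups.items():
            for labels, value in data.items():
                if labels[axis] in members:
                    # Replace the member label with the group name
                    new_labels = labels[:axis] + (group,) + labels[axis + 1:]
                    sums[new_labels] += value
        # keep=True: group sums are returned alongside the original entries
        data = {**data, **sums} if keep else dict(sums)
    return data


data = {
    ("x", "2020"): 1.0,
    ("y", "2020"): 2.0,
    ("x", "2030"): 3.0,
    ("y", "2030"): 4.0,
}
groups = {"n": {"xy": ["x", "y"]}}  # dimension -> group name -> member labels

kept = aggregate_sketch(data, ("n", "t"), groups, keep=True)
only_groups = aggregate_sketch(data, ("n", "t"), groups, keep=False)
```

In both cases the result still has the dimensions (n, t); only the labels along n change.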

genno.computations.apply_units(qty, units, quiet=False)

Simply apply units to qty.

Logs on level WARNING if qty already has existing units.

Parameters
  • qty (Quantity) –

  • units (str or pint.Unit) – Units to apply to qty

  • quiet (bool, optional) – If True log on level DEBUG.

genno.computations.concat(*objs, **kwargs)

Concatenate Quantity objs.

Any strings included amongst objs are discarded, with a logged warning; these usually indicate that a quantity is referenced which is not in the Computer.

genno.computations.disaggregate_shares(quantity, shares)

Disaggregate quantity by shares.

genno.computations.load_file(path, dims={}, units=None)

Read the file at path and return its contents as a Quantity.

Some file formats are automatically converted into objects for direct use in reporting code:

.csv:

Converted to Quantity. CSV files must have a ‘value’ column; all others are treated as indices, except as given by dims. Lines beginning with ‘#’ are ignored.

Parameters
  • path (pathlib.Path) – Path to the file to read.

  • dims (collections.abc.Collection or collections.abc.Mapping, optional) – If a collection of names, other columns besides these and ‘value’ are discarded. If a mapping, the keys are the column labels in path, and the values are the target dimension names.

  • units (str or pint.Unit) – Units to apply to the loaded Quantity.
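
The CSV conventions described above (a ‘value’ column, comment lines starting with ‘#’, and dims as either a collection or a mapping) can be illustrated with the standard csv module. The function read_csv_sketch is hypothetical and returns a plain dict, not a Quantity; it also assumes that columns not named in dims are discarded when dims is a mapping, which the text above states only for a collection.

```python
import csv


def read_csv_sketch(text, dims=None):
    """Parse CSV text into (dimension names, {label tuple: value})."""
    # Ignore comment lines, then let the csv module handle the rest
    lines = [line for line in text.splitlines() if not line.startswith("#")]
    reader = csv.DictReader(lines)
    columns = [c for c in reader.fieldnames if c != "value"]
    if dims:
        # Assumption: columns not named in `dims` are discarded in both cases
        columns = [c for c in columns if c in dims]
    # A mapping renames columns; a plain collection keeps the names as-is
    rename = dims if isinstance(dims, dict) else {}
    names = tuple(rename.get(c, c) for c in columns)
    data = {tuple(row[c] for c in columns): float(row["value"]) for row in reader}
    return names, data


text = "\n".join(
    [
        "# Example data; this line is ignored",
        "n,year,note,value",
        "x,2020,first,1.5",
        "y,2020,second,2.5",
    ]
)
names, data = read_csv_sketch(text, dims={"n": "node", "year": "t"})
```

The ‘note’ column is dropped because it is not named in dims, and the remaining columns are renamed to ‘node’ and ‘t’.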

genno.computations.product(*quantities)

Return the product of any number of quantities.

genno.computations.ratio(numerator, denominator)

Return the ratio numerator / denominator.

Parameters
  • numerator (Quantity) –

  • denominator (Quantity) –

genno.computations.select(qty, indexers, inverse=False)

Select from qty based on indexers.

Parameters
  • qty (Quantity) –

  • indexers (dict (str -> list of str)) – Elements to be selected from qty. Mapping from dimension names to labels along each dimension.

  • inverse (bool, optional) – If True, remove the items in indexers instead of keeping them.
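
A minimal sketch of the indexers mapping and the inverse flag, again using a plain dict of label tuples in place of a Quantity. The name select_sketch is hypothetical. With inverse=True and more than one dimension in indexers, this sketch takes the complement on each dimension separately; that choice is an assumption.

```python
def select_sketch(data, dims, indexers, inverse=False):
    """Keep entries whose labels match `indexers`, or drop them if inverse."""

    def keep(labels):
        # Every indexed dimension must pass; inverse flips the per-dimension test
        return all(
            (labels[dims.index(dim)] in wanted) != inverse
            for dim, wanted in indexers.items()
        )

    return {labels: value for labels, value in data.items() if keep(labels)}


data = {("x", "2020"): 1, ("y", "2020"): 2, ("z", "2020"): 3}
indexers = {"n": ["x", "z"]}

selected = select_sketch(data, ("n", "t"), indexers)
removed = select_sketch(data, ("n", "t"), indexers, inverse=True)
```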

genno.computations.sum(quantity, weights=None, dimensions=None)

Sum quantity over dimensions, with optional weights.

Parameters
  • quantity (Quantity) –

  • weights (Quantity, optional) – If dimensions is given, weights must have at least these dimensions. Otherwise, any dimensions are valid.

  • dimensions (list of str, optional) – If not provided, sum over all dimensions. If provided, sum over these dimensions.

genno.computations.write_report(quantity, path)

Write a quantity to a file.

Parameters

path (str or Path) – Path to the file to be written.

Internal format for quantities

genno.core.quantity.assert_quantity(*args)

Assert that each of args is a Quantity object.

Raises

TypeError – with an indicative message.

class genno.core.attrseries.AttrSeries(data=None, *args, name=None, attrs=None, **kwargs)

pandas.Series subclass imitating xarray.DataArray.

The AttrSeries class provides similar methods and behaviour to xarray.DataArray, so that genno.computations methods can use xarray-like syntax.

Parameters
  • units (str or pint.Unit, optional) – Set the units attribute. The value is converted to pint.Unit and added to attrs.

  • attrs (Mapping, optional) – Set the attrs of the AttrSeries. This attribute was added in pandas 1.0, but is not currently supported by the Series constructor.

align_levels(other)

Work around https://github.com/pandas-dev/pandas/issues/25760.

Return a copy of this object with common levels in the same order as other.

assign_coords(**kwargs)

Like xarray.DataArray.assign_coords().

property coords

Like xarray.DataArray.coords. Read-only.

property dims

Like xarray.DataArray.dims.

drop(label)

Like xarray.DataArray.drop().

classmethod from_series(series, sparse=None)

Like xarray.DataArray.from_series().

item(*args)

Like xarray.DataArray.item().

rename(new_name_or_name_dict)

Like xarray.DataArray.rename().

sel(indexers=None, drop=False, **indexers_kwargs)

Like xarray.DataArray.sel().

squeeze(dim=None, *args, **kwargs)

Like xarray.DataArray.squeeze().

sum(*args, **kwargs)

Like xarray.DataArray.sum().

to_dataframe()

Like xarray.DataArray.to_dataframe().

to_series()

Like xarray.DataArray.to_series().

transpose(*dims)

Like xarray.DataArray.transpose().

class genno.core.sparsedataarray.SparseAccessor(obj)

xarray accessor to help SparseDataArray.

See the xarray accessor documentation, e.g. register_dataarray_accessor().

property COO_data

True if the DataArray has sparse.COO data.

convert()

Return a SparseDataArray instance.

property dense

Return a copy with dense (ndarray) data.

property dense_super

Return a proxy to a ndarray-backed DataArray.

class genno.core.sparsedataarray.SparseDataArray(data: Any = <NA>, coords: Optional[Union[Sequence[Tuple], Mapping[Hashable, Any]]] = None, dims: Optional[Union[Hashable, Sequence[Hashable]]] = None, name: Optional[Hashable] = None, attrs: Optional[Mapping] = None, indexes: Optional[Dict[Hashable, pandas.core.indexes.base.Index]] = None, fastpath: bool = False)

DataArray with sparse data.

SparseDataArray uses sparse.COO for storage with numpy.nan as its sparse.COO.fill_value. Some methods of DataArray are overridden to ensure data is in sparse, or dense, format as necessary, to provide expected functionality not currently supported by sparse, and to avoid exhausting memory for some operations that require dense data.

equals(other) → bool

True if two SparseDataArrays have the same dims, coords, and values.

Overrides equals() for sparse data.

classmethod from_series(obj, sparse=True)

Convert a pandas.Series into a SparseDataArray.

property loc

Attribute for location based indexing like pandas.

Note

This version does not allow assignment, since the underlying sparse array is read-only. To modify the contents, create a copy or perform an operation that returns a new array.

sel(indexers=None, method=None, tolerance=None, drop=False, **indexers_kwargs) → genno.core.sparsedataarray.SparseDataArray

Return a new array by selecting labels along the specified dim(s).

Overrides sel() to handle >1-D indexers with sparse data.

to_dataframe(name=None)

Convert this array and its coords into a DataFrame.

Overrides to_dataframe().

to_series() → pandas.core.series.Series

Convert this array into a Series.

Overrides to_series() to create the series without first converting to a potentially very large numpy.ndarray.

Utilities

genno.util.RENAME_DIMS: Dict[str, str] = {}

Dimensions to rename when extracting raw data from Scenario objects. Mapping from Scenario dimension name -> preferred dimension name.

genno.util.REPLACE_UNITS = {'%': 'percent'}

Replacements to apply to quantity units before parsing by pint. Mapping from original unit -> preferred unit.

genno.util.clean_units(input_string)

Tolerate messy strings for units.

Handles two specific cases found in MESSAGEix test cases:

  • Dimensions enclosed in ‘[]’ have these characters stripped.

  • The ‘%’ symbol cannot be supported by pint, because it is a Python operator; it is translated to ‘percent’.
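
The two cases can be sketched in a few lines. The function clean_units_sketch and the constant REPLACEMENTS are hypothetical stand-ins; the sketch assumes that the brackets enclose the whole string, and the actual clean_units() may handle them differently.

```python
# Hypothetical stand-in mirroring the REPLACE_UNITS default shown above
REPLACEMENTS = {"%": "percent"}


def clean_units_sketch(input_string):
    """Strip enclosing square brackets, then apply the replacements."""
    # Remove enclosing brackets, e.g. "[kg]" becomes "kg"
    cleaned = input_string.strip("[]")
    for original, preferred in REPLACEMENTS.items():
        cleaned = cleaned.replace(original, preferred)
    return cleaned


examples = [clean_units_sketch(s) for s in ("[kg]", "%", "[%]")]
```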

genno.util.collect_units(*args)

Return a list of ‘_unit’ attributes for args.

genno.util.dims_for_qty(data)

Return the list of dimensions for data.

If data is a pandas.DataFrame, its columns are processed; otherwise it must be a list.

genno.RENAME_DIMS is used to rename dimensions.

genno.util.filter_concat_args(args)

Filter out str and Key from args.

A warning is logged for each element removed.

genno.util.parse_units(units_series)

Return a pint.Unit for a pd.Series of strings.

genno.util.partial_split(func, kwargs)

Forgiving version of functools.partial().

Returns a partial object and leftover kwargs not applicable to func.
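
One way such a forgiving partial could work is sketched below with inspect.signature. The names partial_split_sketch and scale are hypothetical, and the sketch does not treat functions that accept **kwargs specially; the actual partial_split() may differ.

```python
from functools import partial
from inspect import signature


def partial_split_sketch(func, kwargs):
    """Bind the kwargs that `func` accepts; return the partial and the rest."""
    params = signature(func).parameters
    accepted = {k: v for k, v in kwargs.items() if k in params}
    extra = {k: v for k, v in kwargs.items() if k not in params}
    return partial(func, **accepted), extra


def scale(value, factor=1.0):
    """Example function with one keyword argument."""
    return value * factor


# "units" is not a parameter of scale(), so it is returned as a leftover
func, extra = partial_split_sketch(scale, {"factor": 2.0, "units": "kg"})
result = func(21)
```

Calling functools.partial(scale, factor=2.0, units='kg') directly would bind both arguments and only fail later, when the partial is called; splitting first lets the caller decide what to do with the leftover 'units'.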