API¶

Top-level methods and classes:

`configure`([path])	Configure reporting globally.
`Computer`(**kwargs)	Class for describing and executing computations.
`Key`(name[, dims, tag])	A hashable key for a quantity that includes its dimensionality.
`Quantity`	Convert arguments to the internal Quantity data format.

Others:

Computations
Internal format for quantities
Utilities

genno.configure(path=None, **config)¶

Configure reporting globally.

Modifies global variables that affect the behaviour of all Computers and computations, namely RENAME_DIMS and REPLACE_UNITS.

Valid configuration keys—passed as config keyword arguments—include:

Other Parameters

units (mapping) – Configuration for handling of units. Valid sub-keys include:
- replace (mapping of str -> str): replace units before they are parsed by pint. Added to REPLACE_UNITS.
- define (str): block of unit definitions, added to the pint application registry so that units are recognized. See the pint documentation on defining units.
rename_dims (mapping of str -> str) – Update RENAME_DIMS.

Warns

UserWarning – If config contains unrecognized keys.

class genno.Computer(**kwargs)¶

Class for describing and executing computations.

Parameters: kwargs – Passed to configure().

A Computer is used to postprocess data from one or more ixmp.Scenario objects. The get() method can be used to:

Retrieve individual quantities. A quantity has zero or more dimensions and optional units. Quantities include the ‘parameters’, ‘variables’, ‘equations’, and ‘scalars’ available in an ixmp.Scenario.
Generate an entire report composed of multiple quantities. A report may:
- Read in non-model or exogenous data,
- Trigger output to file(s) or a database, or
- Execute user-defined methods.

Every report and quantity (including the results of intermediate steps) is identified by a Key; all the keys in a Computer can be listed with keys().

Computer uses a graph data structure to keep track of computations, the atomic steps in postprocessing: for example, a single calculation that multiplies two quantities to create a third. The graph allows get() to perform only the requested computations. Advanced users may manipulate the graph directly; but common reporting tasks can be handled by using Computer methods:

`add`(data, *args, **kwargs)	General-purpose method to add computations.
`add_file`(path[, key])	Add exogenous quantities from path.
`add_product`(key, *quantities[, sums])	Add a computation that takes the product of quantities.
`add_queue`(queue[, max_tries, fail])	Add tasks from a list or queue.
`add_single`(key, *computation[, strict, index])	Add a single computation at key.
`aggregate`(qty, tag, dims_or_groups[, …])	Add a computation that aggregates qty.
`apply`(generator, *keys, **kwargs)	Add computations by applying generator to keys.
`check_keys`(*keys)	Check that keys are in the Computer.
`configure`([path])	Configure the Computer.
`describe`([key, quiet])	Return a string describing the computations that produce key.
`disaggregate`(qty, new_dim[, method, args])	Add a computation that disaggregates qty using method.
`full_key`(name_or_key)	Return the full-dimensionality key for name_or_key.
`get`([key])	Execute and return the result of the computation key.
`keys`()	Return the keys of `graph`.
`visualize`(filename, **kwargs)	Generate an image describing the reporting structure.
`write`(key, path)	Write the report key to the file path.

graph: Dict[str, Union[str, dict]] = {'config': {}}¶: A dask-format graph.
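
As a rough illustration of what a dask-format graph looks like and why it lets get() perform only the requested computations, the following sketch uses only the standard library. The `evaluate` and `traced` helpers and the key names are hypothetical; they are not part of genno or dask.

```python
from operator import mul

calls = []  # labels of the tasks that actually ran


def traced(func, label):
    """Wrap func so that each call is recorded under label."""
    def wrapper(*args):
        calls.append(label)
        return func(*args)
    return wrapper


# A dask-format graph: keys map to literal values or to tasks.
# A task is a tuple of a callable followed by its arguments.
graph = {
    "price": 2.0,
    "amount": 3.0,
    "cost": (traced(mul, "cost"), "price", "amount"),
    "unused": (traced(mul, "unused"), "price", "price"),
}


def evaluate(graph, key):
    """Hypothetical evaluator: compute key and only its dependencies."""
    value = graph[key]
    if isinstance(value, tuple) and value and callable(value[0]):
        func, *args = value
        # Arguments that name other keys are computed first
        resolved = [
            evaluate(graph, arg) if isinstance(arg, str) and arg in graph else arg
            for arg in args
        ]
        return func(*resolved)
    return value


result = evaluate(graph, "cost")
```

Here only the "cost" task runs; the "unused" task is never called because nothing requested it.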

add(data, *args, **kwargs)¶

General-purpose method to add computations.

add() can be called in several ways; its behaviour depends on data; see below. It chains to methods such as add_single(), add_queue(), and apply(), which can also be called directly.

Parameters

data (various) –
args (various) –

Other Parameters

sums (bool, optional) – If True, all partial sums of the key data are also added to the Computer.

Returns

Some or all of the keys added to the Computer.

Return type

list of Key-like

Raises

KeyError – If a target key is already in the Computer; any key referred to by a computation does not exist; or sums=True and the key for one of the partial sums of key is already in the Computer.

See also

genno.computations.load_file

add_product(key, *quantities, sums=True)¶

Add a computation that takes the product of quantities.

Parameters

key (str or Key) – Key of the new quantity. If a Key, any dimensions are ignored; the dimensions of the product are the union of the dimensions of quantities.
sums (bool, optional) – If True, all partial sums of the new quantity are also added.

Returns

The full key of the new quantity.

Return type

Key

add_queue(queue, max_tries=1, fail='raise')¶

Add tasks from a list or queue.

Parameters

queue (list of 2-tuple) – The members of each tuple are the arguments (i.e. a list or tuple) and keyword arguments (i.e. a dict) to add().
max_tries (int, optional) – Retry adding elements up to this many times.
fail ('raise' or log level, optional) – Action to take when a computation from queue cannot be added after max_tries.
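
The interaction of max_tries and fail can be sketched as a retry loop. This is an assumption about the general idea, not genno's implementation: `add_queue_sketch` and `toy_add` are hypothetical, and `toy_add` merely stands in for add() by raising KeyError when a required key is missing.

```python
import logging
from collections import deque

log = logging.getLogger(__name__)


def add_queue_sketch(add, queue, max_tries=1, fail="raise"):
    """Hypothetical retry loop; `add` stands in for Computer.add()."""
    pending = deque((1, args, kwargs) for args, kwargs in queue)
    added = []
    while pending:
        tries, args, kwargs = pending.popleft()
        try:
            added.append(add(*args, **kwargs))
        except KeyError as exc:
            if tries < max_tries:
                # A later item may add the missing key: retry afterwards
                pending.append((tries + 1, args, kwargs))
            elif fail == "raise":
                raise
            else:
                # Otherwise `fail` is treated as a logging level
                log.log(fail, "Failed to add %r: %r", args, exc)
    return added


# Toy stand-in for a Computer: "b" can only be added once "a" exists
known = set()


def toy_add(key, requires=None):
    if requires is not None and requires not in known:
        raise KeyError(requires)
    known.add(key)
    return key


queue = [(("b",), {"requires": "a"}), (("a",), {})]
order = add_queue_sketch(toy_add, queue, max_tries=2)
```

With max_tries=2 the out-of-order item "b" succeeds on its second attempt; with max_tries=1 and fail='raise' the same queue would raise KeyError.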

add_single(key, *computation, strict=False, index=False)¶

Add a single computation at key.

Parameters

key (str or Key or hashable) – A string, Key, or other value identifying the output of computation.
computation (object) –
Any dask computation, i.e. one of:
1. any existing key in the Computer.
2. any other literal value or constant.
3. a task, i.e. a tuple with a callable followed by one or more computations.
4. a list containing one or more of #1, #2, and/or #3.
strict (bool, optional) – If True, key must not already exist in the Computer, and any keys referred to by computation must exist.
index (bool, optional) – If True, key is added to the index as a full-resolution key, so it can be later retrieved with full_key().

aggregate(qty, tag, dims_or_groups, weights=None, keep=True, sums=False)¶

Add a computation that aggregates qty.

Parameters

qty (Key or str) – Key of the quantity to be aggregated.
tag (str) – Additional string to add to the end of the key for the aggregated quantity.
dims_or_groups (str or iterable of str or dict) – Name(s) of the dimension(s) to sum over, or nested dict.
weights (xarray.DataArray, optional) – Weights for weighted aggregation.
keep (bool, optional) – Passed to computations.aggregate.
sums (bool, optional) – Passed to add().

Returns

The key of the newly-added node.

Return type

Key

check_keys(*keys)¶

Check that keys are in the Computer.

If any of keys is not in the Computer, KeyError is raised. Otherwise, a list is returned with either the key from keys, or the corresponding full_key().

configure(path=None, **config)¶

Configure the Computer.

Accepts a path to a configuration file and/or keyword arguments. Configuration keys loaded from file are replaced by keyword arguments.

Valid configuration keys include:

default: the default reporting key; sets default_key.
filters: a dict, passed to set_filters().
files: a list where every element is a dict of keyword arguments to add_file().
alias: a dict mapping aliases to original keys.

Warns: UserWarning – If config contains unrecognized keys.
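
The precedence rule (keyword arguments replace keys loaded from file) and the warning for unrecognized keys can be sketched as follows. `merge_config` and the `RECOGNIZED` set are hypothetical; the set contains only the four keys listed above, which is an assumption about what counts as recognized.

```python
import warnings

# Assumption: only the keys listed in this section are treated as recognized
RECOGNIZED = {"default", "filters", "files", "alias"}


def merge_config(from_file, **config):
    """Hypothetical helper: keyword arguments override keys from the file."""
    merged = {**from_file, **config}
    unknown = sorted(set(merged) - RECOGNIZED)
    if unknown:
        warnings.warn(f"Unrecognized configuration keys: {unknown}", UserWarning)
    return merged


from_file = {"default": "foo:a-b", "alias": {"bar": "foo:a-b"}}
merged = merge_config(from_file, default="foo:a")
```

The "default" value from the file is replaced by the keyword argument, while "alias" is kept unchanged.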

default_key = None¶: The default reporting key.

describe(key=None, quiet=True)¶

Return a string describing the computations that produce key.

If key is not provided, all keys in the Computer are described.

The string can be printed to the console, if not quiet.

disaggregate(qty, new_dim, method='shares', args=[])¶

Add a computation that disaggregates qty using method.

Parameters

qty (hashable) – Key of the quantity to be disaggregated.
new_dim (str) – Name of the new dimension of the disaggregated variable.
method (callable or str) – Disaggregation method. If a callable, then it is applied to qty with any extra args. If a str, then a method named ‘disaggregate_{method}’ is used.
args (list, optional) – Additional arguments to the method. The first element should be the key for a quantity giving shares for disaggregation.

Returns

The key of the newly-added node.

Return type

Key

full_key(name_or_key)¶

Return the full-dimensionality key for name_or_key.

A quantity ‘foo’ with dimensions (a, c, n, q, x) is available in the Computer as 'foo:a-c-n-q-x'. This Key can be retrieved with:

rep.full_key("foo")
rep.full_key("foo:c")
# etc.
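
One way to picture this lookup is an index from quantity name to full key. The sketch below is an assumption for illustration only: `index` and `full_key_sketch` are hypothetical names, and tags are not handled.

```python
# Hypothetical index: quantity name -> full-dimensionality key
index = {"foo": "foo:a-c-n-q-x"}


def full_key_sketch(name_or_key):
    """Look up the full key using only the name part of the argument."""
    name = str(name_or_key).split(":")[0]
    return index[name]


from_name = full_key_sketch("foo")
from_partial = full_key_sketch("foo:c")
```

Both calls return the same full key, because any dimensions given in the argument are ignored for the lookup.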

get(key=None)¶

Execute and return the result of the computation key.

Only key and its dependencies are computed.

Parameters: key (str, optional) – If not provided, default_key is used.
Raises: ValueError – If key and default_key are both None.

keys()¶: Return the keys of graph.

property unit_registry¶: The pint.UnitRegistry() used by the Computer.

visualize(filename, **kwargs)¶

Generate an image describing the reporting structure.

This is a shorthand for dask.visualize(). Requires graphviz.

write(key, path)¶: Write the report key to the file path.

class genno.Key(name, dims=[], tag=None)¶

A hashable key for a quantity that includes its dimensionality.

Quantities in a Scenario can be indexed by one or more dimensions. A Key refers to a quantity using three components:

a string name,
zero or more ordered dims, and
an optional tag.

For example, an ixmp parameter with three dimensions can be initialized with:

>>> scenario.init_par('foo', ['a', 'b', 'c'], ['apple', 'bird', 'car'])

Key allows a specific, explicit reference to various forms of “foo”:

in its full resolution, i.e. indexed by a, b, and c:
```
>>> k1 = Key('foo', ['a', 'b', 'c'])
>>> k1 == 'foo:a-b-c'
True
```
Notice that a Key has the same hash, and compares equal (==) to its str().
in a partial sum over one dimension, e.g. summed along c with dimensions a and b:
```
>>> k2 = k1.drop('c')
>>> k2 == 'foo:a-b'
True
```

in a partial sum over multiple dimensions, etc.:

>>> k1.drop('a', 'c') == k2.drop('a') == 'foo:b'
True
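
The behaviour shown above (a Key hashes and compares equal to its str()) can be imitated in a few lines. `KeySketch` is a hypothetical stand-in written for this illustration, not genno's Key class; it omits tags and only handles keys with at least one dimension.

```python
class KeySketch:
    """Hypothetical stand-in for Key: a name plus ordered dims."""

    def __init__(self, name, dims=()):
        self._name = name
        self._dims = tuple(dims)

    def __str__(self):
        return f"{self._name}:{'-'.join(self._dims)}"

    def __repr__(self):
        return f"<{self}>"

    def __hash__(self):
        # Same hash as the string form, so both find the same dict entry
        return hash(str(self))

    def __eq__(self, other):
        return str(self) == str(other)

    def drop(self, *dims):
        """Return a new object without dims; the original is unchanged."""
        return KeySketch(self._name, [d for d in self._dims if d not in dims])


k1 = KeySketch("foo", ["a", "b", "c"])
k2 = k1.drop("c")

# Either the object or its string form can be used to look up an entry
graph = {k1: "full resolution", "foo:a-b": "partial sum over c"}
by_str = graph["foo:a-b-c"]
by_key = graph[k2]
```

This is why a graph can mix str and Key entries: a lookup with either form reaches the same node.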

Note

Some remarks:

repr(key) prints the Key in angle brackets (‘<>’) to signify it is a Key object.
```
>>> repr(k1)
<foo:a-b-c>
```
Keys are immutable: the properties name, dims, and tag are read-only, and the methods append(), drop(), and add_tag() return new Key objects.

Keys may be generated concisely by defining a convenience method:

>>> def foo(dims):
...     return Key('foo', dims.split())
...
>>> foo('a b c')
<foo:a-b-c>

add_tag(tag)¶: Return a new Key with tag appended.

append(*dims)¶: Return a new Key with additional dimensions dims.

property dims¶: Dimensions of the quantity, tuple of str.

drop(*dims)¶: Return a new Key with dims dropped.

classmethod from_str_or_key(value, drop=[], append=[], tag=None)¶

Return a new Key from value.

Parameters

value (str or Key) – Value to use to generate a new Key.
drop (list of str or True, optional) – Existing dimensions of value to drop. See drop().
append (list of str, optional) – New dimensions to append to the returned Key. See append().
tag (str, optional) – Tag for returned Key. If value has a tag, the two are joined using a ‘+’ character. See add_tag().

Returns

Return type

Key

iter_sums()¶: Generate (key, task) for all possible partial sums of the Key.
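
To give a feel for how many partial sums a key has, the sketch below enumerates them with itertools. It yields key strings and the dimensions summed over rather than real (key, task) pairs; `partial_sum_keys` is hypothetical, and including the total over all dimensions is an assumption.

```python
from itertools import combinations


def partial_sum_keys(name, dims):
    """Yield (key string, summed dims) for every proper subset of dims."""
    for n in range(len(dims) - 1, -1, -1):
        for kept in combinations(dims, n):
            summed = tuple(d for d in dims if d not in kept)
            yield f"{name}:{'-'.join(kept)}", summed


sums = dict(partial_sum_keys("foo", ("a", "b", "c")))
```

For three dimensions this gives seven entries: three sums over one dimension, three over two dimensions, and one over all three.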

property name¶: Name of the quantity, str.

classmethod product(new_name, *keys, tag=None)¶

Return a new Key that has the union of dimensions on keys.

Dimensions are ordered by their first appearance:

First, the dimensions of the first of the keys.
Next, any additional dimensions in the second of the keys that were not already added in step 1.
etc.

Parameters: new_name (str) – Name for the new Key. The names of keys are discarded.
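
The ordering rule can be sketched with a plain dict, which preserves insertion order and stores each name once. `product_dims` is a hypothetical helper that works on tuples of dimension names rather than on Key objects.

```python
def product_dims(*dims_of_keys):
    """Union of dimensions, ordered by first appearance."""
    # dict keeps insertion order, and repeated names are stored only once
    return tuple(dict.fromkeys(d for dims in dims_of_keys for d in dims))


dims = product_dims(("a", "b"), ("b", "c", "a"), ("x", "c"))
```

The result is ('a', 'b', 'c', 'x'): "c" is added when the second key is processed and "x" when the third is.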

property tag¶: Quantity tag, str.

genno.Quantity(data, *args, **kwargs)¶

Convert arguments to the internal Quantity data format.

Parameters

data – Quantity data.
args – Positional arguments, passed to AttrSeries or SparseDataArray.
kwargs – Keyword arguments, passed to AttrSeries or SparseDataArray.

Other Parameters

name (str, optional) – Quantity name.
units (str, optional) – Quantity units.
attrs (dict, optional) – Dictionary of attributes; similar to attrs.

The Quantity constructor converts its arguments to an internal, xarray.DataArray-like data format:

# Existing data
data = pd.Series(...)

# Convert to a Quantity for use in reporting calculations
qty = Quantity(data, name="Quantity name", units="kg")
rep.add("new_qty", qty)

Common genno usage, e.g. in message_ix, creates large, sparse data frames (billions of possible elements, but <1% populated); DataArray’s default, ‘dense’ storage format would be too large for available memory.

Currently, Quantity is AttrSeries, a wrapped pandas.Series that behaves like a DataArray.
In the future, genno will use SparseDataArray, and eventually DataArray backed by sparse data, directly.

The goal is that reporting code, including built-in and user computations, can treat quantity arguments as if they were DataArray.

Computations ¶

Elementary computations for reporting.

Unless otherwise specified, these methods accept and return Quantity objects for data arguments/return values.

Calculations:

`add`(*quantities[, fill_value])	Sum across multiple quantities.
`aggregate`(quantity, groups, keep)	Aggregate quantity by groups.
`apply_units`(qty, units[, quiet])	Simply apply units to qty.
`disaggregate_shares`(quantity, shares)	Disaggregate quantity by shares.
`product`(*quantities)	Return the product of any number of quantities.
`ratio`(numerator, denominator)	Return the ratio numerator / denominator.
`select`(qty, indexers[, inverse])	Select from qty based on indexers.
`sum`(quantity[, weights, dimensions])	Sum quantity over dimensions, with optional weights.

Input and output:

`load_file`(path[, dims, units])	Read the file at path and return its contents as a `Quantity`.
`write_report`(quantity, path)	Write a quantity to a file.

Data manipulation:

concat(*objs, **kwargs)

Concatenate Quantity objs.

genno.computations.aggregate(quantity, groups, keep)¶

Aggregate quantity by groups.

Parameters

quantity (Quantity) –
groups (dict of dict) – Top-level keys are the names of dimensions in quantity. Second-level keys are group names; second-level values are lists of labels along the dimension to sum into a group.
keep (bool) – If True, the members that are aggregated into a group are returned with the group sums. If False, they are discarded.

Returns

Same dimensionality as quantity.

Return type

Quantity
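
The structure of groups and the effect of keep may be easier to see on toy data. The sketch below uses a plain dict of label tuples in place of a Quantity; `aggregate_sketch` and the labels are hypothetical, and processing one dimension after another is an assumption.

```python
def aggregate_sketch(data, dims, groups, keep):
    """Hypothetical stand-in: data maps label tuples (ordered as dims) to values."""
    for dim, dim_groups in groups.items():
        i = dims.index(dim)
        # With keep=True the original members stay alongside the group sums
        result = dict(data) if keep else {}
        for group, members in dim_groups.items():
            for labels, value in data.items():
                if labels[i] in members:
                    # Replace the member label with the group name and sum
                    target = labels[:i] + (group,) + labels[i + 1:]
                    result[target] = result.get(target, 0.0) + value
        data = result
    return data


dims = ["n", "t"]
data = {("AT", "coal"): 1.0, ("DE", "coal"): 2.0, ("US", "coal"): 4.0}
groups = {"n": {"EU": ["AT", "DE"]}}

kept = aggregate_sketch(data, dims, groups, keep=True)
discarded = aggregate_sketch(data, dims, groups, keep=False)
```

With keep=True the result has the three original entries plus ("EU", "coal"); with keep=False only the group sum remains. In both cases the dimensions are still n and t.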

genno.computations.apply_units(qty, units, quiet=False)¶

Simply apply units to qty.

Logs on level WARNING if qty already has existing units.

Parameters

qty (Quantity) –
units (str or pint.Unit) – Units to apply to qty
quiet (bool, optional) – If True log on level DEBUG.

genno.computations.concat(*objs, **kwargs)¶

Concatenate Quantity objs.

Any strings included amongst objs are discarded, with a logged warning; these usually indicate that a quantity is referenced which is not in the Computer.

genno.computations.disaggregate_shares(quantity, shares)¶: Disaggregate quantity by shares.

genno.computations.load_file(path, dims={}, units=None)¶

Read the file at path and return its contents as a Quantity.

Some file formats are automatically converted into objects for direct use in reporting code:

.csv: Converted to Quantity. CSV files must have a ‘value’ column; all others are treated as indices, except as given by dims. Lines beginning with ‘#’ are ignored.

Parameters

path (pathlib.Path) – Path to the file to read.
dims (collections.abc.Collection or collections.abc.Mapping, optional) – If a collection of names, other columns besides these and ‘value’ are discarded. If a mapping, the keys are the column labels in path, and the values are the target dimension names.
units (str or pint.Unit) – Units to apply to the loaded Quantity.
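
The CSV conventions described above (a ‘value’ column, ‘#’ comment lines, and the two forms of dims) can be sketched with the standard csv module. `read_csv_sketch` is hypothetical and returns a plain dict rather than a Quantity; dropping columns that are not named in a mapping is an assumption.

```python
import csv
import io

text = """# Comment lines like this one are ignored
n,t,value
AT,coal,1.0
DE,coal,2.0
"""


def read_csv_sketch(stream, dims=None):
    """Hypothetical reader: returns (dimension names, {labels: value})."""
    rows = csv.DictReader(line for line in stream if not line.startswith("#"))
    # Every column except 'value' is a candidate index column
    columns = [c for c in rows.fieldnames if c != "value"]
    if dims:
        # Collection or mapping: keep only the columns it names
        columns = [c for c in columns if c in dims]
    # Mapping: translate column labels to target dimension names
    names = [dims[c] for c in columns] if isinstance(dims, dict) else columns
    data = {tuple(row[c] for c in columns): float(row["value"]) for row in rows}
    return names, data


names, data = read_csv_sketch(
    io.StringIO(text), dims={"n": "node", "t": "technology"}
)
```

Here the columns n and t are read as indices and renamed to node and technology, and the comment line is skipped.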

genno.computations.product(*quantities)¶: Return the product of any number of quantities.

genno.computations.ratio(numerator, denominator)¶

Return the ratio numerator / denominator.

Parameters

numerator (Quantity) –
denominator (Quantity) –

genno.computations.select(qty, indexers, inverse=False)¶

Select from qty based on indexers.

Parameters

qty (Quantity) –
indexers (dict (str -> list of str)) – Elements to be selected from qty. Mapping from dimension names to labels along each dimension.
inverse (bool, optional) – If True, remove the items in indexers instead of keeping them.
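
The effect of inverse can be illustrated on toy data. The sketch below again uses a plain dict of label tuples in place of a Quantity; `select_sketch` is hypothetical, and applying the inversion separately to each dimension is an assumption.

```python
def select_sketch(data, dims, indexers, inverse=False):
    """Hypothetical stand-in: data maps label tuples (ordered as dims) to values."""
    def keep(labels):
        # For every indexed dimension, the label must be listed
        # (or, with inverse=True, must not be listed)
        return all(
            (labels[dims.index(dim)] in wanted) != inverse
            for dim, wanted in indexers.items()
        )

    return {labels: value for labels, value in data.items() if keep(labels)}


dims = ["n", "t"]
data = {("AT", "coal"): 1.0, ("DE", "coal"): 2.0, ("US", "coal"): 4.0}
indexers = {"n": ["AT", "DE"]}

selected = select_sketch(data, dims, indexers)
removed = select_sketch(data, dims, indexers, inverse=True)
```

The first call keeps the AT and DE entries; the second keeps only the US entry.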

genno.computations.sum(quantity, weights=None, dimensions=None)¶

Sum quantity over dimensions, with optional weights.

Parameters

quantity (Quantity) –
weights (Quantity, optional) – If dimensions is given, weights must have at least these dimensions. Otherwise, any dimensions are valid.
dimensions (list of str, optional) – If not provided, sum over all dimensions. If provided, sum over these dimensions.

genno.computations.write_report(quantity, path)¶

Write a quantity to a file.

Parameters: path (str or Path) – Path to the file to be written.

Internal format for quantities ¶

genno.core.quantity.assert_quantity(*args)¶

Assert that each of args is a Quantity object.

Raises: TypeError – with an indicative message.

class genno.core.attrseries.AttrSeries(data=None, *args, name=None, attrs=None, **kwargs)¶

pandas.Series subclass imitating xarray.DataArray.

The AttrSeries class provides similar methods and behaviour to xarray.DataArray, so that genno.computations methods can use xarray-like syntax.

Parameters

units (str or pint.Unit, optional) – Set the units attribute. The value is converted to pint.Unit and added to attrs.
attrs (Mapping, optional) – Set the attrs of the AttrSeries. This attribute was added in pandas 1.0, but is not currently supported by the Series constructor.

align_levels(other)¶

Work around https://github.com/pandas-dev/pandas/issues/25760.

Return a copy of this AttrSeries with common levels in the same order as other.

assign_coords(**kwargs)¶: Like xarray.DataArray.assign_coords().

property coords¶: Like xarray.DataArray.coords. Read-only.

property dims¶: Like xarray.DataArray.dims.

drop(label)¶: Like xarray.DataArray.drop().

classmethod from_series(series, sparse=None)¶: Like xarray.DataArray.from_series().

item(*args)¶: Like xarray.DataArray.item().

rename(new_name_or_name_dict)¶: Like xarray.DataArray.rename().

sel(indexers=None, drop=False, **indexers_kwargs)¶: Like xarray.DataArray.sel().

squeeze(dim=None, *args, **kwargs)¶: Like xarray.DataArray.squeeze().

sum(*args, **kwargs)¶: Like xarray.DataArray.sum().

to_dataframe()¶: Like xarray.DataArray.to_dataframe().

to_series()¶: Like xarray.DataArray.to_series().

transpose(*dims)¶: Like xarray.DataArray.transpose().

class genno.core.sparsedataarray.SparseAccessor(obj)¶

xarray accessor to help SparseDataArray.

See the xarray accessor documentation, e.g. register_dataarray_accessor().

property COO_data¶: True if the DataArray has sparse.COO data.

convert()¶: Return a SparseDataArray instance.

property dense¶: Return a copy with dense (ndarray) data.

property dense_super¶: Return a proxy to a ndarray-backed DataArray.

class genno.core.sparsedataarray.SparseDataArray(data: Any = <NA>, coords: Optional[Union[Sequence[Tuple], Mapping[Hashable, Any]]] = None, dims: Optional[Union[Hashable, Sequence[Hashable]]] = None, name: Optional[Hashable] = None, attrs: Optional[Mapping] = None, indexes: Optional[Dict[Hashable, pandas.core.indexes.base.Index]] = None, fastpath: bool = False)¶

DataArray with sparse data.

SparseDataArray uses sparse.COO for storage with numpy.nan as its sparse.COO.fill_value. Some methods of DataArray are overridden to ensure data is in sparse, or dense, format as necessary, to provide expected functionality not currently supported by sparse, and to avoid exhausting memory for some operations that require dense data.

equals(other) → bool ¶

True if two SparseDataArrays have the same dims, coords, and values.

Overrides equals() for sparse data.

classmethod from_series(obj, sparse=True)¶: Convert a pandas.Series into a SparseDataArray.

property loc¶: Attribute for location based indexing like pandas.

Note

This version does not allow assignment, since the underlying sparse array is read-only. To modify the contents, create a copy or perform an operation that returns a new array.

sel(indexers=None, method=None, tolerance=None, drop=False, **indexers_kwargs) → genno.core.sparsedataarray.SparseDataArray ¶

Return a new array by selecting labels along the specified dim(s).

Overrides sel() to handle >1-D indexers with sparse data.

to_dataframe(name=None)¶

Convert this array and its coords into a DataFrame.

Overrides to_dataframe().

to_series() → pandas.core.series.Series¶

Convert this array into a Series.

Overrides to_series() to create the series without first converting to a potentially very large numpy.ndarray.

Utilities ¶

genno.util.RENAME_DIMS: Dict[str, str] = {}¶: Dimensions to rename when extracting raw data from Scenario objects. Mapping from Scenario dimension name -> preferred dimension name.

genno.util.REPLACE_UNITS = {'%': 'percent'}¶: Replacements to apply to quantity units before parsing by pint. Mapping from original unit -> preferred unit.

genno.util.clean_units(input_string)¶

Tolerate messy strings for units.

Handles two specific cases found in MESSAGEix test cases:

Dimensions enclosed in ‘[]’ have these characters stripped.
The ‘%’ symbol cannot be supported by pint, because it is a Python operator; it is translated to ‘percent’.
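
A minimal sketch of these two cases, assuming simple string operations are enough: `clean_units_sketch` and `REPLACE` are hypothetical names, and `REPLACE` only mirrors the documented default of REPLACE_UNITS.

```python
REPLACE = {"%": "percent"}  # mirrors the documented default of REPLACE_UNITS


def clean_units_sketch(input_string):
    """Hypothetical cleaner: strip enclosing '[' and ']', then replace symbols."""
    text = input_string.strip("[]")
    for old, new in REPLACE.items():
        text = text.replace(old, new)
    return text


cleaned = [clean_units_sketch(s) for s in ("[kg]", "%", "kg")]
```

Strings that are already clean, such as "kg", pass through unchanged.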

genno.util.collect_units(*args)¶: Return a list of ‘_unit’ attributes for args.

genno.util.dims_for_qty(data)¶

Return the list of dimensions for data.

If data is a pandas.DataFrame, its columns are processed; otherwise it must be a list.

genno.RENAME_DIMS is used to rename dimensions.

genno.util.filter_concat_args(args)¶

Filter out str and Key from args.

A warning is logged for each element removed.

genno.util.parse_units(units_series)¶: Return a pint.Unit for a pd.Series of strings.

genno.util.partial_split(func, kwargs)¶

Forgiving version of functools.partial().

Returns a partial object and leftover kwargs not applicable to func.
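
The idea can be sketched with functools.partial() and inspect.signature(). `partial_split_sketch` and `scale` are hypothetical; the sketch treats a keyword as applicable only if it matches a named parameter of func.

```python
from functools import partial
from inspect import signature


def partial_split_sketch(func, kwargs):
    """Bind the kwargs that func accepts; return the rest separately."""
    params = signature(func).parameters
    applicable = {k: v for k, v in kwargs.items() if k in params}
    leftover = {k: v for k, v in kwargs.items() if k not in params}
    return partial(func, **applicable), leftover


def scale(value, factor=1.0):
    return value * factor


func, extra = partial_split_sketch(scale, {"factor": 2.0, "color": "red"})
result = func(21)
```

Here "factor" is bound to the partial object, while "color" is returned as leftover instead of causing a TypeError.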
