API
Top-level methods and classes:
configure(path=None, **config) – Configure reporting globally.
Computer(**kwargs) – Class for describing and executing computations.
Key(name, dims=[], tag=None) – A hashable key for a quantity that includes its dimensionality.
Quantity(data, *args, **kwargs) – Convert arguments to the internal Quantity data format.
Others:
genno.configure(path=None, **config)
Configure reporting globally.
Modifies global variables that affect the behaviour of all Computers and computations, namely RENAME_DIMS and REPLACE_UNITS.
Valid configuration keys, passed as config keyword arguments, include:
- Other Parameters
units (mapping) – Configuration for handling of units. Valid sub-keys include:
replace (mapping of str -> str): replace units before they are parsed by pint. Added to REPLACE_UNITS.
define (str): block of unit definitions, added to the pint application registry so that units are recognized. See the pint documentation on defining units.
rename_dims (mapping of str -> str) – Update RENAME_DIMS.
- Warns
UserWarning – If config contains unrecognized keys.
class genno.Computer(**kwargs)
Class for describing and executing computations.
- Parameters
kwargs – Passed to configure().
A Computer is used to postprocess data from one or more ixmp.Scenario objects. The get() method can be used to:
Retrieve individual quantities. A quantity has zero or more dimensions and optional units. Quantities include the ‘parameters’, ‘variables’, ‘equations’, and ‘scalars’ available in an ixmp.Scenario.
Generate an entire report composed of multiple quantities. A report may:
Read in non-model or exogenous data,
Trigger output to file(s) or a database, or
Execute user-defined methods.
Every report and quantity (including the results of intermediate steps) is identified by a Key; all the keys in a Computer can be listed with keys().
Computer uses a graph data structure to keep track of computations, the atomic steps in postprocessing: for example, a single calculation that multiplies two quantities to create a third. The graph allows get() to perform only the requested computations. Advanced users may manipulate the graph directly, but common reporting tasks can be handled by using Computer methods:
add(data, *args, **kwargs) – General-purpose method to add computations.
add_file(path[, key]) – Add exogenous quantities from path.
add_product(key, *quantities[, sums]) – Add a computation that takes the product of quantities.
add_queue(queue[, max_tries, fail]) – Add tasks from a list or queue.
add_single(key, *computation[, strict, index]) – Add a single computation at key.
aggregate(qty, tag, dims_or_groups[, …]) – Add a computation that aggregates qty.
apply(generator, *keys, **kwargs) – Add computations by applying generator to keys.
check_keys(*keys) – Check that keys are in the Computer.
configure([path]) – Configure the Computer.
describe([key, quiet]) – Return a string describing the computations that produce key.
disaggregate(qty, new_dim[, method, args]) – Add a computation that disaggregates qty using method.
full_key(name_or_key) – Return the full-dimensionality key for name_or_key.
get([key]) – Execute and return the result of the computation key.
keys() – Return the keys of graph.
visualize(filename, **kwargs) – Generate an image describing the reporting structure.
write(key, path) – Write the report key to the file path.
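The way a graph lets get() run only the requested computations can be illustrated with a small standard-library sketch. This is not genno's implementation: the dict-based graph, the get_sketch() helper and the traced() wrapper are hypothetical names used only here, loosely following the dask convention that a task is a tuple starting with a callable.

```python
# Record which tasks actually run, to show that unrelated ones are skipped.
calls = []


def traced(name, func):
    """Wrap func so that each call appends name to calls."""
    def wrapper(*args):
        calls.append(name)
        return func(*args)
    return wrapper


# Hypothetical graph: keys map to literal values or to task tuples
# of the form (callable, argument, ...).
graph = {
    "x": 2,
    "y": 5,
    "product": (traced("product", lambda a, b: a * b), "x", "y"),
    "report": (traced("report", lambda p: f"total={p}"), "product"),
    "unused": (traced("unused", lambda a: a + 1), "x"),
}


def get_sketch(graph, key):
    """Compute key and, recursively, only the keys it depends on."""
    node = graph[key]
    if isinstance(node, tuple) and node and callable(node[0]):
        func, *args = node
        # A string argument that matches a key is treated as a reference
        resolved = [
            get_sketch(graph, a) if isinstance(a, str) and a in graph else a
            for a in args
        ]
        return func(*resolved)
    return node  # A literal value


result = get_sketch(graph, "report")
```

After the call, calls lists only the tasks that "report" depends on; the "unused" task never runs.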
add(data, *args, **kwargs)
General-purpose method to add computations.
add() can be called in several ways; its behaviour depends on data; see below. It chains to methods such as add_single(), add_queue(), and apply(), which can also be called directly.
- Parameters
data (various) –
args (various) –
- Other Parameters
sums (bool, optional) – If True, all partial sums of the key data are also added to the Computer.
- Returns
Some or all of the keys added to the Computer.
- Return type
list of Key-like
- Raises
KeyError – If a target key is already in the Computer; any key referred to by a computation does not exist; or sums=True and the key for one of the partial sums of key is already in the Computer.
See also
add() may be called with:
list: data is a list of computations like [(list(args1), dict(kwargs1)), (list(args2), dict(kwargs2)), ...] that are added one-by-one.
the name of a function in computations (e.g. ‘select’): A computation is added with key args[0], applying the named function to args[1:] and kwargs.
str, the name of a Computer method (e.g. ‘apply’): the corresponding method (e.g. apply()) is called with the args and kwargs.
Any other str or Key: the arguments are passed to add_single().
add() may also be used to:
Provide an alias from one key to another:
>>> from genno import Computer
>>> rep = Computer()  # Create a new Computer object
>>> rep.add('aliased name', 'original name')
Define an arbitrarily complex computation in a Python function that operates directly on the ixmp.Scenario:
>>> def my_report(scenario):
...     # many lines of code
...     return 'foo'
>>> rep.add('my report', (my_report, 'scenario'))
>>> rep.finalize(scenario)
>>> rep.get('my report')
foo
Note
Use care when adding literal str() values as a computation argument for add(); these may conflict with keys that identify the results of other computations.
apply(generator, *keys, **kwargs)
Add computations by applying generator to keys.
- Parameters
generator (callable) – Function to apply to keys.
keys (hashable) – The starting key(s).
kwargs – Keyword arguments to generator.
The generator may have a type annotation for Computer on its first positional argument. In this case, a reference to the Computer is supplied, and generator may use the Computer methods to add computations:
def gen0(r: genno.Computer, **kwargs):
    # Add one computation per file, using a Computer method
    r.add_file('file0.txt', **kwargs)
    r.add_file('file1.txt', **kwargs)

# Use the generator to add several computations
rep.apply(gen0, units='kg')
Or, generator may yield a sequence (0 or more) of (key, computation), which are added to the graph:
from functools import partial

def gen1(**kwargs):
    op = partial(computations.load_file, **kwargs)
    # Each yielded item is (key, computation); the computation is a task tuple
    yield from ((f'file:{i}', (op, f'file{i}.txt')) for i in range(2))

rep.apply(gen1, units='kg')
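One way such a method could tell the two generator styles apart is sketched below with the standard library only. MiniComputer, load() and the dispatch logic are hypothetical stand-ins written for this illustration; the sketch makes no claim about how genno itself inspects the generator.

```python
import inspect
from functools import partial


class MiniComputer:
    """Hypothetical stand-in for a Computer: only a graph and apply()."""

    def __init__(self):
        self.graph = {}

    def apply(self, generator, *keys, **kwargs):
        params = list(inspect.signature(generator).parameters.values())
        args = list(keys)
        # Supply a reference to self if the first parameter is annotated
        if params and params[0].annotation is MiniComputer:
            args.insert(0, self)
        result = generator(*args, **kwargs)
        # If the call produced a generator, add each (key, computation)
        if inspect.isgenerator(result):
            for key, computation in result:
                self.graph[key] = computation


def load(path, units=None):
    """Placeholder for a file-loading computation."""
    return f"{path} [{units}]"


def gen0(c: MiniComputer, **kwargs):
    # Style 1: receive the computer and add computations directly
    for i in range(2):
        c.graph[f"file:{i}"] = (partial(load, **kwargs), f"file{i}.txt")


def gen1(**kwargs):
    # Style 2: yield (key, computation) pairs
    op = partial(load, **kwargs)
    yield from ((f"file:{i}", (op, f"file{i}.txt")) for i in range(2))


a, b = MiniComputer(), MiniComputer()
a.apply(gen0, units="kg")
b.apply(gen1, units="kg")
```

Both calls leave the same two keys in the respective graph, which is the point of offering both styles.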
add_file(path, key=None, **kwargs)
Add exogenous quantities from path.
Reporting the key or using it in other computations causes path to be loaded and converted to Quantity.
- Parameters
path (os.PathLike) – Path to the file, e.g. ‘/path/to/foo.ext’.
key (str or Key, optional) – Key for the quantity read from the file.
- Other Parameters
dims (dict or list or set) – Either a collection of names for dimensions of the quantity, or a mapping from names appearing in the input to dimensions.
units (str or pint.Unit) – Units to apply to the loaded Quantity.
- Returns
Either key (if given) or e.g. file:foo.ext based on the path name, without directory components.
- Return type
See also
add_product(key, *quantities, sums=True)
Add a computation that takes the product of quantities.
- Parameters
- Returns
The full key of the new quantity.
- Return type
add_queue(queue, max_tries=1, fail='raise')
Add tasks from a list or queue.
- Parameters
queue (list of 2-tuple) – The members of each tuple are the arguments (i.e. a list or tuple) and keyword arguments (i.e. a dict) to add().
max_tries (int, optional) – Retry adding elements up to this many times.
fail ('raise' or log level, optional) – Action to take when a computation from queue cannot be added after max_tries.
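The purpose of max_tries can be illustrated with a simplified retry loop. This is a sketch under stated assumptions, not the library's code: queue items are (key, task) pairs rather than (args, kwargs) tuples, every task argument after the callable is assumed to be a key, and add_queue_sketch() is a hypothetical name.

```python
from collections import deque


def add_queue_sketch(graph, queue, max_tries=1):
    """Add (key, task) items to graph, retrying items whose inputs are missing."""
    pending = deque((item, 1) for item in queue)
    while pending:
        (key, task), tries = pending.popleft()
        # Assumption: every argument after the callable is a key
        missing = [dep for dep in task[1:] if dep not in graph]
        if not missing:
            graph[key] = task
        elif tries < max_tries:
            # Move to the back of the queue; an item added in the
            # meantime may provide the missing key
            pending.append(((key, task), tries + 1))
        else:
            raise KeyError(f"{key!r} needs missing key(s) {missing}")
    return list(graph)


# "c" depends on "b", but appears before it in the queue
queue = [("c", (max, "b", "a")), ("b", (abs, "a"))]

graph_ok = {"a": -1}
add_queue_sketch(graph_ok, queue, max_tries=2)

try:
    add_queue_sketch({"a": -1}, queue, max_tries=1)
    failed = False
except KeyError:
    failed = True
```

With two tries, the out-of-order item succeeds on its second attempt; with one try, it fails immediately.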
add_single(key, *computation, strict=False, index=False)
Add a single computation at key.
- Parameters
key (str or Key or hashable) – A string, Key, or other value identifying the output of computation.
computation (object) –
Any dask computation, i.e. one of:
1. any existing key in the Computer.
2. any other literal value or constant.
3. a task, i.e. a tuple with a callable followed by one or more computations.
4. a list containing one or more of #1, #2, and/or #3.
strict (bool, optional) – If True, key must not already exist in the Computer, and any keys referred to by computation must exist.
index (bool, optional) – If True, key is added to the index as a full-resolution key, so it can be later retrieved with full_key().
aggregate(qty, tag, dims_or_groups, weights=None, keep=True, sums=False)
Add a computation that aggregates qty.
- Parameters
qty (Key or str) – Key of the quantity to be aggregated.
tag (str) – Additional string to add to the end of the key for the aggregated quantity.
dims_or_groups (str or iterable of str or dict) – Name(s) of the dimension(s) to sum over, or nested dict.
weights (xarray.DataArray, optional) – Weights for weighted aggregation.
keep (bool, optional) – Passed to computations.aggregate.
- Returns
The key of the newly-added node.
- Return type
check_keys(*keys)
Check that keys are in the Computer.
If any of keys is not in the Computer, KeyError is raised. Otherwise, a list is returned with either the key from keys, or the corresponding full_key().
configure(path=None, **config)
Configure the Computer.
Accepts a path to a configuration file and/or keyword arguments. Configuration keys loaded from file are replaced by keyword arguments.
Valid configuration keys include:
default: the default reporting key; sets default_key.
filters: a dict, passed to set_filters().
files: a list where every element is a dict of keyword arguments to add_file().
alias: a dict mapping aliases to original keys.
- Warns
UserWarning – If config contains unrecognized keys.
default_key = None
The default reporting key.
describe(key=None, quiet=True)
Return a string describing the computations that produce key.
If key is not provided, all keys in the Computer are described.
The string can be printed to the console, if not quiet.
disaggregate(qty, new_dim, method='shares', args=[])
Add a computation that disaggregates qty using method.
- Parameters
qty (hashable) – Key of the quantity to be disaggregated.
new_dim (str) – Name of the new dimension of the disaggregated variable.
method (callable or str) – Disaggregation method. If a callable, then it is applied to qty with any extra args. If a str, then a method named ‘disaggregate_{method}’ is used.
args (list, optional) – Additional arguments to the method. The first element should be the key for a quantity giving shares for disaggregation.
- Returns
The key of the newly-added node.
- Return type
full_key(name_or_key)
Return the full-dimensionality key for name_or_key.
A quantity ‘foo’ with dimensions (a, c, n, q, x) is available in the Computer as 'foo:a-c-n-q-x'. This Key can be retrieved with:
rep.full_key("foo")
rep.full_key("foo:c")
# etc.
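A lookup of this kind can be pictured as an index from quantity names to full-dimensionality key strings. The sketch below is illustrative only: KeyIndex, register() and name_of() are hypothetical names, and the rule that only the name part is used for the lookup is an assumption made here, not a statement about the library.

```python
def name_of(text):
    """Return the name part of a key string such as 'foo:a-b'."""
    return text.partition(":")[0]


class KeyIndex:
    """Hypothetical index from quantity names to full-dimensionality keys."""

    def __init__(self):
        self._full = {}

    def register(self, full_key):
        self._full[name_of(full_key)] = full_key

    def full_key(self, name_or_key):
        # Assumption: only the name is used; any dimensions given are ignored
        return self._full[name_of(name_or_key)]


index = KeyIndex()
index.register("foo:a-c-n-q-x")

from_name = index.full_key("foo")
from_partial = index.full_key("foo:c")
```

Both lookups return the same full key, whether the caller passes a bare name or a partial key.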
get(key=None)
Execute and return the result of the computation key.
Only key and its dependencies are computed.
- Parameters
key (str, optional) – If not provided, default_key is used.
- Raises
ValueError – If key and default_key are both None.
property unit_registry
The pint.UnitRegistry() used by the Computer.
visualize(filename, **kwargs)
Generate an image describing the reporting structure.
This is a shorthand for dask.visualize(). Requires graphviz.
write(key, path)
Write the report key to the file path.
class genno.Key(name, dims=[], tag=None)
A hashable key for a quantity that includes its dimensionality.
Quantities in a Scenario can be indexed by one or more dimensions. A Key refers to a quantity using three components: its name, its dims (dimensions), and an optional tag.
For example, an ixmp parameter with three dimensions can be initialized with:
>>> scenario.init_par('foo', ['a', 'b', 'c'], ['apple', 'bird', 'car'])
Key allows a specific, explicit reference to various forms of “foo”:
in its full resolution, i.e. indexed by a, b, and c:
>>> k1 = Key('foo', ['a', 'b', 'c'])
>>> k1 == 'foo:a-b-c'
True
Notice that a Key has the same hash as, and compares equal (==) to, its str().
in a partial sum over one dimension, e.g. summed along c with dimensions a and b:
>>> k2 = k1.drop('c')
>>> k2 == 'foo:a-b'
True
in a partial sum over multiple dimensions, etc.:
>>> k1.drop('a', 'c') == k2.drop('a') == 'foo:b'
True
Note
Some remarks:
repr(key) prints the Key in angle brackets (‘<>’) to signify it is a Key object.
>>> repr(k1)
<foo:a-b-c>
Keys are immutable: the properties name, dims, and tag are read-only, and the methods append(), drop(), and add_tag() return new Key objects.
Keys may be generated concisely by defining a convenience method:
>>> def foo(dims):
...     return Key('foo', dims.split())
>>> foo('a b c')
foo:a-b-c
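The properties described above (immutable, hashable, equal to its own string form) can be reproduced in a few lines. MiniKey below is a hypothetical, reduced illustration and not genno's Key class: it omits the tag entirely and supports only drop().

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Tuple


@dataclass(frozen=True, eq=False)
class MiniKey:
    """Hypothetical, reduced key: a name plus an ordered tuple of dims."""

    name: str
    dims: Tuple[str, ...] = ()

    def __str__(self):
        return f"{self.name}:{'-'.join(self.dims)}"

    def __repr__(self):
        return f"<{self}>"

    def __hash__(self):
        # Same hash as the string form, so either can be used as a dict key
        return hash(str(self))

    def __eq__(self, other):
        return str(self) == str(other)

    def drop(self, *dims):
        """Return a new MiniKey without the given dimensions."""
        return MiniKey(self.name, tuple(d for d in self.dims if d not in dims))


k1 = MiniKey("foo", ("a", "b", "c"))
k2 = k1.drop("c")
lookup = {"foo:a-b-c": "found"}

try:
    k1.name = "bar"  # Frozen dataclass: attributes are read-only
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because the hash and equality both go through str(), lookup[k1] finds the entry stored under the plain string.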
add_tag(tag)
Return a new Key with tag appended.
append(*dims)
Return a new Key with additional dimensions dims.
drop(*dims)
Return a new Key with dims dropped.
classmethod from_str_or_key(value, drop=[], append=[], tag=None)
Return a new Key from value.
- Parameters
drop (list of str or True, optional) – Existing dimensions of value to drop. See drop().
append (list of str, optional) – New dimensions to append to the returned Key. See append().
tag (str, optional) – Tag for returned Key. If value has a tag, the two are joined using a ‘+’ character. See add_tag().
- Returns
- Return type
iter_sums()
Generate (key, task) for all possible partial sums of the Key.
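"All possible partial sums" means one key for every proper subset of the dimensions. The enumeration can be sketched with itertools; iter_partial_sums() is a hypothetical helper that yields the dropped dimensions in place of a real task, since the form of the task is not described here.

```python
from itertools import combinations


def iter_partial_sums(name, dims):
    """Yield (key string, dropped dims) for every proper subset of dims."""
    for size in range(len(dims) - 1, -1, -1):
        for kept in combinations(dims, size):
            dropped = tuple(d for d in dims if d not in kept)
            yield f"{name}:{'-'.join(kept)}", dropped


sums = dict(iter_partial_sums("foo", ("a", "b", "c")))
```

For three dimensions this gives 2**3 - 1 = 7 partial sums, from 'foo:a-b' (summed over c) down to the total over all dimensions.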
classmethod product(new_name, *keys, tag=None)
Return a new Key that has the union of dimensions on keys.
Dimensions are ordered by their first appearance:
1. First, the dimensions of the first of the keys.
2. Next, any additional dimensions in the second of the keys that were not already added in step 1.
3. etc.
- Parameters
new_name (str) – Name for the new Key. The names of keys are discarded.
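The ordering rule (union of dimensions, ordered by first appearance) can be expressed compactly with dict.fromkeys(), which keeps insertion order and drops repeats. product_dims() is a hypothetical helper operating on plain lists of dimension names, not on Key objects.

```python
def product_dims(*dim_lists):
    """Return the union of dimension names, ordered by first appearance."""
    return list(dict.fromkeys(d for dims in dim_lists for d in dims))


dims = product_dims(["a", "b"], ["b", "c"], ["x", "a"])
```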
genno.Quantity(data, *args, **kwargs)
Convert arguments to the internal Quantity data format.
- Parameters
data – Quantity data.
args – Positional arguments, passed to AttrSeries or SparseDataArray.
kwargs – Keyword arguments, passed to AttrSeries or SparseDataArray.
- Other Parameters
name (str, optional) – Quantity name.
units (str, optional) – Quantity units.
attrs (dict, optional) – Dictionary of attributes; similar to attrs.
The Quantity constructor converts its arguments to an internal, xarray.DataArray-like data format:
# Existing data
data = pd.Series(...)
# Convert to a Quantity for use in reporting calculations
qty = Quantity(data, name="Quantity name", units="kg")
rep.add("new_qty", qty)
Common genno usage, e.g. in message_ix, creates large, sparse data frames (billions of possible elements, but <1% populated); DataArray’s default, ‘dense’ storage format would be too large for available memory.
Currently, Quantity is AttrSeries, a wrapped pandas.Series that behaves like a DataArray.
In the future, genno will use SparseDataArray, and eventually DataArray backed by sparse data, directly.
The goal is that reporting code, including built-in and user computations, can treat quantity arguments as if they were DataArray.
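A rough back-of-the-envelope calculation shows why dense storage is a problem at this scale. The dimension lengths below are hypothetical, chosen only to give one billion possible elements with 1% populated; the sparse estimate assumes a coordinate-list layout with one 64-bit index per dimension for each stored value.

```python
from math import prod

# Hypothetical dimension lengths for a single quantity
shape = {"node": 100, "technology": 1000, "year": 100, "mode": 100}

n_possible = prod(shape.values())  # All combinations of labels
n_populated = n_possible // 100    # Assume 1% of elements hold data

# Dense: one float64 (8 bytes) for every possible element
dense_bytes = n_possible * 8

# Coordinate-list sparse: per stored value, one float64 plus one
# int64 index for each dimension
sparse_bytes = n_populated * (8 + 8 * len(shape))

print(f"dense:  {dense_bytes / 1e9:.1f} GB")
print(f"sparse: {sparse_bytes / 1e9:.1f} GB")
```

Under these assumptions the dense array needs 8 GB for a single quantity, while the sparse layout needs 0.4 GB.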
Computations
Elementary computations for reporting.
Unless otherwise specified, these methods accept and return Quantity objects for data arguments/return values.
Calculations:
add – Sum across multiple quantities.
aggregate(quantity, groups, keep) – Aggregate quantity by groups.
apply_units(qty, units[, quiet]) – Simply apply units to qty.
disaggregate_shares – Disaggregate quantity by shares.
product(*quantities) – Return the product of any number of quantities.
ratio(numerator, denominator) – Return the ratio numerator / denominator.
select(qty, indexers[, inverse]) – Select from qty based on indexers.
sum(quantity[, weights, dimensions]) – Sum quantity over dimensions, with optional weights.
Input and output:
load_file(path[, dims, units]) – Read the file at path and return its contents as a Quantity.
write_report – Write a quantity to a file.
Data manipulation:
concat(*objs, **kwargs) – Concatenate Quantity objs.
genno.computations.aggregate(quantity, groups, keep)
Aggregate quantity by groups.
- Parameters
quantity (Quantity) –
groups (dict of dict) – Top-level keys are the names of dimensions in quantity. Second-level keys are group names; second-level values are lists of labels along the dimension to sum into a group.
keep (bool) – If True, the members that are aggregated into a group are returned with the group sums. If False, they are discarded.
- Returns
Same dimensionality as quantity.
- Return type
Quantity
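The shape of the groups argument and the effect of keep can be shown on plain Python data. In this sketch a quantity is a dict from label tuples to values plus a list of dimension names; aggregate_sketch() is a hypothetical helper, and dropping labels that belong to no group when keep is False is an assumption of the sketch.

```python
def aggregate_sketch(data, dims, groups, keep):
    """Sum labels into groups along each dimension named in groups."""
    for dim, dim_groups in groups.items():
        axis = dims.index(dim)
        # Start from the original entries only if they are to be kept
        result = dict(data) if keep else {}
        for group, members in dim_groups.items():
            for labels, value in data.items():
                if labels[axis] in members:
                    new = labels[:axis] + (group,) + labels[axis + 1:]
                    result[new] = result.get(new, 0) + value
        data = result
    return data


dims = ["n", "t"]
data = {("x", "2020"): 1, ("y", "2020"): 2, ("z", "2020"): 4}
groups = {"n": {"xy": ["x", "y"]}}  # dimension -> group name -> members

kept = aggregate_sketch(data, dims, groups, keep=True)
only_groups = aggregate_sketch(data, dims, groups, keep=False)
```

In both results the label tuples still have one entry per dimension, matching the "same dimensionality" return value.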
genno.computations.apply_units(qty, units, quiet=False)
Simply apply units to qty.
Logs on level WARNING if qty already has existing units.
genno.computations.concat(*objs, **kwargs)
Concatenate Quantity objs.
Any strings included amongst objs are discarded, with a logged warning; these usually indicate that a quantity is referenced which is not in the Computer.
Disaggregate quantity by shares.
genno.computations.load_file(path, dims={}, units=None)
Read the file at path and return its contents as a Quantity.
Some file formats are automatically converted into objects for direct use in reporting code:
.csv: Converted to Quantity. CSV files must have a ‘value’ column; all others are treated as indices, except as given by dims. Lines beginning with ‘#’ are ignored.
- Parameters
path (pathlib.Path) – Path to the file to read.
dims (collections.abc.Collection or collections.abc.Mapping, optional) – If a collection of names, other columns besides these and ‘value’ are discarded. If a mapping, the keys are the column labels in path, and the values are the target dimension names.
units (str or pint.Unit) – Units to apply to the loaded Quantity.
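The CSV rules above (a required ‘value’ column, comment lines, and the two forms of dims) can be illustrated with the standard csv module. read_csv_sketch() is a hypothetical helper that returns plain Python data instead of a Quantity; discarding columns not named in a dims mapping is an assumption made for this sketch.

```python
import csv

CSV_TEXT = """# Comment lines are skipped
region,year,source,value
north,2020,survey,1.5
south,2020,survey,2.5
"""


def read_csv_sketch(text, dims=None):
    """Return (dimension names, {index tuple: value}) from CSV text."""
    lines = [ln for ln in text.splitlines() if not ln.startswith("#")]
    rows = list(csv.DictReader(lines))
    columns = [c for c in rows[0] if c != "value"]
    if isinstance(dims, dict):
        # Mapping: keep the listed columns and rename them
        columns = [c for c in columns if c in dims]
        names = [dims[c] for c in columns]
    elif dims:
        # Collection: keep only the named columns
        columns = names = [c for c in columns if c in dims]
    else:
        # No dims given: every non-value column is an index
        names = columns
    data = {tuple(r[c] for c in columns): float(r["value"]) for r in rows}
    return names, data


names, data = read_csv_sketch(CSV_TEXT, dims={"region": "n", "year": "y"})
all_names, _ = read_csv_sketch(CSV_TEXT)
```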
genno.computations.product(*quantities)
Return the product of any number of quantities.
genno.computations.ratio(numerator, denominator)
Return the ratio numerator / denominator.
- Parameters
numerator (Quantity) –
denominator (Quantity) –
genno.computations.select(qty, indexers, inverse=False)
Select from qty based on indexers.
genno.computations.sum(quantity, weights=None, dimensions=None)
Sum quantity over dimensions, with optional weights.
- Parameters
quantity (Quantity) –
weights (Quantity, optional) – If dimensions is given, weights must have at least these dimensions. Otherwise, any dimensions are valid.
dimensions (list of str, optional) – If not provided, sum over all dimensions. If provided, sum over these dimensions.
Internal format for quantities
genno.core.quantity.assert_quantity(*args)
Assert that each of args is a Quantity object.
- Raises
TypeError – with an indicative message.
class genno.core.attrseries.AttrSeries(data=None, *args, name=None, attrs=None, **kwargs)
pandas.Series subclass imitating xarray.DataArray.
The AttrSeries class provides similar methods and behaviour to xarray.DataArray, so that genno.computations methods can use xarray-like syntax.
- Parameters
units (str or pint.Unit, optional) – Set the units attribute. The value is converted to pint.Unit and added to attrs.
attrs (Mapping, optional) – Set the attrs of the AttrSeries. This attribute was added in pandas 1.0, but is not currently supported by the Series constructor.
align_levels(other)
Work around https://github.com/pandas-dev/pandas/issues/25760.
Return a copy of obj with common levels in the same order as ref.
assign_coords(**kwargs)
Like xarray.DataArray.assign_coords().
property coords
Like xarray.DataArray.coords. Read-only.
property dims
Like xarray.DataArray.dims.
drop(label)
Like xarray.DataArray.drop().
classmethod from_series(series, sparse=None)
Like xarray.DataArray.from_series().
item(*args)
Like xarray.DataArray.item().
rename(new_name_or_name_dict)
Like xarray.DataArray.rename().
sel(indexers=None, drop=False, **indexers_kwargs)
Like xarray.DataArray.sel().
squeeze(dim=None, *args, **kwargs)
Like xarray.DataArray.squeeze().
sum(*args, **kwargs)
Like xarray.DataArray.sum().
to_dataframe()
Like xarray.DataArray.to_dataframe().
to_series()
Like xarray.DataArray.to_series().
transpose(*dims)
Like xarray.DataArray.transpose().
class genno.core.sparsedataarray.SparseAccessor(obj)
xarray accessor to help SparseDataArray.
See the xarray accessor documentation, e.g. register_dataarray_accessor().
convert()
Return a SparseDataArray instance.
property dense
Return a copy with dense (ndarray) data.
property dense_super
Return a proxy to a ndarray-backed DataArray.
class
genno.core.sparsedataarray.
SparseDataArray
(data: Any = <NA>, coords: Optional[Union[Sequence[Tuple], Mapping[Hashable, Any]]] = None, dims: Optional[Union[Hashable, Sequence[Hashable]]] = None, name: Optional[Hashable] = None, attrs: Optional[Mapping] = None, indexes: Optional[Dict[Hashable, pandas.core.indexes.base.Index]] = None, fastpath: bool = False)¶ DataArray
with sparse data.SparseDataArray uses
sparse.COO
for storage withnumpy.nan
as itssparse.COO.fill_value
. Some methods ofDataArray
are overridden to ensure data is in sparse, or dense, format as necessary, to provide expected functionality not currently supported bysparse
, and to avoid exhausting memory for some operations that require dense data.-
equals
(other) → bool¶ True if two SparseDataArrays have the same dims, coords, and values.
Overrides
equals()
for sparse data.
-
classmethod
from_series
(obj, sparse=True)¶ Convert a pandas.Series into a SparseDataArray.
-
property
loc
¶ Attribute for location based indexing like pandas.
Note
This version does not allow assignment, since the underlying sparse array is read-only. To modify the contents, create a copy or perform an operation that returns a new array.
-
sel
(indexers=None, method=None, tolerance=None, drop=False, **indexers_kwargs) → genno.core.sparsedataarray.SparseDataArray¶ Return a new array by selecting labels along the specified dim(s).
Overrides
sel()
to handle >1-D indexers with sparse data.
-
to_dataframe
(name=None)¶ Convert this array and its coords into a
DataFrame
.Overrides
to_dataframe()
.
-
to_series
() → pandas.core.series.Series¶ Convert this array into a
Series
.Overrides
to_series()
to create the series without first converting to a potentially very largenumpy.ndarray
.
Utilities
genno.util.RENAME_DIMS: Dict[str, str] = {}
Dimensions to rename when extracting raw data from Scenario objects. Mapping from Scenario dimension name -> preferred dimension name.
genno.util.REPLACE_UNITS = {'%': 'percent'}
Replacements to apply to quantity units before parsing by pint. Mapping from original unit -> preferred unit.
genno.util.clean_units(input_string)
Tolerate messy strings for units.
Handles two specific cases found in MESSAGEix test cases:
Dimensions enclosed in ‘[]’ have these characters stripped.
The ‘%’ symbol cannot be supported by pint, because it is a Python operator; it is translated to ‘percent’.
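The two cases can be sketched with plain string operations. clean_units_sketch() and REPLACEMENTS are hypothetical names; the mapping mirrors the documented default of REPLACE_UNITS, and removing every bracket character (rather than only an enclosing pair) is a simplification made here.

```python
# Mirrors the documented default value of REPLACE_UNITS
REPLACEMENTS = {"%": "percent"}


def clean_units_sketch(input_string):
    """Strip square brackets and apply the unit replacements."""
    text = input_string.replace("[", "").replace("]", "")
    for original, preferred in REPLACEMENTS.items():
        text = text.replace(original, preferred)
    return text


cleaned = [clean_units_sketch(s) for s in ("[kg]", "%", "[%]", "kg / a")]
```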
genno.util.collect_units(*args)
Return a list of ‘_unit’ attributes for args.
genno.util.dims_for_qty(data)
Return the list of dimensions for data.
If data is a pandas.DataFrame, its columns are processed; otherwise it must be a list.
genno.RENAME_DIMS is used to rename dimensions.
genno.util.filter_concat_args(args)
Filter out str and Key from args.
A warning is logged for each element removed.
genno.util.partial_split(func, kwargs)
Forgiving version of functools.partial().
Returns a partial object and leftover kwargs not applicable to func.
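A "forgiving" partial of this kind can be approximated with inspect.signature(). partial_split_sketch() is a hypothetical re-creation for illustration and makes no claim to match the library's handling of edge cases, such as functions that themselves accept **kwargs.

```python
from functools import partial
from inspect import signature


def partial_split_sketch(func, kwargs):
    """Bind the kwargs that func accepts; return the rest separately."""
    accepted = signature(func).parameters
    used = {k: v for k, v in kwargs.items() if k in accepted}
    extra = {k: v for k, v in kwargs.items() if k not in accepted}
    return partial(func, **used), extra


def scale(x, factor=1):
    return x * factor


op, leftover = partial_split_sketch(scale, {"factor": 3, "units": "kg"})
value = op(2)
```

Where functools.partial() would bind "units" and fail at call time, the sketch hands it back in leftover so the caller can decide what to do with it.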