iris.analysis#
A package providing iris.cube.Cube
analysis support.
This module defines a suite of Aggregator
instances,
which are used to specify the statistical measure to calculate over a
Cube
, using methods such as
aggregated_by()
and collapsed()
.
The Aggregator
is a convenience class that allows
specific statistical aggregation operators to be defined and instantiated.
These operators can then be used to collapse, or partially collapse, one or
more dimensions of a Cube
, as discussed in
Cube Statistics.
In particular, Collapsing Entire Data Dimensions discusses how to use
MEAN
to average over one dimension of a Cube
,
and also how to perform weighted Area Averaging.
While Partially Reducing Data Dimensions shows how to aggregate similar
groups of data points along a single dimension, to result in fewer points
in that dimension.
The gallery contains several interesting worked examples of how an
Aggregator
may be used, including:
In this module:
The Aggregator
class provides common aggregation functionality.
- class iris.analysis.Aggregator(cell_method, call_func, units_func=None, lazy_func=None, **kwargs)[source]#
Create an aggregator for the given
call_func
.Args:
- cell_method (string):
Cell method definition formatter. Used in the fashion “cell_method.format(**kwargs)”, to produce a cell-method string which can include keyword values.
- call_func (callable):
Call signature: (data, axis=None, **kwargs)Data aggregation function. Returns an aggregation result, collapsing the ‘axis’ dimension of the ‘data’ argument.
Kwargs:
- units_func (callable):
Call signature: (units)If provided, called to convert a cube’s units. Returns an
cf_units.Unit
, or a value that can be made into one.
- lazy_func (callable or None):
An alternative to
call_func
implementing a lazy aggregation. Note that, it need not support all features of the main operation, but should raise an error in unhandled cases.Aggregators are used by cube aggregation methods such as
collapsed()
andaggregated_by()
. For example:result = cube.collapsed('longitude', iris.analysis.MEAN)A variety of ready-made aggregators are provided in this module, such as
MEAN
andMAX
. Custom aggregators can also be created for special purposes, see Calculating a Custom Statistic for a worked example.
- aggregate(data, axis, **kwargs)#
Perform the aggregation function given the data.
Keyword arguments are passed through to the data aggregation function (for example, the “percent” keyword for a percentile aggregator). This function is usually used in conjunction with update_metadata(), which should be passed the same keyword arguments.
Args:
- data (array):
Data array.
- axis (int):
Axis to aggregate over.
Kwargs:
- mdtol (float):
Tolerance of missing data. The value returned will be masked if the fraction of data to missing data is less than or equal to mdtol. mdtol=0 means no missing data is tolerated while mdtol=1 will return the resulting value from the aggregation function. Defaults to 1.
- kwargs:
All keyword arguments apart from those specified above, are passed through to the data aggregation function.
- Returns
The aggregated data.
- aggregate_shape(**kwargs)#
The shape of the new dimension/s created by the aggregator.
Kwargs:
This function is intended to be used in conjunction with aggregate() and should be passed the same keywords.
- Returns
A tuple of the aggregate shape.
- lazy_aggregate(data, axis, **kwargs)#
Perform aggregation over the data with a lazy operation, analogous to the ‘aggregate’ result.
Keyword arguments are passed through to the data aggregation function (for example, the “percent” keyword for a percentile aggregator). This function is usually used in conjunction with update_metadata(), which should be passed the same keyword arguments.
Args:
- data (array):
A lazy array (
dask.array.Array
).
- axis (int or list of int):
The dimensions to aggregate over – note that this is defined differently to the ‘aggregate’ method ‘axis’ argument, which only accepts a single dimension index.
Kwargs:
- kwargs:
All keyword arguments are passed through to the data aggregation function.
- Returns
A lazy array representing the aggregation operation (
dask.array.Array
).
- name()#
Returns the name of the aggregator.
- post_process(collapsed_cube, data_result, coords, **kwargs)#
Process the result from
iris.analysis.Aggregator.aggregate()
.Args:
- collapsed_cube:
- data_result:
Result from
iris.analysis.Aggregator.aggregate()
- coords:
The one or more coordinates that were aggregated over.
Kwargs:
This function is intended to be used in conjunction with aggregate() and should be passed the same keywords (for example, the “ddof” keyword from a standard deviation aggregator).
- Returns
The collapsed cube with its aggregated data payload.
- update_metadata(cube, coords, **kwargs)[source]#
Update cube cell method metadata w.r.t the aggregation function.
Args:
- cube (
iris.cube.Cube
):Source cube that requires metadata update.
- coords (
iris.coords.Coord
):The one or more coordinates that were aggregated.
Kwargs:
This function is intended to be used in conjunction with aggregate() and should be passed the same keywords (for example, the “ddof” keyword for a standard deviation aggregator).
- call_func#
Data aggregation function.
- cell_method#
Cube cell method string.
- lazy_func#
Lazy aggregation function, may be None to indicate that a lazy operation is not available.
- units_func#
Unit conversion function.
This class describes an area-weighted regridding scheme for regridding
between ‘ordinary’ horizontal grids with separated X and Y coordinates in a
common coordinate system.
Typically for use with iris.cube.Cube.regrid()
.
- class iris.analysis.AreaWeighted(mdtol=1)[source]#
Area-weighted regridding scheme suitable for regridding between different orthogonal XY grids in the same coordinate system.
Kwargs:
- mdtol (float):
Tolerance of missing data. The value returned in each element of the returned array will be masked if the fraction of missing data exceeds mdtol. This fraction is calculated based on the area of masked cells within each target cell. mdtol=0 means no masked data is tolerated while mdtol=1 will mean the resulting element will be masked if and only if all the overlapping elements of the source grid are masked. Defaults to 1.
- regridder(src_grid_cube, target_grid_cube)[source]#
Creates an area-weighted regridder to perform regridding from the source grid to the target grid.
Typically you should use
iris.cube.Cube.regrid()
for regridding a cube. There are, however, some situations when constructing your own regridder is preferable. These are detailed in the user guide.Supports lazy regridding. Any chunks in horizontal dimensions will be combined before regridding.
Args:
- src_grid_cube:
The
Cube
defining the source grid.
- target_grid_cube:
The
Cube
defining the target grid.
- Returns
callable(cube)
where cube is a cube with the same grid as src_grid_cube that is to be regridded to the grid of target_grid_cube.
- Return type
A callable with the interface
- iris.analysis.COUNT Aggregator instance. #
An
Aggregator
instance that counts the number ofCube
data occurrences that satisfy a particular criterion, as defined by a user supplied function.Required kwargs associated with the use of this aggregator:
- function (callable):
A function which converts an array of data values into a corresponding array of True/False values.
For example:
To compute the number of ensemble members with precipitation exceeding 10 (in cube data units) could be calculated with:
result = precip_cube.collapsed('ensemble_member', iris.analysis.COUNT, function=lambda values: values > 10)
See also
The
PROPORTION()
aggregator.This aggregator handles masked data.
- iris.analysis.GMEAN Aggregator instance. #
An
Aggregator
instance that calculates the geometric mean over aCube
, as computed byscipy.stats.mstats.gmean()
.For example:
To compute zonal geometric means over the longitude axis of a cube:
result = cube.collapsed('longitude', iris.analysis.GMEAN)
This aggregator handles masked data.
- iris.analysis.HMEAN Aggregator instance. #
An
Aggregator
instance that calculates the harmonic mean over aCube
, as computed byscipy.stats.mstats.hmean()
.For example:
To compute zonal harmonic mean over the longitude axis of a cube:
result = cube.collapsed('longitude', iris.analysis.HMEAN)
Note
The harmonic mean is only valid if all data values are greater than zero.
This aggregator handles masked data.
This class describes the linear interpolation and regridding scheme for
interpolating or regridding over one or more orthogonal coordinates,
typically for use with iris.cube.Cube.interpolate()
or
iris.cube.Cube.regrid()
.
- class iris.analysis.Linear(extrapolation_mode='linear')[source]#
Linear interpolation and regridding scheme suitable for interpolating or regridding over one or more orthogonal coordinates.
Kwargs:
- extrapolation_mode:
Must be one of the following strings:
‘extrapolate’ or ‘linear’ - The extrapolation points will be calculated by extending the gradient of the closest two points.
‘nan’ - The extrapolation points will be be set to NaN.
‘error’ - A ValueError exception will be raised, notifying an attempt to extrapolate.
‘mask’ - The extrapolation points will always be masked, even if the source data is not a MaskedArray.
‘nanmask’ - If the source data is a MaskedArray the extrapolation points will be masked. Otherwise they will be set to NaN.
The default mode of extrapolation is ‘linear’.
- interpolator(cube, coords)[source]#
Creates a linear interpolator to perform interpolation over the given
Cube
specified by the dimensions of the given coordinates.Typically you should use
iris.cube.Cube.interpolate()
for interpolating a cube. There are, however, some situations when constructing your own interpolator is preferable. These are detailed in the user guide.Args:
- cube:
The source
iris.cube.Cube
to be interpolated.
- coords:
The names or coordinate instances that are to be interpolated over.
- Returns
callable(sample_points, collapse_scalar=True)
where sample_points is a sequence containing an array of values for each of the coordinates passed to this method, and collapse_scalar determines whether to remove length one dimensions in the result cube caused by scalar values in sample_points.
The values for coordinates that correspond to date/times may optionally be supplied as datetime.datetime or cftime.datetime instances.
For example, for the callable returned by: Linear().interpolator(cube, [‘latitude’, ‘longitude’]), sample_points must have the form [new_lat_values, new_lon_values].
- Return type
A callable with the interface
- regridder(src_grid, target_grid)[source]#
Creates a linear regridder to perform regridding from the source grid to the target grid.
Typically you should use
iris.cube.Cube.regrid()
for regridding a cube. There are, however, some situations when constructing your own regridder is preferable. These are detailed in the user guide.Supports lazy regridding. Any chunks in horizontal dimensions will be combined before regridding.
Args:
- Returns
callable(cube)
where cube is a cube with the same grid as src_grid that is to be regridded to the target_grid.
- Return type
A callable with the interface
- LINEAR_EXTRAPOLATION_MODES = ['extrapolate', 'error', 'nan', 'mask', 'nanmask', 'linear']#
- iris.analysis.MAX Aggregator instance. #
An
Aggregator
instance that calculates the maximum over aCube
, as computed bynumpy.ma.max()
.For example:
To compute zonal maximums over the longitude axis of a cube:
result = cube.collapsed('longitude', iris.analysis.MAX)
This aggregator handles masked data.
- iris.analysis.MAX_RUN Aggregator instance. #
An
Aggregator
instance that finds the longest run ofCube
data occurrences that satisfy a particular criterion, as defined by a user supplied function, along the given axis.Required kwargs associated with the use of this aggregator:
- function (callable):
A function which converts an array of data values into a corresponding array of True/False values.
For example:
The longest run of days with precipitation exceeding 10 (in cube data units) at each grid location could be calculated with:
result = precip_cube.collapsed('time', iris.analysis.MAX_RUN, function=lambda values: values > 10)
This aggregator handles masked data, which it treats as interrupting a run.
- iris.analysis.MEAN WeightedAggregator instance. #
An
Aggregator
instance that calculates the mean over aCube
, as computed bynumpy.ma.average()
.Additional kwargs associated with the use of this aggregator:
- mdtol (float):
Tolerance of missing data. The value returned in each element of the returned array will be masked if the fraction of masked data contributing to that element exceeds mdtol. This fraction is calculated based on the number of masked elements. mdtol=0 means no missing data is tolerated while mdtol=1 means the resulting element will be masked if and only if all the contributing elements are masked. Defaults to 1.
- weights (float ndarray):
Weights matching the shape of the cube or the length of the window for rolling window operations. Note that, latitude/longitude area weights can be calculated using
iris.analysis.cartography.area_weights()
.
- returned (boolean):
Set this to True to indicate that the collapsed weights are to be returned along with the collapsed data. Defaults to False.
For example:
To compute zonal means over the longitude axis of a cube:
result = cube.collapsed('longitude', iris.analysis.MEAN)
To compute a weighted area average:
coords = ('longitude', 'latitude') collapsed_cube, collapsed_weights = cube.collapsed(coords, iris.analysis.MEAN, weights=weights, returned=True)
Note
Lazy operation is supported, via
dask.array.ma.average()
.This aggregator handles masked data.
- iris.analysis.MEDIAN Aggregator instance. #
An
Aggregator
instance that calculates the median over aCube
, as computed bynumpy.ma.median()
.For example:
To compute zonal medians over the longitude axis of a cube:
result = cube.collapsed('longitude', iris.analysis.MEDIAN)
This aggregator handles masked data.
- iris.analysis.MIN Aggregator instance. #
An
Aggregator
instance that calculates the minimum over aCube
, as computed bynumpy.ma.min()
.For example:
To compute zonal minimums over the longitude axis of a cube:
result = cube.collapsed('longitude', iris.analysis.MIN)
This aggregator handles masked data.
This class describes the nearest-neighbour interpolation and regridding
scheme for interpolating or regridding over one or more orthogonal
coordinates, typically for use with iris.cube.Cube.interpolate()
or iris.cube.Cube.regrid()
.
- class iris.analysis.Nearest(extrapolation_mode='extrapolate')[source]#
Nearest-neighbour interpolation and regridding scheme suitable for interpolating or regridding over one or more orthogonal coordinates.
Kwargs:
- extrapolation_mode:
Must be one of the following strings:
‘extrapolate’ - The extrapolation points will take their value from the nearest source point.
‘nan’ - The extrapolation points will be be set to NaN.
‘error’ - A ValueError exception will be raised, notifying an attempt to extrapolate.
‘mask’ - The extrapolation points will always be masked, even if the source data is not a MaskedArray.
‘nanmask’ - If the source data is a MaskedArray the extrapolation points will be masked. Otherwise they will be set to NaN.
The default mode of extrapolation is ‘extrapolate’.
- interpolator(cube, coords)[source]#
Creates a nearest-neighbour interpolator to perform interpolation over the given
Cube
specified by the dimensions of the specified coordinates.Typically you should use
iris.cube.Cube.interpolate()
for interpolating a cube. There are, however, some situations when constructing your own interpolator is preferable. These are detailed in the user guide.Args:
- cube:
The source
iris.cube.Cube
to be interpolated.
- coords:
The names or coordinate instances that are to be interpolated over.
- Returns
callable(sample_points, collapse_scalar=True)
where sample_points is a sequence containing an array of values for each of the coordinates passed to this method, and collapse_scalar determines whether to remove length one dimensions in the result cube caused by scalar values in sample_points.
The values for coordinates that correspond to date/times may optionally be supplied as datetime.datetime or cftime.datetime instances.
For example, for the callable returned by: Nearest().interpolator(cube, [‘latitude’, ‘longitude’]), sample_points must have the form [new_lat_values, new_lon_values].
- Return type
A callable with the interface
- regridder(src_grid, target_grid)[source]#
Creates a nearest-neighbour regridder to perform regridding from the source grid to the target grid.
Typically you should use
iris.cube.Cube.regrid()
for regridding a cube. There are, however, some situations when constructing your own regridder is preferable. These are detailed in the user guide.Supports lazy regridding. Any chunks in horizontal dimensions will be combined before regridding.
Args:
- Returns
callable(cube)
where cube is a cube with the same grid as src_grid that is to be regridded to the target_grid.
- Return type
A callable with the interface
- iris.analysis.PEAK Aggregator instance. #
An
Aggregator
instance that calculates the peak value derived from a spline interpolation over aCube
.The peak calculation takes into account nan values. Therefore, if the number of non-nan values is zero the result itself will be an array of nan values.
The peak calculation also takes into account masked values. Therefore, if the number of non-masked values is zero the result itself will be a masked array.
If multiple coordinates are specified, then the peak calculations are performed individually, in sequence, for each coordinate specified.
For example:
To compute the peak over the time axis of a cube:
result = cube.collapsed('time', iris.analysis.PEAK)
This aggregator handles masked data.
- iris.analysis.PERCENTILE#
A
PercentileAggregator
instance that calculates the percentile over aCube
, as computed byscipy.stats.mstats.mquantiles()
(default) ornumpy.percentile()
(if fast_percentile_method is True).Required kwargs associated with the use of this aggregator:
- percent (float or sequence of floats):
Percentile rank/s at which to extract value/s.
Additional kwargs associated with the use of this aggregator:
- alphap (float):
Plotting positions parameter, see
scipy.stats.mstats.mquantiles()
. Defaults to 1.
- betap (float):
Plotting positions parameter, see
scipy.stats.mstats.mquantiles()
. Defaults to 1.
- fast_percentile_method (boolean):
When set to True, uses
numpy.percentile()
method as a faster alternative to thescipy.stats.mstats.mquantiles()
method. alphap and betap are ignored. An exception is raised if the data are masked. Defaults to False.
For example:
To compute the 10th and 90th percentile over time:
result = cube.collapsed('time', iris.analysis.PERCENTILE, percent=[10, 90])
This aggregator handles masked data and lazy data.
Note
Performance of this aggregator on lazy data is particularly sensitive to the dask array chunking, so it may be useful to test with various chunk sizes for a given application. Any chunking along the dimensions to be aggregated is removed by the aggregator prior to calculating the percentiles.
- iris.analysis.PROPORTION Aggregator instance. #
An
Aggregator
instance that calculates the proportion, as a fraction, ofCube
data occurrences that satisfy a particular criterion, as defined by a user supplied function.Required kwargs associated with the use of this aggregator:
- function (callable):
A function which converts an array of data values into a corresponding array of True/False values.
For example:
To compute the probability of precipitation exceeding 10 (in cube data units) across ensemble members could be calculated with:
result = precip_cube.collapsed('ensemble_member', iris.analysis.PROPORTION, function=lambda values: values > 10)
Similarly, the proportion of time precipitation exceeded 10 (in cube data units) could be calculated with:
result = precip_cube.collapsed('time', iris.analysis.PROPORTION, function=lambda values: values > 10)
See also
The
COUNT()
aggregator.This aggregator handles masked data.
This class describes the point-in-cell regridding scheme for use
typically with iris.cube.Cube.regrid()
.
Each result datapoint is an average over all source points that fall inside that (bounded) target cell.
The PointInCell regridder can regrid data from a source grid of any dimensionality and in any coordinate system. The location of each source point is specified by X and Y coordinates mapped over the same cube dimensions, aka “grid dimensions” : the grid may have any dimensionality. The X and Y coordinates must also have the same, defined coord_system. The weights, if specified, must have the same shape as the X and Y coordinates. The output grid can be any ‘normal’ XY grid, specified by separate X and Y coordinates : That is, X and Y have two different cube dimensions. The output X and Y coordinates must also have a common, specified coord_system.
- class iris.analysis.PointInCell(weights=None)[source]#
Point-in-cell regridding scheme suitable for regridding from a source cube with X and Y coordinates all on the same dimensions, to a target cube with bounded X and Y coordinates on separate X and Y dimensions.
Each result datapoint is an average over all source points that fall inside that (bounded) target cell.
Optional Args:
- weights:
A
numpy.ndarray
instance that defines the weights for the grid cells of the source grid. Must have the same shape as the data of the source grid. If unspecified, equal weighting is assumed.
- regridder(src_grid, target_grid)[source]#
Creates a point-in-cell regridder to perform regridding from the source grid to the target grid.
Typically you should use
iris.cube.Cube.regrid()
for regridding a cube. There are, however, some situations when constructing your own regridder is preferable. These are detailed in the user guide.Does not support lazy regridding.
Args:
- Returns
callable(cube)
where cube is a cube with the same grid as src_grid that is to be regridded to the target_grid.
- Return type
A callable with the interface
- iris.analysis.RMS WeightedAggregator instance. #
An
Aggregator
instance that calculates the root mean square over aCube
, as computed by ((x0**2 + x1**2 + … + xN-1**2) / N) ** 0.5.Additional kwargs associated with the use of this aggregator:
- weights (float ndarray):
Weights matching the shape of the cube or the length of the window for rolling window operations. The weights are applied to the squares when taking the mean.
For example:
To compute the zonal root mean square over the longitude axis of a cube:
result = cube.collapsed('longitude', iris.analysis.RMS)
This aggregator handles masked data.
- iris.analysis.STD_DEV Aggregator instance. #
An
Aggregator
instance that calculates the standard deviation over aCube
, as computed bynumpy.ma.std()
.Additional kwargs associated with the use of this aggregator:
- ddof (integer):
Delta degrees of freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. Defaults to 1.
For example:
To compute zonal standard deviations over the longitude axis of a cube:
result = cube.collapsed('longitude', iris.analysis.STD_DEV)
To obtain the biased standard deviation:
result = cube.collapsed('longitude', iris.analysis.STD_DEV, ddof=0)
Note
Lazy operation is supported, via
dask.array.nanstd()
.This aggregator handles masked data.
- iris.analysis.SUM WeightedAggregator instance. #
An
Aggregator
instance that calculates the sum over aCube
, as computed bynumpy.ma.sum()
.Additional kwargs associated with the use of this aggregator:
- weights (float ndarray):
Weights matching the shape of the cube, or the length of the window for rolling window operations. Weights should be normalized before using them with this aggregator if scaling is not intended.
- returned (boolean):
Set this to True to indicate the collapsed weights are to be returned along with the collapsed data. Defaults to False.
For example:
To compute an accumulation over the time axis of a cube:
result = cube.collapsed('time', iris.analysis.SUM)
To compute a weighted rolling sum e.g. to apply a digital filter:
weights = np.array([.1, .2, .4, .2, .1]) result = cube.rolling_window('time', iris.analysis.SUM, len(weights), weights=weights)
This aggregator handles masked data.
This is a nearest-neighbour regridding scheme for regridding data whose horizontal (X- and Y-axis) coordinates are mapped to the same dimensions, rather than being orthogonal on independent dimensions.
For latitude-longitude coordinates, the nearest-neighbour distances are computed on the sphere, otherwise flat Euclidean distances are used.
The source X and Y coordinates can have any shape.
The target grid must be of the “normal” kind, i.e. it has separate, 1-dimensional X and Y coordinates.
Source and target XY coordinates must have the same coordinate system, which may also be None. If any of the XY coordinates are latitudes or longitudes, then they all must be. Otherwise, the corresponding X and Y coordinates must have the same units in the source and grid cubes.
Note
Currently only supports regridding, not interpolation.
- class iris.analysis.UnstructuredNearest[source]#
Nearest-neighbour interpolation and regridding scheme suitable for interpolating or regridding from un-gridded data such as trajectories or other data where the X and Y coordinates share the same dimensions.
- regridder(src_cube, target_grid)[source]#
Creates a nearest-neighbour regridder, of the
UnstructuredNearestNeigbourRegridder
type, to perform regridding from the source grid to the target grid.This can then be applied to any source data with the same structure as the original ‘src_cube’.
Typically you should use
iris.cube.Cube.regrid()
for regridding a cube. There are, however, some situations when constructing your own regridder is preferable. These are detailed in the user guide.Does not support lazy regridding.
Args:
- src_cube:
The
Cube
defining the source grid. The X and Y coordinates can have any shape, but must be mapped over the same cube dimensions.
- target_grid:
The
Cube
defining the target grid. The X and Y coordinates must be one-dimensional dimension coordinates, mapped to different dimensions. All other cube components are ignored.
- Returns
callable(cube)
where cube is a cube with the same grid as src_cube that is to be regridded to the target_grid.
- Return type
A callable with the interface
- iris.analysis.VARIANCE Aggregator instance. #
An
Aggregator
instance that calculates the variance over aCube
, as computed bynumpy.ma.var()
.Additional kwargs associated with the use of this aggregator:
- ddof (integer):
Delta degrees of freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. Defaults to 1.
For example:
To compute zonal variance over the longitude axis of a cube:
result = cube.collapsed('longitude', iris.analysis.VARIANCE)
To obtain the biased variance:
result = cube.collapsed('longitude', iris.analysis.VARIANCE, ddof=0)
Note
Lazy operation is supported, via
dask.array.nanvar()
.This aggregator handles masked data.
- iris.analysis.WPERCENTILE#
An
WeightedPercentileAggregator
instance that calculates the weighted percentile over aCube
.Required kwargs associated with the use of this aggregator:
- percent (float or sequence of floats):
Percentile rank/s at which to extract value/s.
- weights (float ndarray):
Weights matching the shape of the cube or the length of the window for rolling window operations. Note that, latitude/longitude area weights can be calculated using
iris.analysis.cartography.area_weights()
.
Additional kwargs associated with the use of this aggregator:
- returned (boolean):
Set this to True to indicate that the collapsed weights are to be returned along with the collapsed data. Defaults to False.
- kind (string or int):
Specifies the kind of interpolation used, see
scipy.interpolate.interp1d()
Defaults to “linear”, which is equivalent to alphap=0.5, betap=0.5 in iris.analysis.PERCENTILE
Convenience class that supports common weighted aggregation functionality.
- class iris.analysis.WeightedAggregator(cell_method, call_func, units_func=None, lazy_func=None, **kwargs)[source]#
Create a weighted aggregator for the given
call_func
.Args:
- cell_method (string):
Cell method string that supports string format substitution.
- call_func (callable):
Data aggregation function. Call signature (data, axis, **kwargs).
Kwargs:
- units_func (callable):
Units conversion function.
- lazy_func (callable or None):
An alternative to
call_func
implementing a lazy aggregation. Note that, it need not support all features of the main operation, but should raise an error in unhandled cases.
- aggregate(data, axis, **kwargs)#
Perform the aggregation function given the data.
Keyword arguments are passed through to the data aggregation function (for example, the “percent” keyword for a percentile aggregator). This function is usually used in conjunction with update_metadata(), which should be passed the same keyword arguments.
Args:
- data (array):
Data array.
- axis (int):
Axis to aggregate over.
Kwargs:
- mdtol (float):
Tolerance of missing data. The value returned will be masked if the fraction of data to missing data is less than or equal to mdtol. mdtol=0 means no missing data is tolerated while mdtol=1 will return the resulting value from the aggregation function. Defaults to 1.
- kwargs:
All keyword arguments apart from those specified above, are passed through to the data aggregation function.
- Returns
The aggregated data.
- aggregate_shape(**kwargs)#
The shape of the new dimension/s created by the aggregator.
Kwargs:
This function is intended to be used in conjunction with aggregate() and should be passed the same keywords.
- Returns
A tuple of the aggregate shape.
- lazy_aggregate(data, axis, **kwargs)#
Perform aggregation over the data with a lazy operation, analogous to the ‘aggregate’ result.
Keyword arguments are passed through to the data aggregation function (for example, the “percent” keyword for a percentile aggregator). This function is usually used in conjunction with update_metadata(), which should be passed the same keyword arguments.
Args:
- data (array):
A lazy array (
dask.array.Array
).
- axis (int or list of int):
The dimensions to aggregate over – note that this is defined differently to the ‘aggregate’ method ‘axis’ argument, which only accepts a single dimension index.
Kwargs:
- kwargs:
All keyword arguments are passed through to the data aggregation function.
- Returns
A lazy array representing the aggregation operation (
dask.array.Array
).
- name()#
Returns the name of the aggregator.
- post_process(collapsed_cube, data_result, coords, **kwargs)[source]#
Process the result from
iris.analysis.Aggregator.aggregate()
.Returns a tuple(cube, weights) if a tuple(data, weights) was returned from
iris.analysis.Aggregator.aggregate()
.Args:
- collapsed_cube:
- data_result:
Result from
iris.analysis.Aggregator.aggregate()
- coords:
The one or more coordinates that were aggregated over.
Kwargs:
This function is intended to be used in conjunction with aggregate() and should be passed the same keywords (for example, the “weights” keywords from a mean aggregator).
- Returns
The collapsed cube with it’s aggregated data payload. Or a tuple pair of (cube, weights) if the keyword “returned” is specified and True.
- update_metadata(cube, coords, **kwargs)#
Update cube cell method metadata w.r.t the aggregation function.
Args:
- cube (
iris.cube.Cube
):Source cube that requires metadata update.
- coords (
iris.coords.Coord
):The one or more coordinates that were aggregated.
Kwargs:
This function is intended to be used in conjunction with aggregate() and should be passed the same keywords (for example, the “ddof” keyword for a standard deviation aggregator).
- uses_weighting(**kwargs)[source]#
Determine whether this aggregator uses weighting.
Kwargs:
- kwargs:
Arguments to filter of weighted keywords.
- Returns
Boolean.
- call_func#
Data aggregation function.
- cell_method#
Cube cell method string.
- lazy_func#
Lazy aggregation function, may be None to indicate that a lazy operation is not available.
- units_func#
Unit conversion function.
- iris.analysis.clear_phenomenon_identity(cube)[source]#
Helper function to clear the standard_name, attributes, and cell_methods of a cube.
- iris.analysis.create_weighted_aggregator_fn(aggregator_fn, axis, **kwargs)[source]#
Return an aggregator function that can explicitely handle weights.
Args:
- aggregator_fn (callable):
An aggregator function, i.e., a callable that takes arguments
data
,axis
and**kwargs
and returns an array. Examples:Aggregator.aggregate()
,Aggregator.lazy_aggregate()
. This function should accept the keyword argumentweights
.
- axis (int):
Axis to aggregate over. This argument is directly passed to
aggregator_fn
.
Kwargs:
Arbitrary keyword arguments passed to
aggregator_fn
. Should not includeweights
(this will be removed if present).
- Returns
A function that takes two arguments
data_arr
andweights
(both should be an array of the same shape) and returns an array.