iris.fileformats.netcdf#

Support loading and saving NetCDF files using CF conventions for metadata interpretation.

See: NetCDF User’s Guide and netCDF4 Python module.

Also: CF Conventions.

class iris.fileformats.netcdf.CFNameCoordMap[source]#

Bases: object

Provide a simple CF name to CF coordinate mapping.

append(name, coord)[source]#

Append the given name and coordinate pair to the mapping.

Parameters:
  • name – CF name of the associated coordinate.

  • coord – The coordinate of the associated CF name.

Return type:

None.

coord(name)[source]#

Return the coordinate, given a CF name, or None if not recognised.

Parameters:

name – CF name of the associated coordinate.

Return type:

Coordinate or None.

property coords#

Return all the coordinates.

name(coord)[source]#

Return the CF name, given a coordinate, or None if not recognised.

Parameters:

coord – The coordinate of the associated CF name.

Return type:

CF name or None.

property names#

Return all the CF names.
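
The behaviour described above can be pictured with a small stand-in class. This is only a sketch: NameCoordMapSketch is a hypothetical name, the list-of-pairs storage and the use of == for matching are assumptions, and plain strings stand in for real coordinate objects.

```python
class NameCoordMapSketch:
    """Hypothetical stand-in mirroring the documented CFNameCoordMap methods."""

    def __init__(self):
        # Assumption: pairs are kept in insertion order.
        self._pairs = []

    def append(self, name, coord):
        """Record a (CF name, coordinate) pair."""
        self._pairs.append((name, coord))

    @property
    def names(self):
        """All CF names, in insertion order."""
        return [name for name, _ in self._pairs]

    @property
    def coords(self):
        """All coordinates, in insertion order."""
        return [coord for _, coord in self._pairs]

    def name(self, coord):
        """Return the CF name for a coordinate, or None if not recognised."""
        for known_name, known_coord in self._pairs:
            if known_coord == coord:
                return known_name
        return None

    def coord(self, name):
        """Return the coordinate for a CF name, or None if not recognised."""
        for known_name, known_coord in self._pairs:
            if known_name == name:
                return known_coord
        return None


mapping = NameCoordMapSketch()
mapping.append("lat", "latitude coordinate")   # strings stand in for coords
mapping.append("lon", "longitude coordinate")
```

Lookups work in both directions, and an unknown key yields None rather than raising.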

class iris.fileformats.netcdf.NetCDFDataProxy(shape, dtype, path, variable_name, fill_value)[source]#

Bases: object

A reference to the data payload of a single NetCDF file variable.

property dask_meta#
dtype#
fill_value#
property ndim#
path#
shape#
variable_name#
class iris.fileformats.netcdf.Saver(filename, netcdf_format, compute=True)[source]#

Bases: object

A manager for saving netCDF files.

Parameters:
  • filename (str or netCDF4.Dataset) – Name of the netCDF file to save the cube, or a writeable object supporting the netCDF4.Dataset API.

  • netcdf_format (str) – Underlying netCDF file format, one of ‘NETCDF4’, ‘NETCDF4_CLASSIC’, ‘NETCDF3_CLASSIC’ or ‘NETCDF3_64BIT’. Default is ‘NETCDF4’ format.

  • compute (bool, default=True) –

    If True, delayed variable saves will be completed on exit from the Saver context (after first closing the target file), equivalent to complete().

    If False, the file is created and closed without writing the data of variables for which the source data was lazy. These writes can be completed later, see delayed_completion().

    Note

    If filename is an open dataset, rather than a filepath, then the caller must specify compute=False, close the dataset, and complete delayed saving afterwards. If compute is True in this case, an error is raised. This is because lazy content must be written by delayed save operations, which will only succeed if the dataset can be (re-)opened for writing. See save().

Return type:

None

Example

>>> import iris
>>> from iris.fileformats.netcdf.saver import Saver
>>> cubes = iris.load(iris.sample_data_path('atlantic_profiles.nc'))
>>> with Saver("tmp.nc", "NETCDF4") as sman:
...     # Iterate through the cubelist.
...     for cube in cubes:
...         sman.write(cube)
__exit__(type, value, traceback)[source]#

Flush any buffered data to the CF-netCDF file before closing.

static cf_valid_var_name(var_name)[source]#

Return a valid CF var_name given a potentially invalid name.

Parameters:

var_name (str) – The var_name to normalise.

Returns:

The var_name suitable for passing through for variable creation.

Return type:

str

static check_attribute_compliance(container, data_dtype)[source]#

Check attribute compliance.

complete()[source]#

Complete file by computing any delayed variable saves.

This requires that the Saver has closed the dataset (exited its context).

Return type:

None

compute#

Whether to complete delayed saves on exit.

delayed_completion()[source]#

Perform file completion for delayed saves.

Create and return a dask.delayed.Delayed to perform file completion for delayed saves.

Return type:

dask.delayed.Delayed

Notes

The dataset must be closed (saver has exited its context) before the result can be computed, otherwise computation will hang (never return).

file_write_lock#

A per-file write lock to prevent dask attempting overlapping writes.

filepath#

Target filepath.

update_global_attributes(attributes=None, **kwargs)[source]#

Update the CF global attributes.

Update the CF global attributes based on the provided iterable/dictionary and/or keyword arguments.

Parameters:

attributes (dict or iterable of key, value pairs, optional) – CF global attributes to be updated.

write(cube, local_keys=None, unlimited_dimensions=None, zlib=False, complevel=4, shuffle=True, fletcher32=False, contiguous=False, chunksizes=None, endian='native', least_significant_digit=None, packing=None, fill_value=None)[source]#

Wrapper for saving cubes to a NetCDF file.

Parameters:
  • cube (iris.cube.Cube) – An iris.cube.Cube to be saved to a netCDF file.

  • local_keys (iterable of str, optional) –

    An iterable of cube attribute keys. Any cube attributes with matching keys will become attributes on the data variable rather than global attributes.

    Note

    Has no effect if iris.FUTURE.save_split_attrs is True.

  • unlimited_dimensions (iterable of str and/or iris.coords.Coord, optional) – List of coordinate names (or coordinate objects) corresponding to coordinate dimensions of cube to save with the NetCDF dimension variable length ‘UNLIMITED’. By default, no unlimited dimensions are saved. Only the ‘NETCDF4’ format supports multiple ‘UNLIMITED’ dimensions.

  • zlib (bool, default=False) – If True, the data will be compressed in the netCDF file using gzip compression (default False).

  • complevel (int, default=4) – An integer between 1 and 9 describing the level of compression desired (default 4). Ignored if zlib=False.

  • shuffle (bool, default=True) – If True, the HDF5 shuffle filter will be applied before compressing the data (default True). This significantly improves compression. Ignored if zlib=False.

  • fletcher32 (bool, default=False) – If True, the Fletcher32 HDF5 checksum algorithm is activated to detect errors. Default False.

  • contiguous (bool, default=False) – If True, the variable data is stored contiguously on disk. Default False. Setting to True for a variable with an unlimited dimension will trigger an error.

  • chunksizes (tuple of int, optional) – Used to manually specify the HDF5 chunksizes for each dimension of the variable. A detailed discussion of HDF chunking and I/O performance is available here. Basically, you want the chunk size for each dimension to match as closely as possible the size of the data block that users will read from the file. chunksizes cannot be set if contiguous=True.

  • endian (str, default="native") – Used to control whether the data is stored in little or big endian format on disk. Possible values are ‘little’, ‘big’ or ‘native’ (default). The library will automatically handle endian conversions when the data is read, but if the data is always going to be read on a computer with the opposite format as the one used to create the file, there may be some performance advantage to be gained by setting the endian-ness.

  • least_significant_digit (int, optional) – If least_significant_digit is specified, variable data will be truncated (quantized). In conjunction with zlib=True this produces ‘lossy’, but significantly more efficient compression. For example, if least_significant_digit=1, data will be quantized using numpy.around(scale*data)/scale, where scale = 2**bits, and bits is determined so that a precision of 0.1 is retained (in this case bits=4). From here: “least_significant_digit – power of ten of the smallest decimal place in unpacked data that is a reliable value”. Default is None, or no quantization, or ‘lossless’ compression.

  • packing (type or str or dict or list, optional) – A numpy integer datatype (signed or unsigned), a string that describes a numpy integer dtype (e.g. ‘i2’, ‘short’, ‘u4’), or a dict of packing parameters as described below. This provides support for netCDF data packing as described here. If this argument is a type (or type string), appropriate values of scale_factor and add_offset will be automatically calculated based on cube.data and possible masking. For more control, pass a dict with one or more of the following keys: dtype (required), scale_factor and add_offset. Note that automatic calculation of packing parameters will trigger loading of lazy data; set them manually using a dict to avoid this. The default is None, in which case the datatype is determined from the cube and no packing will occur.

  • fill_value (optional) – The value to use for the _FillValue attribute on the netCDF variable. If packing is specified the value of fill_value should be in the domain of the packed data.

Return type:

None.

Notes

The zlib, complevel, shuffle, fletcher32, contiguous, chunksizes and endian keywords are silently ignored for netCDF 3 files that do not use HDF5.
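
The least_significant_digit arithmetic described above can be worked through by hand. The sketch below follows the stated formula on a single float, using Python's built-in round in place of numpy.around (both round halves to even). The quantize helper is hypothetical, and the rule used to derive bits (the smallest power of two that is at least 10**least_significant_digit) is an assumption chosen to reproduce the bits=4 case quoted in the text.

```python
import math


def quantize(value, least_significant_digit):
    """Hypothetical helper: quantise one value as round(scale*value)/scale."""
    # Assumption: bits is the smallest integer with 2**bits >= 10**digit,
    # so that a precision of 10**-digit is retained.
    bits = math.ceil(math.log2(10 ** least_significant_digit))
    scale = 2 ** bits
    return round(scale * value) / scale


bits_for_one_digit = math.ceil(math.log2(10 ** 1))  # 4, so scale = 16
quantized = quantize(3.14159, 1)                    # 50 / 16 = 3.125
```

The quantised value differs from the input by less than 0.1, which is the precision that least_significant_digit=1 asks to keep.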

exception iris.fileformats.netcdf.UnknownCellMethodWarning[source]#

Bases: IrisUnknownCellMethodWarning

Backwards compatible form of iris.warnings.IrisUnknownCellMethodWarning.

add_note()#

Exception.add_note(note) – add a note to the exception

args#
with_traceback()#

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

iris.fileformats.netcdf.load_cubes(file_sources, callback=None, constraints=None)[source]#

Load cubes from a list of NetCDF filenames/OPeNDAP URLs.

Parameters:
  • file_sources (str or list) – One or more NetCDF filenames/OPeNDAP URLs to load from, or open datasets.

  • callback (function, optional) – Function which can be passed on to iris.io.run_callback().

  • constraints (optional)

Return type:

Generator of loaded NetCDF iris.cube.Cube.

iris.fileformats.netcdf.parse_cell_methods(nc_cell_methods, cf_name=None)[source]#

Parse a CF cell_methods attribute string into a tuple of zero or more CellMethod instances.

Parameters:
  • nc_cell_methods (str) – The value of the cell methods attribute to be parsed.

  • cf_name (optional)

Return type:

iterable of iris.coords.CellMethod.

Notes

Multiple coordinates, intervals and comments are supported. If a method has a non-standard name a warning will be issued, but the results are not affected.

iris.fileformats.netcdf.save(cube, filename, netcdf_format='NETCDF4', local_keys=None, unlimited_dimensions=None, zlib=False, complevel=4, shuffle=True, fletcher32=False, contiguous=False, chunksizes=None, endian='native', least_significant_digit=None, packing=None, fill_value=None, compute=True)[source]#

Save cube(s) to a netCDF file, given the cube and the filename.

  • Iris will write CF 1.7 compliant NetCDF files.

  • If split-attribute saving is disabled, i.e. iris.FUTURE.save_split_attrs is False, then the attributes dictionaries on each cube in the saved cube list will be compared, and common attributes saved as NetCDF global attributes where appropriate.

    When split-attribute saving is enabled, cube.attributes.locals are always saved as attributes of data-variables, and cube.attributes.globals are saved as global (dataset) attributes, where possible, since the two types are now distinguished: see CubeAttrsDict.

  • Keyword arguments specifying how to save the data are applied to each cube. To use different settings for different cubes, use the NetCDF Context manager (Saver) directly.

  • The save process will stream the data payload to the file using dask, enabling large data payloads to be saved and maintaining the ‘lazy’ status of the cube’s data payload, unless the netcdf_format is explicitly specified to be ‘NETCDF3’ or ‘NETCDF3_CLASSIC’.

Parameters:
  • cube (iris.cube.Cube or iris.cube.CubeList) – An iris.cube.Cube, iris.cube.CubeList or other iterable of cubes to be saved to a netCDF file.

  • filename (str) –

    Name of the netCDF file to save the cube(s). Or an open, writeable netCDF4.Dataset, or compatible object.

    Note

    When saving to a dataset, compute must be False: see the compute parameter.

  • netcdf_format (str, default="NETCDF4") – Underlying netCDF file format, one of ‘NETCDF4’, ‘NETCDF4_CLASSIC’, ‘NETCDF3_CLASSIC’ or ‘NETCDF3_64BIT’. Default is ‘NETCDF4’ format.

  • local_keys (iterable of str, optional) –

    An iterable of cube attribute keys. Any cube attributes with matching keys will become attributes on the data variable rather than global attributes.

    Note

    This is ignored if ‘split-attribute saving’ is enabled, i.e. when iris.FUTURE.save_split_attrs is True.

  • unlimited_dimensions (iterable of str and/or iris.coords.Coord objects, optional) – List of coordinate names (or coordinate objects) corresponding to coordinate dimensions of cube to save with the NetCDF dimension variable length ‘UNLIMITED’. By default, no unlimited dimensions are saved. Only the ‘NETCDF4’ format supports multiple ‘UNLIMITED’ dimensions.

  • zlib (bool, default=False) – If True, the data will be compressed in the netCDF file using gzip compression (default False).

  • complevel (int, default=4) – An integer between 1 and 9 describing the level of compression desired (default 4). Ignored if zlib=False.

  • shuffle (bool, default=True) – If True, the HDF5 shuffle filter will be applied before compressing the data (default True). This significantly improves compression. Ignored if zlib=False.

  • fletcher32 (bool, default=False) – If True, the Fletcher32 HDF5 checksum algorithm is activated to detect errors. Default False.

  • contiguous (bool, default=False) – If True, the variable data is stored contiguously on disk. Default False. Setting to True for a variable with an unlimited dimension will trigger an error.

  • chunksizes (tuple of int, optional) – Used to manually specify the HDF5 chunksizes for each dimension of the variable. A detailed discussion of HDF chunking and I/O performance is available here. Basically, you want the chunk size for each dimension to match as closely as possible the size of the data block that users will read from the file. chunksizes cannot be set if contiguous=True.

  • endian (str, default="native") – Used to control whether the data is stored in little or big endian format on disk. Possible values are ‘little’, ‘big’ or ‘native’ (default). The library will automatically handle endian conversions when the data is read, but if the data is always going to be read on a computer with the opposite format as the one used to create the file, there may be some performance advantage to be gained by setting the endian-ness.

  • least_significant_digit (int, optional) –

    If least_significant_digit is specified, variable data will be truncated (quantized). In conjunction with zlib=True this produces ‘lossy’, but significantly more efficient compression. For example, if least_significant_digit=1, data will be quantized using numpy.around(scale*data)/scale, where scale = 2**bits, and bits is determined so that a precision of 0.1 is retained (in this case bits=4). From here: “least_significant_digit – power of ten of the smallest decimal place in unpacked data that is a reliable value”. Default is None, or no quantization, or ‘lossless’ compression.

  • packing (type or str or dict or list, optional) – A numpy integer datatype (signed or unsigned), a string that describes a numpy integer dtype (e.g. ‘i2’, ‘short’, ‘u4’), a dict of packing parameters as described below, or an iterable of such types, strings, or dicts. This provides support for netCDF data packing as described here. If this argument is a type (or type string), appropriate values of scale_factor and add_offset will be automatically calculated based on cube.data and possible masking. For more control, pass a dict with one or more of the following keys: dtype (required), scale_factor and add_offset. Note that automatic calculation of packing parameters will trigger loading of lazy data; set them manually using a dict to avoid this. The default is None, in which case the datatype is determined from the cube and no packing will occur. If this argument is a list, it must have either one element or, if cube is an iris.cube.CubeList, the same number of elements as cube; each element of this argument will be applied to each cube separately.

  • fill_value (numeric or list, optional) – The value to use for the _FillValue attribute on the netCDF variable. If packing is specified the value of fill_value should be in the domain of the packed data. If this argument is a list it must have the same number of elements as cube if cube is a iris.cube.CubeList, or a single element, and each element of this argument will be applied to each cube separately.

  • compute (bool, default=True) –

    Default is True, meaning complete the file immediately, and return None.

    When False, create the output file but don’t write any lazy array content to its variables, such as lazy cube data or aux-coord points and bounds. Instead return a dask.delayed.Delayed which, when computed, will stream all the lazy content via dask.store(), to complete the file. Several such data saves can be performed in parallel, by passing a list of them into a dask.compute() call.

    Note

    If saving to an open dataset instead of a filepath, then the caller must specify compute=False, and complete delayed saves after closing the dataset. This is because delayed saves may be performed in other processes: these must (re-)open the dataset for writing, which will fail if the file is still open for writing by the caller.

Returns:

If compute=True, returns None. Otherwise returns a dask.delayed.Delayed, which implements delayed writing to fill in the variables data.

Return type:

None or dask.delayed.Delayed

Notes

The zlib, complevel, shuffle, fletcher32, contiguous, chunksizes and endian keywords are silently ignored for netCDF 3 files that do not use HDF5.
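
To make the packing parameters concrete, the sketch below applies the usual netCDF packing relationship, unpacked = packed * scale_factor + add_offset, to a single value. The dict keys are the ones listed for the packing argument; the numeric values and the pack and unpack helpers are hypothetical and serve only to show the arithmetic, not how Iris computes or applies the parameters internally.

```python
# Illustrative packing specification, in the dict form accepted by `packing`.
packing = {"dtype": "i2", "scale_factor": 0.01, "add_offset": 273.15}


def pack(value, scale_factor, add_offset):
    """Hypothetical helper: convert a real value to its packed integer."""
    return round((value - add_offset) / scale_factor)


def unpack(packed, scale_factor, add_offset):
    """Hypothetical helper: recover the approximate real value."""
    return packed * scale_factor + add_offset


packed = pack(280.0, packing["scale_factor"], packing["add_offset"])  # 685
restored = unpack(packed, packing["scale_factor"], packing["add_offset"])

# A 16-bit signed integer ('i2') can hold values from -32768 to 32767.
fits_in_i2 = -32768 <= packed <= 32767
```

Because fill_value must lie in the domain of the packed data, a fill value for this specification would be chosen from the 'i2' integer range rather than from the unpacked values.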

Submodules#