Loading Iris Cubes
To load a single file into a list of Iris cubes, the iris.load() function is used:
import iris
filename = '/path/to/file'
cubes = iris.load(filename)
Iris will attempt to return as few cubes as possible by collecting together multiple fields with a shared standard name into a single multidimensional cube.
The iris.load() function automatically recognises the format of the given files and attempts to produce Iris Cubes from their contents.
Currently there is support for CF NetCDF, GRIB 1 & 2, PP and FieldsFiles file formats with a framework for this to be extended to custom formats.
In order to find out what has been loaded, the result can be printed:
>>> import iris
>>> filename = iris.sample_data_path('uk_hires.pp')
>>> cubes = iris.load(filename)
>>> print(cubes)
0: air_potential_temperature / (K)     (time: 3; model_level_number: 7; grid_latitude: 204; grid_longitude: 187)
1: surface_altitude / (m)              (grid_latitude: 204; grid_longitude: 187)
This shows that loading the file produced 2 cubes: air_potential_temperature and surface_altitude.
The surface_altitude cube was 2 dimensional with:
two dimensions with extents of 204 and 187 respectively, represented by the grid_latitude and grid_longitude coordinates
The air_potential_temperature cube was 4 dimensional with:
grid_latitude and grid_longitude dimensions of the same length as those of surface_altitude
a time dimension of length 3
a model_level_number dimension of length 7
The result of
iris.load() is always a
list of cubes.
Anything that can be done with a Python
list can be done
with the resultant list of cubes. It is worth noting, however, that
there is no inherent order to this
list of cubes.
Because of this, indexing may be inconsistent. A more consistent way to
extract a cube is by using the
iris.Constraint class as
described in Constrained Loading.
Throughout this user guide you will see the function
iris.sample_data_path being used to get the filename for the resources
used in the examples. The result of this function is just a string.
Using this function allows us to provide examples which will work across platforms and with data installed in different locations. In practice, however, you will want to use your own strings:
filename = '/path/to/file'
cubes = iris.load(filename)
To get the air potential temperature cube from the list of cubes returned by iris.load() in the previous example, list indexing can be used:
>>> import iris
>>> filename = iris.sample_data_path('uk_hires.pp')
>>> cubes = iris.load(filename)
>>> # get the first cube (list indexing is 0 based)
>>> air_potential_temperature = cubes[0]
>>> print(air_potential_temperature)
air_potential_temperature / (K)     (time: 3; model_level_number: 7; grid_latitude: 204; grid_longitude: 187)
    Dimension coordinates:
        time                             x                      -                 -                    -
        model_level_number               -                      x                 -                    -
        grid_latitude                    -                      -                 x                    -
        grid_longitude                   -                      -                 -                    x
    Auxiliary coordinates:
        forecast_period                  x                      -                 -                    -
        level_height                     -                      x                 -                    -
        sigma                            -                      x                 -                    -
        surface_altitude                 -                      -                 x                    x
    Derived coordinates:
        altitude                         -                      x                 x                    x
    Scalar coordinates:
        forecast_reference_time     2009-11-19 04:00:00
    Attributes:
        STASH                       m01s00i004
        source                      Data from Met Office Unified Model
        um_version                  7.3
Notice that the result of printing a cube is a little more verbose than it was when printing a list of cubes. In addition to the very short summary which is provided when printing a list of cubes, information is provided on the coordinates which constitute the cube in question. This was the output discussed at the end of the Iris Data Structures section.
Dimensioned coordinates will have a dimension marker
x in the
appropriate column for each cube data dimension that they describe.
Loading Multiple Files
To load more than one file into a list of cubes, a list of filenames can be provided to iris.load():
filenames = [iris.sample_data_path('uk_hires.pp'),
             iris.sample_data_path('air_temp.pp')]
cubes = iris.load(filenames)
It is also possible to load one or more files with wildcard substitution using the expansion rules defined by Python's fnmatch module.
For example, to match zero or more characters in the filename, star wildcards can be used:
filename = iris.sample_data_path('GloSea4', '*.pp')
cubes = iris.load(filename)
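To check which filenames a star wildcard would pick up, the pattern can be tried against a list of names with Python's standard fnmatch module. This is only a sketch: the filenames below are invented for illustration and are not part of the Iris sample data, and it does not show how Iris itself expands the pattern.

```python
import fnmatch

# Hypothetical filenames, invented for this illustration.
candidates = [
    "ensemble_001.pp",
    "ensemble_002.pp",
    "ensemble_notes.txt",
    ".pp",
]

# '*' matches zero or more characters, so '.pp' on its own also matches.
matched = fnmatch.filter(candidates, "*.pp")
print(matched)
```

Trying a pattern this way before loading can help confirm that it selects only the intended files.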
The cubes returned will not necessarily be in the same order as the order of the filenames.
When Iris loads data from most file types, it normally reads only the essential descriptive information, or metadata: the bulk of the actual data content will only be loaded later, as it is needed. This is referred to as ‘lazy’ data. It allows loading to be much quicker, and to occupy less memory.
For more on the benefits, handling and uses of lazy data, see Real and Lazy Data.
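The idea of deferring the expensive read can be pictured with a small pure-Python sketch. The LazyField class and read_payload function are hypothetical names made up for this illustration; this is not how Iris implements lazy data, only a picture of "metadata now, data on first use".

```python
from functools import cached_property


class LazyField:
    """Hypothetical stand-in for a loaded field; not an Iris class."""

    def __init__(self, name, shape, reader):
        # Metadata is cheap, so it is stored immediately.
        self.name = name
        self.shape = shape
        self._reader = reader

    @cached_property
    def data(self):
        # The expensive read happens on first access only, then is cached.
        return self._reader()


reads = []  # records each time the payload is actually read


def read_payload():
    reads.append("read")
    return [[0.0] * 3 for _ in range(2)]


field = LazyField("air_temperature", (2, 3), read_payload)
print(field.name, field.shape, len(reads))  # metadata is available before any read
_ = field.data
_ = field.data
print(len(reads))  # counts how many reads actually took place
```

Accessing the metadata never triggers a read, and repeated access to the data reuses the cached result.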
Constrained Loading
Given a large dataset, it is possible to restrict or constrain the load to match specific Iris cube metadata. Constrained loading provides the ability to generate a cube from a specific subset of data that is of particular interest.
As we have seen, loading the following file creates several Cubes:
filename = iris.sample_data_path('uk_hires.pp')
cubes = iris.load(filename)
Specifying a name as a constraint argument to iris.load() will mean only cubes with a matching name will be returned:
filename = iris.sample_data_path('uk_hires.pp')
cubes = iris.load(filename, 'surface_altitude')
Note that the provided name will match against either the standard name, long name, NetCDF variable name or STASH metadata of a cube. Therefore, the previous example using the surface_altitude standard name constraint can also be achieved using the STASH value of m01s00i033:
filename = iris.sample_data_path('uk_hires.pp')
cubes = iris.load(filename, 'm01s00i033')
If more specific control over name constraints is required, i.e. to constrain against a combination of standard name, long name, NetCDF variable name and/or STASH metadata, consider using iris.NameConstraint. For example, to constrain against both a standard name of surface_altitude and a STASH of m01s00i033:
filename = iris.sample_data_path('uk_hires.pp')
constraint = iris.NameConstraint(standard_name='surface_altitude', STASH='m01s00i033')
cubes = iris.load(filename, constraint)
To constrain the load to multiple distinct constraints, a list of constraints can be provided. This is equivalent to running load once for each constraint but is likely to be more efficient:
filename = iris.sample_data_path('uk_hires.pp')
cubes = iris.load(filename, ['air_potential_temperature', 'surface_altitude'])
The iris.Constraint class can be used to restrict coordinate values on load. For example, to constrain the load to match a specific model_level_number:
filename = iris.sample_data_path('uk_hires.pp')
level_10 = iris.Constraint(model_level_number=10)
cubes = iris.load(filename, level_10)
Constraints can be combined using & to represent a more restrictive constraint to load:
filename = iris.sample_data_path('uk_hires.pp')
forecast_6 = iris.Constraint(forecast_period=6)
level_10 = iris.Constraint(model_level_number=10)
cubes = iris.load(filename, forecast_6 & level_10)
Whilst & is supported, the | that might reasonably be expected is not. An explanation as to why is in the iris.Constraint reference documentation.
For an example of constraining to multiple ranges of the same coordinate to
generate one cube, see the
iris.Constraint reference documentation.
To generate multiple cubes, each constrained to a different range of the same coordinate, use iris.load_cubes() with one constraint per range.
As well as being able to combine constraints using &, the iris.Constraint class can accept multiple arguments, and a list of values can be given to constrain a coordinate to one of a collection of values:
filename = iris.sample_data_path('uk_hires.pp')
level_10_or_16_fp_6 = iris.Constraint(model_level_number=[10, 16], forecast_period=6)
cubes = iris.load(filename, level_10_or_16_fp_6)
A common requirement is to limit the value of a coordinate to a specific range. This can be achieved by passing the constraint a function:
def bottom_16_levels(cell):
    # return True or False as to whether the cell in question should be kept
    return cell <= 16

filename = iris.sample_data_path('uk_hires.pp')
level_lt_16 = iris.Constraint(model_level_number=bottom_16_levels)
cubes = iris.load(filename, level_lt_16)
As with many of the examples later in this documentation, the simple function above can be conveniently written as a lambda function on a single line:
bottom_16_levels = lambda cell: cell <= 16
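The three kinds of constraint value seen so far (a single value, a list of values, and a function) can be pictured with a small pure-Python sketch. The helper value_matches and the level numbers are hypothetical and written only for this illustration; they are not how Iris evaluates constraints internally.

```python
def value_matches(criterion, value):
    """Return True if ``value`` satisfies ``criterion`` (hypothetical helper)."""
    if callable(criterion):
        # A function decides for itself, returning True or False.
        return bool(criterion(value))
    if isinstance(criterion, (list, tuple, set)):
        # A collection matches any one of its members.
        return value in criterion
    # Anything else is treated as a single required value.
    return value == criterion


levels = [1, 4, 7, 10, 13, 16, 19]  # made-up model level numbers

single = [lv for lv in levels if value_matches(10, lv)]
either = [lv for lv in levels if value_matches([10, 16], lv)]
below = [lv for lv in levels if value_matches(lambda cell: cell <= 16, lv)]

print(single)
print(either)
print(below)
```

A function is the most general of the three forms, since both of the others could be expressed as one.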
Cube attributes can also be part of the constraint criteria. Supposing a cube attribute of STASH existed, as is the case when loading PP files, then specific STASH codes can be filtered:
filename = iris.sample_data_path('uk_hires.pp')
level_10_with_stash = iris.AttributeConstraint(STASH='m01s00i004') & iris.Constraint(model_level_number=10)
cubes = iris.load(filename, level_10_with_stash)
For advanced usage there are further examples in the
iris.Constraint reference documentation.
Constraining a Circular Coordinate Across its Boundary
Occasionally you may need to constrain your cube with a region that crosses the boundary of a circular coordinate (this is often the meridian or the dateline / antimeridian). An example use-case of this is to extract the entire Pacific Ocean from a cube whose longitudes are bounded by the dateline.
This functionality cannot be provided reliably using constraints. Instead you should use the functionality provided by the cube's intersection method (iris.cube.Cube.intersection) to extract this region.
Constraining on Time
Iris follows NetCDF-CF rules in representing time coordinate values as purely numeric values, normalised by the calendar specified in the coordinate’s units (e.g. “days since 1970-01-01”). However, when constraining by time we usually want to test calendar-related aspects such as hours of the day or months of the year, so Iris provides special features to facilitate this:
Firstly, when Iris evaluates Constraint expressions, it will convert time-coordinate values (points and bounds) from numbers into datetime-like objects for ease of calendar-based testing.
>>> filename = iris.sample_data_path('uk_hires.pp')
>>> cube_all = iris.load_cube(filename, 'air_potential_temperature')
>>> print('All times :\n' + str(cube_all.coord('time')))
All times :
DimCoord([2009-11-19 10:00:00, 2009-11-19 11:00:00, 2009-11-19 12:00:00], standard_name='time', calendar='gregorian')
>>> # Define a function which accepts a datetime as its argument (this is simplified in later examples).
>>> hour_11 = iris.Constraint(time=lambda cell: cell.point.hour == 11)
>>> cube_11 = cube_all.extract(hour_11)
>>> print('Selected times :\n' + str(cube_11.coord('time')))
Selected times :
DimCoord([2009-11-19 11:00:00], standard_name='time', calendar='gregorian')
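As background to the conversion described above, the following standard-library sketch decodes numeric time points into datetimes by hand. The units ("hours since 1970-01-01 00:00:00") and the numeric points are assumptions made for this illustration, and the arithmetic only covers the standard calendar; Iris performs the conversion itself when it evaluates a constraint.

```python
from datetime import datetime, timedelta

# Assumed units for this illustration: "hours since 1970-01-01 00:00:00".
epoch = datetime(1970, 1, 1)

# Made-up numeric time points expressed in those units.
points = [349618.0, 349619.0, 349620.0]

# Decode each number into a calendar-aware datetime.
decoded = [epoch + timedelta(hours=p) for p in points]
for p, d in zip(points, decoded):
    print(p, "->", d.isoformat())

# A calendar-based test is awkward on the raw numbers but easy on datetimes.
hour_11 = [d for d in decoded if d.hour == 11]
print(hour_11)
```

Testing the hour of the day on the raw numbers would require knowing the reference time and units, which is why the constraint function is handed datetime-like objects instead.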
Secondly, the iris.time module provides flexible time comparison facilities. An iris.time.PartialDateTime object can be compared to objects such as datetime.datetime instances, and this comparison will then test only those ‘aspects’ which the PartialDateTime instance defines:
>>> import datetime
>>> from iris.time import PartialDateTime
>>> dt = datetime.datetime(2011, 3, 7)
>>> print(dt > PartialDateTime(year=2010, month=6))
True
>>> print(dt > PartialDateTime(month=6))
False
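One way to picture "testing only the defined aspects" is to compare tuples built from just the named fields. The helpers partial_key and is_after below are hypothetical and written for this illustration only; they are not the PartialDateTime implementation, merely a standard-library sketch that gives the same answers for the two comparisons shown above.

```python
import datetime

# Field order, most significant first.
FIELDS = ("year", "month", "day", "hour", "minute", "second")


def partial_key(dt, **defined):
    """Build comparable tuples from only the fields named in ``defined``."""
    names = [f for f in FIELDS if f in defined]
    actual = tuple(getattr(dt, f) for f in names)
    wanted = tuple(defined[f] for f in names)
    return actual, wanted


def is_after(dt, **defined):
    """True if ``dt`` is later than the partial date, on the defined fields only."""
    actual, wanted = partial_key(dt, **defined)
    return actual > wanted


dt = datetime.datetime(2011, 3, 7)
with_year = is_after(dt, year=2010, month=6)  # compares (2011, 3) with (2010, 6)
month_only = is_after(dt, month=6)            # compares (3,) with (6,)
print(with_year, month_only)
```

Leaving out the year changes the question from "is this date after June 2010?" to "is this month after June?", which is why the two results differ.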
These two facilities can be combined to provide straightforward calendar-based time selections when loading or extracting data.
The previous constraint example can now be written as:
>>> the_11th_hour = iris.Constraint(time=iris.time.PartialDateTime(hour=11))
>>> print(iris.load_cube(
...     iris.sample_data_path('uk_hires.pp'),
...     'air_potential_temperature' & the_11th_hour).coord('time'))
DimCoord([2009-11-19 11:00:00], standard_name='time', calendar='gregorian')
It is common that a cube will need to be constrained between two given dates. In the following example we construct a time sequence representing the first day of every week for many years:
>>> print(long_ts.coord('time'))
DimCoord([2007-04-09 00:00:00, 2007-04-16 00:00:00, 2007-04-23 00:00:00,
          ...
          2010-02-01 00:00:00, 2010-02-08 00:00:00, 2010-02-15 00:00:00],
         standard_name='time', calendar='gregorian')
Given two dates in datetime format, we can select all points between them.
>>> d1 = datetime.datetime.strptime('20070715T0000Z', '%Y%m%dT%H%MZ')
>>> d2 = datetime.datetime.strptime('20070825T0000Z', '%Y%m%dT%H%MZ')
>>> st_swithuns_daterange_07 = iris.Constraint(
...     time=lambda cell: d1 <= cell.point < d2)
>>> within_st_swithuns_07 = long_ts.extract(st_swithuns_daterange_07)
>>> print(within_st_swithuns_07.coord('time'))
DimCoord([2007-07-16 00:00:00, 2007-07-23 00:00:00, 2007-07-30 00:00:00,
          2007-08-06 00:00:00, 2007-08-13 00:00:00, 2007-08-20 00:00:00],
         standard_name='time', calendar='gregorian')
Alternatively, we may rewrite this using PartialDateTime objects:
>>> pdt1 = PartialDateTime(year=2007, month=7, day=15)
>>> pdt2 = PartialDateTime(year=2007, month=8, day=25)
>>> st_swithuns_daterange_07 = iris.Constraint(
...     time=lambda cell: pdt1 <= cell.point < pdt2)
>>> within_st_swithuns_07 = long_ts.extract(st_swithuns_daterange_07)
>>> print(within_st_swithuns_07.coord('time'))
DimCoord([2007-07-16 00:00:00, 2007-07-23 00:00:00, 2007-07-30 00:00:00,
          2007-08-06 00:00:00, 2007-08-13 00:00:00, 2007-08-20 00:00:00],
         standard_name='time', calendar='gregorian')
A more complex example might require selecting points over an annually repeating date range. We can select points within a certain part of the year, in this case from the 15th of July through to the 25th of August. By making use of PartialDateTime this becomes simple:
>>> st_swithuns_daterange = iris.Constraint(
...     time=lambda cell: PartialDateTime(month=7, day=15) <= cell < PartialDateTime(month=8, day=25))
>>> within_st_swithuns = long_ts.extract(st_swithuns_daterange)
>>> print(within_st_swithuns.coord('time'))
DimCoord([2007-07-16 00:00:00, 2007-07-23 00:00:00, 2007-07-30 00:00:00,
          2007-08-06 00:00:00, 2007-08-13 00:00:00, 2007-08-20 00:00:00,
          2008-07-21 00:00:00, 2008-07-28 00:00:00, 2008-08-04 00:00:00,
          2008-08-11 00:00:00, 2008-08-18 00:00:00, 2009-07-20 00:00:00,
          2009-07-27 00:00:00, 2009-08-03 00:00:00, 2009-08-10 00:00:00,
          2009-08-17 00:00:00, 2009-08-24 00:00:00],
         standard_name='time', calendar='gregorian')
Notice how the dates printed are within the range specified in the st_swithuns_daterange constraint and that they span multiple years.
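The same annually repeating selection can be reproduced with the standard library alone, which may help when checking what a constraint of this kind ought to return. This sketch rebuilds a weekly sequence with the same first and last dates as the one printed above and compares only the (month, day) pair; it is an illustration under those assumptions and does not use Iris or PartialDateTime.

```python
from datetime import date, timedelta

# Rebuild a weekly sequence spanning the same first and last dates as above.
start, end = date(2007, 4, 9), date(2010, 2, 15)
weekly = []
current = start
while current <= end:
    weekly.append(current)
    current += timedelta(weeks=1)

# Compare (month, day) only, so the year is ignored and the range repeats annually.
selected = [d for d in weekly if (7, 15) <= (d.month, d.day) < (8, 25)]

print(len(selected))
print(selected[0], selected[-1])
```

Because the comparison ignores the year, a single expression selects the matching weeks from every year in the sequence.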
The iris.load_cube() and iris.load_cubes() functions are similar to iris.load() except they can only return one cube per constraint. The iris.load_cube() function accepts a single constraint and returns a single cube. The iris.load_cubes() function accepts any number of constraints and returns a list of cubes (as an iris.cube.CubeList).
Providing no constraints to iris.load_cube() or iris.load_cubes() is equivalent to requesting exactly one cube of any type.
A single cube is loaded in the following example:
>>> filename = iris.sample_data_path('air_temp.pp')
>>> cube = iris.load_cube(filename)
>>> print(cube)
air_temperature / (K)               (latitude: 73; longitude: 96)
    Dimension coordinates:
        latitude                             x              -
        longitude                            -              x
...
    Cell methods:
        mean                        time
However, when attempting to load data which would result in anything other than one cube, an exception is raised:
>>> filename = iris.sample_data_path('uk_hires.pp')
>>> cube = iris.load_cube(filename)
Traceback (most recent call last):
...
iris.exceptions.ConstraintMismatchError: Expected exactly one cube, found 2.
All the load functions share many of the same features, hence multiple files could be loaded with wildcard filenames or by providing a list of filenames.
The strict nature of iris.load_cube() and iris.load_cubes() means that, when combined with constrained loading, it is possible to ensure that precisely what was asked for on load is given; otherwise an exception is raised. This fact can be utilised to make code only run successfully if the data provided meets the expected criteria.
For example, suppose that code needed air_potential_temperature in order to run:
import iris
filename = iris.sample_data_path('uk_hires.pp')
air_pot_temp = iris.load_cube(filename, 'air_potential_temperature')
print(air_pot_temp)
Should the file not produce exactly one cube with a standard name of ‘air_potential_temperature’, an exception will be raised.
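The "exactly one or fail" behaviour can be pictured with a short pure-Python sketch. The helper exactly_one and the list of names are hypothetical stand-ins invented for this illustration, and the error type is a plain ValueError rather than the Iris exception; the point is only that code after the call can rely on having a single result.

```python
def exactly_one(matches, description):
    """Return the single item in ``matches`` or raise (hypothetical helper)."""
    if len(matches) != 1:
        raise ValueError(
            f"Expected exactly one {description}, found {len(matches)}."
        )
    return matches[0]


# Made-up names standing in for the cubes found in a file.
names = ["air_potential_temperature", "surface_altitude"]

# A constrained request that matches a single name succeeds.
wanted = [n for n in names if n == "air_potential_temperature"]
print(exactly_one(wanted, "cube"))

# An unconstrained request finds two candidates, so it fails loudly.
try:
    exactly_one(names, "cube")
except ValueError as err:
    message = str(err)
    print(message)
```

Failing at load time keeps the error close to its cause, rather than letting a later calculation run on the wrong data.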
Similarly, supposing a routine needed both ‘surface_altitude’ and ‘air_potential_temperature’ to be able to run:
import iris
filename = iris.sample_data_path('uk_hires.pp')
altitude_cube, pot_temp_cube = iris.load_cubes(filename, ['surface_altitude', 'air_potential_temperature'])
The result of
iris.load_cubes() in this case will be a list of 2 cubes
ordered by the constraints provided. Multiple assignment has been used to put
these two cubes into separate variables.
In Python, lists of a pre-known length and order can be exploited using multiple assignment:
>>> number_one, number_two = [1, 2]
>>> print(number_one)
1
>>> print(number_two)
2