# Iris Data Structures

The top-level object in Iris is called a cube. A cube contains data and metadata about a phenomenon.

In Iris, a cube is an interpretation of the *Climate and Forecast (CF)
Metadata Conventions* whose purpose is to:

*require conforming datasets to contain sufficient metadata that they are
self-describing… including physical units if appropriate, and that each
value can be located in space (relative to earth-based coordinates) and
time.*

Whilst the CF conventions are often mentioned alongside NetCDF, Iris implements several major format importers which can take files of specific formats and turn them into Iris cubes. Additionally, a framework is provided which allows users to extend Iris’ import capability to cater for specialist or unimplemented formats.

A single cube describes one and only one phenomenon, and always has a name, a unit
and an n-dimensional data array to represent the cube’s phenomenon. In order
to locate the data spatially, temporally, or in any other higher-dimensional
space, a collection of *coordinates* exists on the cube.

## Coordinates

A coordinate is a container to store metadata about some dimension(s) of a cube’s data array and therefore, by definition, its phenomenon.

- Each coordinate has a name and a unit.
- When a coordinate is added to a cube, the data dimensions that it
  represents are also provided.
  - The shape of a coordinate is always the same as the shape of the associated data dimension(s) on the cube.
  - A dimension not explicitly listed signifies that the coordinate is independent of that dimension.
  - Each dimension of a coordinate must be mapped to a data dimension. The only coordinates with no mapping are scalar coordinates.
- Depending on the underlying data that the coordinate is representing, its values may be discrete points or be bounded to represent interval extents (e.g. temperature at *point x* **vs** rainfall accumulation *between 0000-1200 hours*).
- Coordinates have an attributes dictionary which can hold arbitrary extra metadata, excluding certain restricted CF names.
- More complex coordinates may contain a coordinate system which is necessary to fully interpret the values contained within the coordinate.
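The rules for mapping a coordinate onto data dimensions can be sketched in plain Python. This is only an illustration of the rules as stated above, not Iris's implementation: `check_coord_mapping` and `independent_dims` are hypothetical helpers, and a scalar coordinate is assumed to hold a single value (shape `(1,)`) with an empty mapping.

```python
def check_coord_mapping(data_shape, coord_shape, data_dims):
    """Return True if a coordinate of coord_shape may be mapped to data_dims.

    Hypothetical helper for illustration only; not part of the Iris API.
    """
    data_shape = tuple(data_shape)
    coord_shape = tuple(coord_shape)
    data_dims = tuple(data_dims)

    # Assumption: a scalar coordinate has one value and maps to no dimension.
    if coord_shape == (1,) and data_dims == ():
        return True
    # Each dimension of the coordinate must be mapped to a data dimension.
    if len(coord_shape) != len(data_dims):
        return False
    # The same data dimension cannot be used twice by one coordinate.
    if len(set(data_dims)) != len(data_dims):
        return False
    # The coordinate's shape must match the mapped data dimensions.
    for coord_len, dim in zip(coord_shape, data_dims):
        if not 0 <= dim < len(data_shape):
            return False
        if data_shape[dim] != coord_len:
            return False
    return True


def independent_dims(data_shape, data_dims):
    """Data dimensions that are not listed, so the coordinate is independent of them."""
    return tuple(d for d in range(len(data_shape)) if d not in data_dims)


# A data array of shape (3, 2, 4):
ok_1d = check_coord_mapping((3, 2, 4), (3,), (0,))        # 1d coord on dimension 0
ok_2d = check_coord_mapping((3, 2, 4), (2, 4), (1, 2))    # 2d coord on dimensions 1 and 2
bad = check_coord_mapping((3, 2, 4), (4,), (1,))          # length 4 cannot describe a length-2 dimension
scalar = check_coord_mapping((3, 2, 4), (1,), ())         # scalar coordinate, no mapping
free = independent_dims((3, 2, 4), (1, 2))                # the 2d coord does not vary along dimension 0
```

The 2d case shows why the mapping has to be given explicitly: the shape `(2, 4)` alone does not say which data dimensions the coordinate describes.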

There are two classes of coordinates:

- **DimCoord**
  - Numeric
  - Monotonic
  - Representative of, at most, a single data dimension (1d)
- **AuxCoord**
  - May be of any type, including strings
  - May represent multiple data dimensions (n-dimensional)
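As a rough illustration of the DimCoord constraints, the following plain-Python check decides whether a set of values could serve as DimCoord points. `could_be_dim_coord` is a hypothetical helper, not an Iris function, and it assumes that "monotonic" means strictly increasing or strictly decreasing.

```python
from numbers import Real


def could_be_dim_coord(points):
    """Check the three DimCoord constraints for a sequence of values.

    Hypothetical helper for illustration only; not part of the Iris API.
    """
    if len(points) == 0:
        return False
    # Numeric and 1d: every item must be a plain real number, which rules
    # out strings as well as nested (multi-dimensional) sequences.
    for p in points:
        if isinstance(p, bool) or not isinstance(p, Real):
            return False
    # Monotonic (assumed strict): always increasing or always decreasing.
    pairs = list(zip(points, points[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    return increasing or decreasing


heights = could_be_dim_coord([0.0, 10.0, 20.0])        # increasing numbers
latitudes = could_be_dim_coord([90, 45, 0, -45])       # decreasing numbers
repeated = could_be_dim_coord([0, 10, 10, 20])         # repeated value
names = could_be_dim_coord(["London", "Exeter"])       # strings
grid = could_be_dim_coord([[1, 2], [3, 4]])            # 2d values
```

Values that fail this check, such as place names or a 2d array of surface heights, are the kind of metadata the list above assigns to an AuxCoord instead.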

## Cube

A cube consists of:

- a standard name and/or a long name and an appropriate unit
- a data array whose values are representative of the phenomenon
- a collection of coordinates and associated data dimensions on the cube’s data array, which are split into two separate lists:
  - *dimension coordinates* - DimCoords which uniquely map to exactly one data dimension, ordered by dimension.
  - *auxiliary coordinates* - DimCoords or AuxCoords which map to as many data dimensions as the coordinate has dimensions.
- an attributes dictionary which, other than some protected CF names, can hold arbitrary extra metadata. This implements the concept of dataset-level and variable-level attributes when loading and saving NetCDF files (see `CubeAttrsDict` and NetCDF `save()` for more).
- a list of cell methods to represent operations which have already been applied to the data (e.g. “mean over time”)
- a list of coordinate “factories” used for deriving coordinates from the values of other coordinates in the cube

### Cubes in Practice

## A Simple Cube Example

Suppose we have some gridded data consisting of 24 air temperature readings (in kelvin), located at 4 different longitudes, 2 different latitudes and 3 different heights. The data array is then three-dimensional, with dimensions 0, 1 and 2 having lengths 3, 2 and 4 respectively.

The Iris cube to represent this data would consist of:

- a standard name of `air_temperature` and a unit of `kelvin`
- a data array of shape `(3, 2, 4)`
- a coordinate, mapping to dimension 0, consisting of:
  - a standard name of `height` and unit of `meters`
  - an array of length 3 representing the 3 `height` points
- a coordinate, mapping to dimension 1, consisting of:
  - a standard name of `latitude` and unit of `degrees`
  - an array of length 2 representing the 2 latitude points
  - a coordinate system such that the `latitude` points could be fully located on the globe
- a coordinate, mapping to dimension 2, consisting of:
  - a standard name of `longitude` and unit of `degrees`
  - an array of length 4 representing the 4 longitude points
  - a coordinate system such that the `longitude` points could be fully located on the globe

The cube has therefore taken on more information than a simple array.

Additionally, further information may optionally be attached to the cube. For example, it is possible to attach any of the following:

- a coordinate, not mapping to any data dimensions, consisting of:
  - a standard name of `time` and unit of `days since 2000-01-01 00:00`
  - a data array of length 1 representing the time that the data array is valid for
- an auxiliary coordinate, mapping to dimensions 1 and 2, consisting of:
  - a long name of `place name` and no unit
  - a 2d string array of shape `(2, 4)` with the names of the 8 places that the lat/lons correspond to
- an auxiliary coordinate “factory”, which can derive its own mapping, consisting of:
  - a standard name of `height` and a unit of `feet`
  - knowledge of how data values for this coordinate can be calculated given the `height in meters` coordinate
- a cell method of “mean” over “ensemble” to indicate that the data has been averaged over a collection of “ensembles” (i.e. multiple model runs).
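The idea behind a coordinate “factory” can be sketched in plain Python. The class below is a hypothetical illustration, not one of Iris's own factory classes: it keeps a reference to the height-in-metres points and the data dimension they map to, and computes the height in feet only when asked. The sample height values are made up for the example.

```python
METRES_PER_FOOT = 0.3048  # exact, by definition of the international foot


class HeightInFeetFactory:
    """Derive `height` in feet from a height-in-metres coordinate.

    Hypothetical class for illustration only; not part of the Iris API.
    """

    def __init__(self, height_m_points, data_dims):
        # Keep a reference to the source points rather than a converted copy.
        self.height_m_points = height_m_points
        self.data_dims = tuple(data_dims)

    @property
    def derived_dims(self):
        # The derived coordinate spans the dimensions of its dependency,
        # which is how the factory can work out its own mapping.
        return self.data_dims

    def make_points(self):
        # Values are calculated on request from the current source points.
        return [m / METRES_PER_FOOT for m in self.height_m_points]


# Illustrative heights (in metres) for the 3 levels on dimension 0.
heights_m = [0.0, 304.8, 609.6]
factory = HeightInFeetFactory(heights_m, data_dims=(0,))
feet = factory.make_points()
```

Because the factory stores only the recipe, changing the source height values changes the derived values the next time `make_points()` is called.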

## Printing a Cube

Every Iris cube can be printed to screen, as you will see later in the user guide. It is worth familiarising yourself with the output, as this is the quickest way of inspecting the contents of a cube. Here is the result of printing a real-life cube:

```
air_potential_temperature / (K)     (time: 3; model_level_number: 7; grid_latitude: 204; grid_longitude: 187)
    Dimension coordinates:
        time                             x                      -                 -                    -
        model_level_number               -                      x                 -                    -
        grid_latitude                    -                      -                 x                    -
        grid_longitude                   -                      -                 -                    x
    Auxiliary coordinates:
        forecast_period                  x                      -                 -                    -
        level_height                     -                      x                 -                    -
        sigma                            -                      x                 -                    -
        surface_altitude                 -                      -                 x                    x
    Derived coordinates:
        altitude                         -                      x                 x                    x
    Scalar coordinates:
        forecast_reference_time     2009-11-19 04:00:00
    Attributes:
        STASH                       m01s00i004
        source                      'Data from Met Office Unified Model'
        um_version                  '7.3'
```

Using this output we can deduce that:

- The cube represents air potential temperature.
- There are 4 data dimensions, and the data has a shape of `(3, 7, 204, 187)`.
- The 4 data dimensions are mapped to the `time`, `model_level_number`, `grid_latitude`, `grid_longitude` coordinates respectively.
- There are three 1d auxiliary coordinates and one 2d auxiliary (`surface_altitude`).
- There is a single `altitude` derived coordinate, which spans 3 data dimensions.
- There are 7 distinct values in the `model_level_number` coordinate. Similar inferences can be made for the other dimension coordinates.
- There are 7, not necessarily distinct, values in the `level_height` coordinate.
- There is a single `forecast_reference_time` scalar coordinate representing the entire cube.
- The cube has one further attribute relating to the phenomenon. In this case the originating file format, PP, encodes information in a STASH code which in some cases can be useful for identifying advanced experiment information relating to the phenomenon.
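The way these deductions follow from the printed text can be made concrete with a small plain-Python sketch. The two helpers below are hypothetical and simply read the summary as text: the header gives the dimension names and lengths, and each `x` in a coordinate row marks a data dimension that the coordinate spans. The printed layout is meant for people and may vary between Iris versions, so this illustrates how to read the output and is not a recommended way of inspecting a cube.

```python
def shape_from_header(header):
    """Return (dimension names, shape) from the first line of the summary.

    Hypothetical helper for illustration only; not part of the Iris API.
    """
    # The last parenthesised group holds "name: length" pairs split by ";".
    dims_text = header[header.rindex("(") + 1 : header.rindex(")")]
    names, lengths = [], []
    for part in dims_text.split(";"):
        name, length = part.split(":")
        names.append(name.strip())
        lengths.append(int(length))
    return names, tuple(lengths)


def dims_from_row(row):
    """Return (coordinate name, spanned data dimensions) from an x/- row."""
    name, *markers = row.split()
    return name, tuple(i for i, mark in enumerate(markers) if mark == "x")


header = (
    "air_potential_temperature / (K) "
    "(time: 3; model_level_number: 7; grid_latitude: 204; grid_longitude: 187)"
)
dim_names, shape = shape_from_header(header)
surface = dims_from_row("surface_altitude - - x x")
altitude = dims_from_row("altitude - x x x")
```

When working with a cube object directly, the same information is available without any text parsing, for example through `cube.shape` and `cube.coord_dims()`.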