Graphics Tests
Iris may be used to create various forms of graphical output; to ensure the output is consistent, there are automated tests to check against known acceptable graphical output. See Running the Tests for more information.
At present graphical tests are used in the following areas of Iris:
Module
iris.tests.test_plot
Module
iris.tests.test_quickplot
Gallery plots contained in
docs/gallery_tests
.
Challenges
Iris uses many dependencies that provide functionality, an example that applies here is matplotlib. For more information on the dependences, see Installing Iris. When there are updates to the matplotlib or a dependency of matplotlib, this may result in a change in the rendered graphical output. This means that there may be no changes to Iris, but due to an updated dependency any automated tests that compare a graphical output to a known acceptable output may fail. The failure may also not be visually perceived as it may be a simple pixel shift.
Testing Strategy
The Iris Cirrus-CI matrix defines multiple test runs that use different versions of Python to ensure Iris is working as expected.
To make this manageable, the iris.tests.IrisTest_nometa.check_graphic
test
routine tests against multiple alternative acceptable results. It does
this using an image hash comparison technique which avoids storing
reference images in the Iris repository itself.
This consists of:
The
iris.tests.IrisTest_nometa.check_graphic
function uses a perceptual image hash of the outputs (see https://github.com/JohannesBuchner/imagehash) as the basis for checking test results.The hashes of known acceptable results for each test are stored in a lookup dictionary, saved to the repo file
lib/iris/tests/results/imagerepo.json
(link) .An actual reference image for each hash value is stored in a separate public repository https://github.com/SciTools/test-iris-imagehash.
The reference images allow human-eye assessment of whether a new output is judged to be close enough to the older ones, or not.
The utility script
iris/tests/idiff.py
automates checking, enabling the developer to easily compare proposed new acceptable result images against the existing accepted reference images, for each failing test.
The acceptable images for each test can be viewed online. The Graphical Test Hash Index lists all the graphical tests in the test suite and shows the known acceptable result images for comparison.
Reviewing Failing Tests
When you find that a graphics test in the Iris testing suite has failed, following changes in Iris or the run dependencies, this is the process you should follow:
Create a new, empty directory to store temporary image results, at the path
lib/iris/tests/result_image_comparison
in your Iris repository checkout.In your Iris repo root directory, run the relevant (failing) tests directly as python scripts, or by using a command such as:
python -m unittest discover paths/to/test/files
In the
iris/lib/iris/tests
folder, run the command:python idiff.py
This will open a window for you to visually inspect side-by-side old, new and difference images for each failed graphics test. Hit a button to either accept, reject or skip each new result.
If the change is accepted:
the imagehash value of the new result image is added into the relevant set of ‘valid result hashes’ in the image result database file,
tests/results/imagerepo.json
the relevant output file in
tests/result_image_comparison
is renamed according to the image hash value, as<hash>.png
. A copy of this new PNG file must then be added into the reference image repository at https://github.com/SciTools/test-iris-imagehash (See below).
If a change is skipped:
no further changes are made in the repo.
when you run
iris/tests/idiff.py
again, the skipped choice will be presented again.
If a change is rejected:
the output image is deleted from
result_image_comparison
.when you run
iris/tests/idiff.py
again, the skipped choice will not appear, unless the relevant failing test is re-run.
Now re-run the tests. The new result should now be recognised and the relevant test should pass. However, some tests can perform multiple graphics checks within a single test case function. In those cases, any failing check will prevent the following ones from being run, so a test re-run may encounter further (new) graphical test failures. If that happens, simply repeat the check-and-accept process until all tests pass.
Add Your Changes to Iris
To add your changes to Iris, you need to make two pull requests (PR).
The first PR is made in the
test-iris-imagehash
repository, at https://github.com/SciTools/test-iris-imagehash.First, add all the newly-generated referenced PNG files into the
images/v4
directory. In your Iris repo, these files are to be found in the temporary results folderiris/tests/result_image_comparison
.Then, to update the file which lists available images,
v4_files_listing.txt
, run from the project root directory:python recreate_v4_files_listing.py
Create a PR proposing these changes, in the usual way.
The second PR is created in the Iris repository, and should only include the change to the image results database,
tests/results/imagerepo.json
. The description box of this pull request should contain a reference to the matching one intest-iris-imagehash
.
Note
The result_image_comparison
folder is covered by a project
.gitignore
setting, so those files will not show up in a
git status
check.
Important
The Iris pull-request will not test successfully in Cirrus-CI until the
test-iris-imagehash
pull request has been merged. This is because there
is an Iris test which ensures the existence of the reference images (uris)
for all the targets in the image results database. It will also fail
if you forgot to run recreate_v4_files_listing.py
to update the
image-listing file in test-iris-imagehash
.