Overview of visual UI testing

Visual testing is a form of regression testing that verifies that screens that were previously correct have not changed unexpectedly.

Visual testing of an application is done by running the application and saving screenshots at key points called checkpoints. These screenshots are then compared to previously stored baseline images, and any visually significant differences are reported.
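
The comparison at a single checkpoint can be sketched as follows. This is a minimal illustration, not the comparison method of any particular tool: images are modeled as lists of grayscale pixel rows, and the names diff_ratio and is_significant, as well as the tolerance and max_ratio thresholds, are assumptions chosen only to show one possible reading of "visually significant".

```python
from typing import List

Image = List[List[int]]  # grayscale pixels (0-255), one inner list per row


def diff_ratio(baseline: Image, screenshot: Image, tolerance: int = 8) -> float:
    """Return the fraction of pixels that differ by more than `tolerance`."""
    same_shape = len(baseline) == len(screenshot) and all(
        len(b_row) == len(s_row) for b_row, s_row in zip(baseline, screenshot)
    )
    if not same_shape:
        return 1.0  # assumption: a size change counts as a full difference
    total = sum(len(row) for row in baseline)
    if total == 0:
        return 0.0
    changed = sum(
        1
        for b_row, s_row in zip(baseline, screenshot)
        for b, s in zip(b_row, s_row)
        if abs(b - s) > tolerance
    )
    return changed / total


def is_significant(
    baseline: Image, screenshot: Image, tolerance: int = 8, max_ratio: float = 0.01
) -> bool:
    """Report a difference only when more than `max_ratio` of the pixels changed."""
    return diff_ratio(baseline, screenshot, tolerance) > max_ratio


baseline = [[0, 0, 0, 0], [255, 255, 255, 255]]
noisy = [[2, 0, 1, 0], [254, 255, 253, 255]]   # small rendering noise
moved = [[0, 0, 255, 255], [255, 255, 0, 0]]   # half of the pixels changed

print(is_significant(baseline, noisy))  # False
print(is_significant(baseline, moved))  # True
```

The per-pixel tolerance absorbs small rendering noise, while the ratio threshold decides whether the remaining changes are worth reporting. Real tools typically use more elaborate comparisons.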

Implementing visual UI testing typically involves four basic steps:

  1. Write a test that exercises your application UI by sending simulated mouse and keyboard events to bring it into various states, and that captures a screenshot in each of these states.
  2. Compare the captured screenshots to previously captured baseline images.
  3. Review the resulting differences and:
    1. Identify cases where a difference was caused by a new feature that does not appear in the baseline image, and "accept" the new screenshot so that it is used as the new baseline image for that checkpoint.
    2. Identify cases where a difference indicates a bug that needs to be fixed, report the issue, and "reject" the image, meaning that the baseline image is not updated and remains as it is.
  4. Save the baseline updates, so that they are used for the next test run.

The very first time a test is run, there are no baseline images, so the screenshots that were captured are adopted as the baseline images. On subsequent runs, the flow is as described above.
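
The comparison step, including the first-run behavior, can be sketched as below. This is a minimal illustration under stated assumptions: the function name compare_run, the status strings "new", "match" and "diff", and the checkpoint names are hypothetical, and exact byte equality stands in for a real visual comparison.

```python
from typing import Dict


def compare_run(baseline: Dict[str, bytes], screenshots: Dict[str, bytes]) -> Dict[str, str]:
    """Compare each captured screenshot with its baseline image.

    A checkpoint with no baseline image is adopted into `baseline` and
    reported as "new"; otherwise it is reported as "match" or "diff".
    """
    statuses: Dict[str, str] = {}
    for checkpoint, image in screenshots.items():
        if checkpoint not in baseline:
            baseline[checkpoint] = image      # first run: adopt as baseline
            statuses[checkpoint] = "new"
        elif baseline[checkpoint] == image:   # stand-in for a visual comparison
            statuses[checkpoint] = "match"
        else:
            statuses[checkpoint] = "diff"     # left for review; baseline unchanged
    return statuses


baseline: Dict[str, bytes] = {}

# First run: there is no baseline yet, so every checkpoint is "new".
first = compare_run(baseline, {"login": b"A", "home": b"B"})
print(first)   # {'login': 'new', 'home': 'new'}

# Second run: the "home" screen changed, so it is reported for review.
second = compare_run(baseline, {"login": b"A", "home": b"C"})
print(second)  # {'login': 'match', 'home': 'diff'}
```

Note that a "diff" result does not touch the baseline: whether the new screenshot replaces it is decided in the review step.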

This process is illustrated in the diagram below and the description that follows it.

First run
  • 1. First, you write a test script that exercises your UI. For example, you simulate three UI states and, after setting up each UI state, you trigger a visual UI test.
  • 2. Then you run the test for the first time.
  • 3. Since there is no existing baseline, all the checkpoints are reported as "new" steps.
  • 4. All the images captured during this run are automatically stored and are then used as the baseline images in future test runs against this baseline.
Subsequent runs
  • 5. You make some changes to the application and/or the test code.
  • 6. You run the test again, and two differences (marked as 1F and 3B in the figure) are detected.
  • 7a. In the first case (1F), you review the results and decide that the difference is due to a new feature. You accept the image, and the newly captured image will be used as the new baseline image (after you save the baseline in step 8).
  • 7b. In the second case (3B), you decide that the difference is due to a bug. You report the bug and reject the image, i.e., you request that the current baseline image be retained.
  • 8. Finally, you save the baseline, and it is then used as the basis for comparison on the next test run (now with 1F as the new baseline image).
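
The review and save steps of this walkthrough can be sketched as below. The labels 1F and 3B come from the description above. Everything else is an assumption made for illustration: the function name apply_review, the checkpoint keys state1 to state3, the decision strings, and the original image labels 1A, 2A and 3A.

```python
from typing import Dict


def apply_review(
    baseline: Dict[str, str],
    screenshots: Dict[str, str],
    decisions: Dict[str, str],
) -> Dict[str, str]:
    """Return the baseline to save after reviewing the detected differences.

    "accept" replaces the baseline image with the new screenshot;
    "reject" keeps the current baseline image.
    """
    updated = dict(baseline)  # leave the old baseline untouched until saved
    for checkpoint, decision in decisions.items():
        if decision == "accept":
            updated[checkpoint] = screenshots[checkpoint]
        elif decision != "reject":
            raise ValueError(f"unknown decision for {checkpoint}: {decision}")
    return updated


# Baseline stored by the first run (labels 1A, 2A, 3A are hypothetical).
baseline = {"state1": "1A", "state2": "2A", "state3": "3A"}

# Second run: differences are detected at state1 (1F) and state3 (3B).
second_run = {"state1": "1F", "state2": "2A", "state3": "3B"}

# 7a: 1F is a new feature, so accept it. 7b: 3B is a bug, so reject it.
decisions = {"state1": "accept", "state3": "reject"}

# 8: the saved baseline is what the next run is compared against.
saved = apply_review(baseline, second_run, decisions)
print(saved)  # {'state1': '1F', 'state2': '2A', 'state3': '3A'}
```

Because the rejected image 3B never enters the baseline, the next run will keep flagging state3 until the bug is fixed.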