Why you should pay attention to flaky Selenium tests
Shreya Bose, Technical Content Writer at BrowserStack - February 13, 2022
Let’s begin by imagining the plight of a certain tester who creates automated end-to-end tests for a web app. They use Selenium WebDriver and look at code as a craft. For example, they use page objects to separate test code from the application logic. They use virtualization as well as the right machine hardware to run tests. They also work with developers, other testers, and managers because they understand the importance of collaboration in automation testing.
Clearly, they are doing everything right, yet they keep encountering a recurring problem: flaky tests.
What are Flaky Tests?
A flaky test is one that might pass or fail under the same configuration. Such tests confuse developers because a failure does not always signal a bug in the code. Since the primary purpose of software tests is to detect bugs, this non-determinism defeats that purpose.
Flakiness is particularly disruptive in tests with a broad scope, such as functional tests and acceptance tests.
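The definition above can be turned into a simple check over recorded test results. The following is a minimal sketch, not a standard tool: the function name `find_flaky_tests`, the tuple layout, and the sample test and configuration names are all assumptions made for illustration.

```python
from collections import defaultdict


def find_flaky_tests(runs):
    """Return names of tests that both passed and failed under one configuration.

    `runs` is an iterable of (test_name, configuration, passed) tuples.
    This record layout is hypothetical; adapt it to your own test reports.
    """
    outcomes = defaultdict(set)
    for test_name, configuration, passed in runs:
        outcomes[(test_name, configuration)].add(passed)
    # A test is flagged only if the same configuration produced both outcomes.
    return sorted(
        {name for (name, _config), seen in outcomes.items() if seen == {True, False}}
    )


# Hypothetical run history, for illustration only.
history = [
    ("test_login", "chrome-linux", True),
    ("test_login", "chrome-linux", False),     # same configuration, different result
    ("test_checkout", "chrome-linux", False),
    ("test_checkout", "chrome-linux", False),  # fails consistently: likely a real bug
    ("test_search", "chrome-linux", True),
    ("test_search", "firefox-linux", False),   # differs only across configurations
]

print(find_flaky_tests(history))  # prints ['test_login']
```

Only `test_login` is reported: `test_checkout` fails every time, which points to a genuine defect, and `test_search` differs only between configurations, which is a compatibility question rather than flakiness.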
It’s easy to think of flaky tests as frustrating roadblocks and nothing else. However, they can actually help reveal gaps in the testing pipeline, on both the technical and the human side.
This article explores what those gaps are, and why it pays to pay attention to flaky Selenium tests when they crop up.
The Technical Aspect of Flaky Selenium Tests
In Selenium WebDriver tests, flakiness comes from a couple of sources:
- Lack of synchronization: The architecture of a web app contains multiple layers. The way these layers interact influences web app performance, as do factors such as network speed, HTTP handling, source rendering, and computer processing resources. Because of this, some operations may take varying amounts of time when the website is put through different end-to-end tests. In one instance, a button may not show up on the page quickly enough, or a dialog box might not move fast enough for the automated test to proceed correctly. This can be solved by including Selenium wait commands that synchronize test steps with the application logic. If some actions need a bit more time to execute, Selenium wait commands can pause test execution until the action is complete or a certain web element is found. However, remember that if certain areas of the web app consistently need waits, especially longer waits, it could suggest poor performance. For example, suppose one set of automated end-to-end tests related to the same feature keeps failing. Dig deeper; bad coding practices may be involved. Flaky tests would indirectly pick up on this problem.
- Accidental load testing: As automated test suites grow, the amount of test code increases, and so does the number of tests the software is put through. Consequently, test suites are often reorganized to run tests at the same time, usually through parallel testing with Selenium, to reduce test runtime. However, this can have the side effect of imposing a large load on the software, resulting in an unintentional load test. Certain tests may run perfectly well when executed in series but turn flaky when run simultaneously. If the failures are seemingly random, perform some debugging. In one example, it was found that during parallel testing, all tests tried to log in with the same admin user when they started. This meant that multiple simultaneous logins were occurring through the same user. Chances are, a web app will not be prepared for this kind of load. In this case, flakiness made the issue visible.
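Both remedies from the list above can be sketched in plain Python. The example below uses only the standard library so that it runs without a browser. It assumes a polling wait in the spirit of Selenium's wait commands and one distinct account per parallel worker instead of a shared admin user. `wait_until`, `account_for_worker`, `SlowButton`, and the `test-user-N` account names are hypothetical and are not part of Selenium.

```python
import threading
import time


def wait_until(condition, timeout=5.0, poll_interval=0.05):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


def account_for_worker(worker_id):
    """Return a distinct (hypothetical) test account for each parallel worker."""
    return f"test-user-{worker_id}"


class SlowButton:
    """Stand-in for a page element that becomes visible some time after load."""

    def __init__(self, delay):
        self._visible_at = time.monotonic() + delay

    def is_displayed(self):
        return time.monotonic() >= self._visible_at


def run_test(worker_id, logged_in):
    user = account_for_worker(worker_id)          # no shared admin login
    button = SlowButton(delay=0.1)                # element appears late
    wait_until(button.is_displayed, timeout=2.0)  # wait instead of failing at once
    logged_in.append(user)


if __name__ == "__main__":
    logged_in = []
    workers = [
        threading.Thread(target=run_test, args=(i, logged_in)) for i in range(4)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    print(sorted(logged_in))
```

In the Selenium Python bindings, `WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, "submit")))` plays the role of `wait_until` here, where `EC` is `selenium.webdriver.support.expected_conditions` and the `submit` ID is made up for illustration. A wait that regularly comes close to its timeout is worth reporting as a possible performance problem rather than hiding behind a longer timeout.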
The Human Aspect of Flaky Selenium Tests
Here are some things flaky Selenium tests can reveal about people in the organization:
- Teamwork and Communication: Flaky tests are a barometer of how well teams function. It can be challenging to get every team member to take an interest in end-to-end tests. Since flaky tests pass sometimes and fail at other times, it’s safe to assume that anyone asking about them has been looking at test results consistently. These are the people more likely to be receptive to collaborative agile practices. If several flaky tests appear and have not been flagged, this indicates either that the team is not receiving information about the tests or that the team is not interested.
- Test results fatigue: This refers to a condition in which teams have been so saturated with unreliable test results that they start ignoring end-to-end results altogether. Obviously, this negates the benefits of automation. A prime reason for test results fatigue is flaky tests. If team members start ignoring flaky tests and all results related to them, they end up disregarding large portions of the automated testing pipeline. By paying attention to how a team reacts to flaky tests, one gains insight into how invested the team is in automation. It also reveals how engaged the team is with the project at hand. In any team or organization that implements continuous deployment, automated end-to-end tests are necessary for product builds and releases. Flaky tests can halt test runs and releases, so they need to be resolved for test suites to complete.
As discussed above, flaky tests, despite being troublesome, can be an indicator of the quality of the Selenium automation test setup. It might be tempting to simply dismiss them as the result of issues with automation tools and get rid of the tools (and, in turn, automation).
Instead, dive deeper, examine flaky tests, and look for where they come from. The cause could be an anomaly in performance, speed, or the nature of the software. The point here is: treat flaky tests as a friend.