From Home to Prod: 6 Remote Teams on Automated Testing on the Cloud

During global lockdowns, software isn’t eating the world. It’s helping save it.

COVID-19 has forced people to work from home and spend more time online than ever. But for businesses that have transitioned to digital (and many software companies), the pandemic is an opportunity:

For software companies, at least, the coronavirus shutdown is not turning out to be one of those [economic] crises. Already at sky-high levels, software shares have proved more resilient than the overall stock market — and in some cases are rising to new records. —

But the rising tide isn’t lifting all boats. The biggest winners in the pandemic are those who have (and have always had) software quality as their competitive differentiator.

“The normal spike for us is New Year’s Eve, where basically everyone at the same time just wants to message everyone to wish them a happy New Year... And we are well beyond that spike.” —Mark Zuckerberg, CEO, Facebook

“In this era of remote everything, we have seen two years’ worth of digital transformation in two months. Microsoft Teams is enabling this accelerated transformation by giving people a single tool to chat, call, meet, and collaborate” —Jared Spataro, Corporate VP for Microsoft 365

Which brings us to an undeniable truth: Pandemic or not, delivering quality software gives you an edge. And since delivering it takes continuous, automated testing, we turned to our fully-remote, globally-dispersed power users for inspiration: to see how they perfected their automated testing on the cloud and troubleshot common QA challenges without breaking their stride.

These are teams that build software used by millions of businesses and users worldwide, and these are the QA best practices that keep them ahead of the curve:

1. Discourse

QA best practices: Discourse

"In some cases, when fixing a flaky test, the fix is in the app, not in the test." —Sam Saffron, co-founder

With a remote team and hundreds of project contributors, tests are important at Discourse, even (and especially) the tests that fail sometimes.

Co-founder Sam Saffron says, “Flaky tests are useful at finding underlying flaws in our application. In some cases, when fixing a flaky test, the fix is in the app, not in the test.”

Thus began Discourse’s continuous process of unearthing (and fixing) problematic patterns within the app—by finding and fixing flaky tests. “We built a system that continuously runs our test suite on an instance on Digital Ocean and flags flaky tests (which we temporarily disable). We would assign the flaky test to the developer who originally wrote the test. Once fixed, the developer who sorted it out would post a quick post mortem. This [practice] helped us learn about approaches we can take to fix flaky tests and improved the visibility of problems within the app.”
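The core of Discourse's approach, flagging tests whose outcome is inconsistent across repeated runs, can be sketched in a few lines. This is a minimal illustration of the idea, not Discourse's actual system; the suite-runner interface is an assumption.

```python
from collections import defaultdict

def find_flaky_tests(run_suite, runs=10):
    """Run the whole suite several times and flag tests whose outcome
    is inconsistent across runs (passing on some, failing on others)."""
    outcomes = defaultdict(set)  # test name -> set of observed results
    for _ in range(runs):
        # run_suite() is assumed to return {test_name: passed?} for one full run
        for name, passed in run_suite().items():
            outcomes[name].add(passed)
    # a flaky test has both a passing and a failing run on record
    return sorted(name for name, seen in outcomes.items() if len(seen) > 1)

# Demo with a fake suite where 'test_upload' alternates pass/fail.
import itertools
flip = itertools.cycle([True, False])
fake_suite = lambda: {"test_login": True, "test_upload": next(flip)}
print(find_flaky_tests(fake_suite))  # -> ['test_upload']
```

In a real setup, each flagged test would then be disabled and assigned back to its original author, as Saffron describes.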

2. Sulu

QA best practices: Sulu

"Shift-left to avoid errors early." —Daniel Rotter, creator

Sulu, an open-source PHP content management system with hundreds of contributors across the world, shortened their review cycles (and consequently, their time-to-market) by doing the bulk of their testing well before code review.

Daniel Rotter, creator of Sulu, explains, “Testing has a very high priority in our project. We use static code analyzers like PHPStan and Flow to avoid errors early. We use PHPUnit and Jest for automated tests. And we do a lot of JavaScript development, so we test our application on different browsers with BrowserStack. This way, a lot of the errors, namely code smells, are already solved before a reviewer gets to look at the code.”
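A shift-left pipeline like Sulu's boils down to ordering: cheap static checks gate the slower test suites, and everything runs before a reviewer sees the code. Here is a hedged sketch of that gating logic; the stage commands are placeholders, not Sulu's actual tooling invocations.

```python
import subprocess
import sys

def run_checks(checks):
    """Run each (name, argv) check in order; stop at the first failure
    so cheap static analysis gates the slower test suites."""
    for name, argv in checks:
        print(f"running {name}...")
        if subprocess.run(argv).returncode != 0:
            return name  # first failing stage
    return None  # everything passed

# Hypothetical pre-review pipeline; real stages would invoke
# PHPStan, PHPUnit, and Jest rather than these stand-ins.
pipeline = [
    ("static analysis", [sys.executable, "-c", "print('phpstan would run here')"]),
    ("unit tests", [sys.executable, "-c", "print('phpunit/jest would run here')"]),
]
failed = run_checks(pipeline)
print("blocked at:" if failed else "ready for review", failed or "")
```

Failing fast on static analysis means a reviewer never sees code smells that a machine could have caught.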


3. Twitter

QA best practices: Twitter

"A full test build restart for smoke testing is just not practical." —Anna Goncharova, engineer

Instead of doing full restarts for build failures, Twitter devised a simple mechanism to do device-level retries, preserving time, resources, and developers’ faith in automated tests.

“Adding support for device-level retries resolved our long-lived pain caused by test flakiness,” says Anna Goncharova, engineer at Twitter, explaining how they do reliable testing at scale. “We can pick and choose which devices to test the website on (with BrowserStack). After tests are run, the results are sent back and stored as XML files. We read them in and convert them to JSON objects (which lets us do further custom processing). And then, we programmatically check for failures by reading the XML files: only re-running the test case that failed.”
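The read-XML, convert-to-JSON, collect-failures step Goncharova describes maps naturally onto JUnit-style reports. Below is a minimal sketch of that flow; the report schema and field names are assumptions, not Twitter's internal format.

```python
import json
import xml.etree.ElementTree as ET

def results_to_json(xml_text):
    """Convert a JUnit-style XML report into JSON-friendly
    {name, passed} records for further custom processing."""
    root = ET.fromstring(xml_text)
    records = []
    for case in root.iter("testcase"):
        # a <failure> or <error> child marks the case as failed
        failed = case.find("failure") is not None or case.find("error") is not None
        records.append({"name": case.get("name"), "passed": not failed})
    return records

def cases_to_rerun(records):
    """Only the failed cases get scheduled for a device-level retry."""
    return [r["name"] for r in records if not r["passed"]]

report = """<testsuite>
  <testcase name="test_timeline"/>
  <testcase name="test_compose"><failure message="timeout"/></testcase>
</testsuite>"""
records = results_to_json(report)
print(json.dumps(records))
print(cases_to_rerun(records))  # -> ['test_compose']
```

Re-running only `test_compose` instead of restarting the whole build is exactly the time saving the team was after.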

4. Wehkamp

QA best practices: Wehkamp

"Testing the most used/important functionality on the most used devices isn’t enough." —Hylke de Jong, automation engineer

After an Android WebView update broke their (web) checkout on Android devices, Wehkamp, one of the largest (and oldest) online stores in the Netherlands, decided to adopt a risk-based approach to testing.

Hylke de Jong, automation engineer at Wehkamp, explains it best. “We give priority to the parts of our app and the devices with the highest risk associated with them. The amount of risk is based on two variables: Probability is the likelihood that something is going to break on a device. The impact is how bad it’ll be IF something did break.” Wehkamp uses factors like screen resolution, mobile OS version, number of device components involved in high-value pages (like checkout), and app crash data to inform their device selection and test prioritization.
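The two-variable model de Jong describes can be sketched as a simple risk score used to rank page/device combinations. The scoring scale and the sample numbers below are illustrative assumptions, not Wehkamp's data.

```python
def risk_score(probability, impact):
    """Risk = likelihood of breakage x severity if it breaks
    (both assumed here to be on a 1-5 scale)."""
    return probability * impact

def prioritize(targets):
    """Rank page/device combinations so the riskiest get tested first."""
    return sorted(
        targets,
        key=lambda t: risk_score(t["probability"], t["impact"]),
        reverse=True,
    )

# Illustrative numbers only; real inputs would come from crash data,
# OS-version share, and screen-resolution stats.
targets = [
    {"page": "checkout", "device": "Android WebView", "probability": 4, "impact": 5},
    {"page": "wishlist", "device": "iPhone 11",       "probability": 2, "impact": 2},
    {"page": "search",   "device": "Galaxy S9",       "probability": 3, "impact": 4},
]
for t in prioritize(targets):
    print(t["page"], t["device"], risk_score(t["probability"], t["impact"]))
```

With a ranking like this, a limited device budget goes to checkout on Android WebView first, exactly the combination that bit Wehkamp before.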

[Learn about Wehkamp's way of creating the perfect cross-device test strategy for mobile.]

5. Optimizely

QA best practices: Optimizely

"For continuous improvement, inform your feedback loops with data." —Brian Lucas, senior software engineer

Optimizely, the Internet's premier experimentation platform, runs tens of thousands of tests on every commit. With such a humongous pool of test data, gathering meaningful insights and feedback is tricky at best.

So Optimizely collects metrics on every test run and uses them to monitor build health and continuously prune the flakiness out of their tests. Brian Lucas (senior software engineer at Optimizely) explains, “If a test is passing sometimes and failing at other times, do what Facebook does: disable the unreliable test, unblock the engineer who owns it, and ticket it so it can be fixed and merged back.”

Hunting down the offending test is made easier with Optimizely’s build health monitoring dashboard, which, among other metrics, contains a ‘go/flaky’ link. “Now, when a test is suddenly failing for me, I can just go to the dashboard and see whether it’s listed on this flaky test manifest file. It will tell me how long it takes for the test to run, and I can drill down to see when it first started failing for everyone else.”
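The triage step Lucas describes, checking a failure against the flaky manifest before blaming your own change, might look like this. The manifest entries and function names are hypothetical; in Optimizely's setup the manifest is built from metrics collected on every test run.

```python
def triage(failed_tests, flaky_manifest):
    """Split failures into 'probably caused by my change' vs
    'already on the flaky manifest' before investigating."""
    mine = [t for t in failed_tests if t not in flaky_manifest]
    known_flaky = [t for t in failed_tests if t in flaky_manifest]
    return mine, known_flaky

# Hypothetical manifest of tests already flagged as unreliable.
flaky_manifest = {"test_checkout_timeout", "test_async_banner"}

mine, known_flaky = triage(["test_pricing", "test_async_banner"], flaky_manifest)
print("investigate:", mine)        # -> ['test_pricing']
print("known flaky:", known_flaky) # -> ['test_async_banner']
```

Anything in the `known_flaky` bucket gets disabled and ticketed rather than blocking the engineer, which is the Facebook-style playbook Lucas mentions.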

[Reducing QA dependencies, automating feedback loops, and continuous experimentation: Watch Optimizely's 3-part plan to de-risk and speed up releases.]

6. GoodRx

QA best practices: GoodRx

"Maintain production sanity." —Priyanka Halder, head of QA

For businesses in the heavily regulated telemedicine industry, “even the smallest bugs in production can cost hundreds of thousands of dollars,” says Priyanka Halder, Head of QA at GoodRx, a leading telemedicine brand in the US.

“We segregate our priorities. We set up hourly automation tests on BrowserStack to sanity-check the production site, starting with a continuous P0/P1 pipeline. This is the one which always needs to be working and must not break,” Priyanka says. Besides that, GoodRx feature-flags major releases and dogfoods them by routing internal users to the features first. It allows the team to find new and unexpected usage scenarios while maintaining their release velocity.
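The dogfooding half of GoodRx's approach, routing internal users to a feature before anyone else sees it, reduces to a flag check with an internal-user override. This is a generic sketch of that pattern; the flag names and user fields are assumptions, not GoodRx's implementation.

```python
def feature_enabled(user, flag, rollout):
    """Internal users always see a flagged feature first (dogfooding);
    everyone else waits until the flag is opened to the public."""
    if user.get("internal"):
        return True
    return rollout.get(flag, False)

rollout = {"new_checkout": False}  # feature-flagged, not yet public
employee = {"id": 1, "internal": True}
customer = {"id": 2, "internal": False}

print(feature_enabled(employee, "new_checkout", rollout))  # -> True
print(feature_enabled(customer, "new_checkout", rollout))  # -> False
```

Flipping `rollout["new_checkout"]` to `True` opens the feature to everyone, after internal users have already surfaced the unexpected usage scenarios.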

[Check out how Priyanka set up and scaled the QA function from the ground up at GoodRx.]


The world's best teams may or may not be fully remote, but working from home hasn't put a dent in their value delivery, thanks to impeccable QA practices and continuous, automated testing on the cloud. For online businesses, the lockdown-induced traffic uptick is an opportunity. Make the most of it, sustainably, by delivering quality software. It's the only way teams can set themselves apart from the crowd.