
Front-end only integration testing with Playwright

Danny McDonald
Published in Teads Engineering · 9 min read · Feb 8, 2024


Introduction

End-to-end testing. It’s slow and fragile, but it’s a thorough way to verify that a shiny new single-page application is actually doing what you think it’s doing.

But what if we just want to test our front-end? Our back-end and its integration with other servers are already well-typed and tested. To thoroughly test our client application end-to-end, we would need to send data across the whole stack to create fixtures, and then test our client against that data.

This also means we need to run our entire stack just to test the front-end, and if we are doing this on our local machine, it can be rather slow and expensive in terms of resource use.

What if we could just confidently test the front-end without even spinning up our back-end? What if our tests could be cheap and fast to run?

These tests could become a front-line developer tool, and not only a safety net run on our CI when we push our code.

In our implementation, we chose to use Playwright. The concepts outlined here could be applied using another framework, but there may be some Playwright-specific elements.

Note: This article primarily focuses on the use-case of a classic single-page application, with a heavy client containing a lot of logic/complexity (a React app, for example).

End-to-end data contracts as an alternative to end-to-end testing

The most important part of mocking our data is to rely on an enforced data contract between the client and server. This could be shared TypeScript types for requests/responses (in our case), a typed API generated by tRPC, or perhaps something like gRPC if your backend is not also written in JS.

If we define exactly what our server will receive or send using a contract, when we make breaking changes to the API of the server, we will break our tests. We fix our tests by aligning them to the new contract.

This end-to-end type safety allows us to test the client and server independently because we know the integration between the two is protected by the contract.
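As a minimal sketch of what such a shared contract could look like with TypeScript types (the Campaign type and endpoint shapes below are hypothetical, not the actual API):

// shared/contracts/campaigns.ts — imported by both the server and the client
export interface Campaign {
  id: string;
  name: string;
  status: 'active' | 'paused';
}

// Shape of GET /api/campaigns responses
export interface GetCampaignsResponse {
  campaigns: Campaign[];
}

// Shape of POST /api/campaigns requests
export interface CreateCampaignRequest {
  name: string;
}

If the server changes one of these shapes, the client code and the tests stop compiling until they are aligned with the new contract.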

Running our front-end integration tests without a server

If we don’t run our server, how do we even load our page? There are two concerns here: loading the page + assets and handling post-load server interactions.

Load the page

You may be used to intercepting and mocking your XHR requests during testing, but why stop there? If we allow the testing framework to intercept even requests to our HTML and assets (CSS/JS), we don’t even need to spin up a server (or its dependencies) to run our tests.

A simple example is our page load request returning an index.html, which in turn generates requests for app.js and app.css.

We can intercept all three of these network requests, and return the assets directly. This is not necessarily faster than having the server return these files, but does negate the need for the server!
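With Playwright, this can be done with page.route(), fulfilling each request directly from the local build output. A minimal sketch, assuming the bundle lives in a dist/ folder next to the tests:

import { test } from '@playwright/test';
import path from 'path';

// Serve index.html, app.js and app.css from the local build output,
// so the tests never need a running server.
test.beforeEach(async ({ page }) => {
  const dist = path.join(__dirname, '..', 'dist');

  await page.route('**/index.html', (route) =>
    route.fulfill({ path: path.join(dist, 'index.html'), contentType: 'text/html' }),
  );
  await page.route('**/app.js', (route) =>
    route.fulfill({ path: path.join(dist, 'app.js'), contentType: 'application/javascript' }),
  );
  await page.route('**/app.css', (route) =>
    route.fulfill({ path: path.join(dist, 'app.css'), contentType: 'text/css' }),
  );

  // The request never reaches the network, so any origin works here.
  await page.goto('http://localhost:3000/index.html');
});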

Mock the data

In a single-page application, the bulk of our interaction with the server after loading the application will be XHR requests to load or modify data. All such requests will be intercepted and the responses mocked.
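These interceptions use the same page.route() mechanism. A small sketch (the /api/campaigns endpoint and its payload are hypothetical):

import { test, expect } from '@playwright/test';

test('displays the campaigns returned by the API', async ({ page }) => {
  // Intercept the data request and answer with a canned payload
  // matching the shared contract type.
  await page.route('**/api/campaigns', (route) =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({
        campaigns: [{ id: '1', name: 'Summer sale', status: 'active' }],
      }),
    }),
  );

  await page.goto('http://localhost:3000/index.html');
  await expect(page.getByText('Summer sale')).toBeVisible();
});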

Scaling tests

Let’s talk a little about scaling our test suite. Like any code base, our test suite is susceptible to degradation over time, becoming hard to understand and modify.

Don’t treat your tests like second-class citizens; architecting and building them with the same care and attention you put into your application code will result in a pleasant testing experience that stands the test of time.

First of all, use strong typing across the board when writing your test code. This will greatly ease the ability to refactor and keep the code clean, as well as providing ‘intellisense autocomplete’ functionality which improves speed while writing tests.

We chose to write our suites in TypeScript. Many prefer the readability of suites written in Gherkin notation, but writing tests directly in TypeScript gives us the flexibility to architect tests and tooling in any way we like without having to jump through hoops.

The trade-off is that more effort is needed to keep the tests readable. This can be achieved by building clean facades focused on readability that hide the actual implementation.

We recommend starting with fixtures, endpoints and pages.

Fixtures

For each type exposed by your server’s API, create a fixture factory that returns typed objects for use in your tests.

E.g.
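A minimal sketch of such a factory, assuming the hypothetical Campaign contract type from earlier:

// fixtures/campaign.ts — typed fixture factory (hypothetical example)
import type { Campaign } from '../shared/contracts/campaigns';

let nextId = 1;

// Returns a valid Campaign with sensible defaults;
// tests override only the fields they care about.
export const aCampaign = (overrides: Partial<Campaign> = {}): Campaign => ({
  id: String(nextId++),
  name: 'Default campaign',
  status: 'active',
  ...overrides,
});

A test can then write aCampaign({ status: 'paused' }) and stays valid when the Campaign type gains new required fields, since only the factory needs updating.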

Endpoints

By fully typing your endpoint’s requests and responses, you can easily create helper objects for managing request interception.

This is where using something like tRPC can give you an advantage, as it provides a typed API out of the box.

E.g.
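A sketch of what such a helper could look like for the hypothetical campaigns endpoint, built on top of Playwright’s page.route():

// endpoints/campaigns.ts — typed endpoint helper (hypothetical example)
import type { Page } from '@playwright/test';
import type { GetCampaignsResponse } from '../shared/contracts/campaigns';

export const campaignsEndpoint = {
  url: '**/api/campaigns',

  // The response argument is checked against the shared contract at compile time.
  async mockGet(page: Page, response: GetCampaignsResponse): Promise<void> {
    await page.route(this.url, (route) =>
      route.fulfill({
        status: 200,
        contentType: 'application/json',
        body: JSON.stringify(response),
      }),
    );
  },
};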

Page

Our chosen testing framework, Playwright, proposes using the Page Object Model pattern.

For a single-page application, we may want to think of this as ‘Route Object Model’, or even ‘Feature Object Model’, as ‘Page’ may impose mental limits.

The general idea is to create a readability-focused API for interacting with the page, which hides the actual framework calls that query and drive the page.

This allows us to expose a higher-level API, where functionality can be grouped under logical units.

As you find common code or useful abstractions, you can extract them into shared utilities and reuse them across your page object models.

For example, we also chose to wrap the Playwright request interception functionality to create arguably more readable APIs.

A simple example of developing our test API

Using only raw Playwright calls
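A sketch of what a test written against the raw API might look like, with hypothetical selectors and a hypothetical /api/campaigns endpoint:

import { test, expect } from '@playwright/test';

test('a campaign can be renamed', async ({ page }) => {
  // Interception, payloads and selectors all live inline in the test.
  await page.route('**/api/campaigns', (route) =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({
        campaigns: [{ id: '1', name: 'Summer sale', status: 'active' }],
      }),
    }),
  );
  await page.route('**/api/campaigns/1', (route) =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({ id: '1', name: 'Winter sale', status: 'active' }),
    }),
  );

  await page.goto('http://localhost:3000/index.html');
  await page.getByRole('button', { name: 'Edit Summer sale' }).click();
  await page.getByLabel('Campaign name').fill('Winter sale');
  await page.getByRole('button', { name: 'Save' }).click();
  await expect(page.getByText('Winter sale')).toBeVisible();
});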

Using a page object model API, typed fixtures, a request handler API and an interception wrapper API
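And a sketch of the same scenario rewritten against a small test API. The CampaignsPage page object, the aCampaign fixture and the campaignsEndpoint helper are the hypothetical pieces sketched earlier:

// pages/campaigns-page.ts — hypothetical page object
import type { Locator, Page } from '@playwright/test';
import type { Campaign } from '../shared/contracts/campaigns';

export class CampaignsPage {
  constructor(private readonly page: Page) {}

  async goto(): Promise<void> {
    await this.page.goto('http://localhost:3000/index.html');
  }

  campaignNamed(name: string): Locator {
    return this.page.getByText(name);
  }

  async renameCampaign(campaign: Campaign, newName: string): Promise<void> {
    await this.page.getByRole('button', { name: `Edit ${campaign.name}` }).click();
    await this.page.getByLabel('Campaign name').fill(newName);
    await this.page.getByRole('button', { name: 'Save' }).click();
  }
}

// campaigns.spec.ts — the test itself now reads as a list of intentions
import { test, expect } from '@playwright/test';
import { aCampaign } from './fixtures/campaign';
import { campaignsEndpoint } from './endpoints/campaigns';
import { CampaignsPage } from './pages/campaigns-page';

test('a campaign can be renamed', async ({ page }) => {
  const campaign = aCampaign({ name: 'Summer sale' });
  await campaignsEndpoint.mockGet(page, { campaigns: [campaign] });

  const campaignsPage = new CampaignsPage(page);
  await campaignsPage.goto();
  await campaignsPage.renameCampaign(campaign, 'Winter sale');

  await expect(campaignsPage.campaignNamed('Winter sale')).toBeVisible();
});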

In this example, we have started to implement a page object model and some useful abstractions on top of the base Playwright API.

It is critical that a test is easy to follow so we know that it’s testing the right things, which can be achieved by focusing on readability at the test-step level.

Make the tests fast

There are many things that will slow down browser-driven tests, and by addressing these issues, we can dramatically improve the speed at which our test suites execute.

Disable Animations

Having a way to globally disable animations within your application will make a huge overall difference to the execution speed. Typical UI animations range anywhere from 100 to 500 milliseconds.

Consider a test that opens a modal (animation), opens a select menu (animation), selects a value, closes the select menu (animation) and then closes the modal (animation).

Given a duration of 200ms per animation here, we are adding 800ms to this interaction.

If the test takes 1000ms to run, 80% of this time is spent animating the UI. Animations are the icing on the cake; what we really want to test is that our application does what it is supposed to do.

Of course, depending on the project, you may wish to test your animations.

Avoid parsing and loading JavaScript

A big part of the time for any test is loading the page. If you have a large application bundle (let’s say 1–2 MB), every time you reload the page you will incur a heavy overhead to reparse/reload the JavaScript.

If you are writing many small tests that verify a small feature on the page, you will pay this overhead time and time again.

For example, a test (without animations) might take 200ms to execute once the page has loaded, but the application start-up time (parse/load etc.) itself can easily reach 1000–2000 ms.

Our approach was to build a reset method into our application. For our react application, this was essentially unmounting and remounting our application.

This technique could be considered controversial: after all, how do we ensure that we have completely reset all the state used by the application?

Our approach was to architect the application in such a way that we encapsulate all state within our instantiated application.

This requires care when writing the application, but I would say this forces us to follow good engineering practices, and the performance benefits for the tests are huge.
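As a very rough sketch of what such a reset hook could look like in a React 18 entry point (assuming all stores and caches are created inside the component tree rather than at module scope; the file names and the __resetApp hook are hypothetical):

// main.tsx — hypothetical entry point exposing a reset hook for the tests
import { createRoot, type Root } from 'react-dom/client';
import { App } from './App';

const container = document.getElementById('root')!;
let root: Root;

function mount(): void {
  root = createRoot(container);
  root.render(<App />); // all state lives inside <App />
}

// Unmount and remount the whole tree, dropping every piece of in-memory state
// without paying the cost of reloading and reparsing the bundle.
function resetApp(): void {
  root.unmount();
  mount();
}

mount();

// Exposed only so the test runner can reset the application between tests.
(window as any).__resetApp = resetApp;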

Parallelism

Playwright gives us fine-grained control over how we execute our tests in parallel. We can control how many worker processes to use and which tests should share a worker. We can also use sharding, which distributes tests over a grid of machines.
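A possible starting point for the configuration side (the numbers are illustrative):

// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Number of parallel worker processes on this machine.
  workers: 4,
  // Keep the tests within a file in the same worker, running sequentially;
  // different files are distributed across workers.
  fullyParallel: false,
});

Sharding is then driven from the command line, for example npx playwright test --shard=1/4 on each of four CI machines.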

Combining these three performance enhancements

First step, eliminate those pesky animations. In our application, we heavily relied on MaterialUI, a React UI library. This library contains many animations, but mercifully, also exposes a way to globally disable these animations.

You can implement your animation code in a way that allows globally disabling animations. This could be as simple as injecting a CSS rule such as * { transition: none !important; }, but will depend on how you have built your animations.
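If the application itself offers no such switch, one option on the test side is a small helper that injects the rule once the page has loaded (a sketch, to be adapted to how your styles are built):

// utils/disable-animations.ts — hypothetical helper
import type { Page } from '@playwright/test';

// Injects a global rule that disables CSS transitions and animations.
// Call it right after the page has loaded.
export async function disableAnimations(page: Page): Promise<void> {
  await page.addStyleTag({
    content: `
      *, *::before, *::after {
        transition: none !important;
        animation: none !important;
      }
    `,
  });
}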

Next, separate our tests into feature suites, each containing a handful of tests. Run each suite in its own worker, reusing the same page for each test. The page is loaded once in the beforeAll() hook of the suite, and the tests inside run sequentially, each one resetting the application in its beforeEach() hook.
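A sketch of that suite structure, reusing the hypothetical __resetApp hook and disableAnimations helper from earlier:

// campaigns.suite.spec.ts — one feature suite sharing a single page
import { test, expect, type Page } from '@playwright/test';
import { disableAnimations } from './utils/disable-animations';

// Tests in this file run in order in one worker; other files get their own workers.
test.describe.configure({ mode: 'serial' });

let page: Page;

test.beforeAll(async ({ browser }) => {
  page = await browser.newPage();
  // ...register request interception for the page, assets and API here...
  await page.goto('http://localhost:3000/index.html'); // pay the load cost once per suite
  await disableAnimations(page);
});

test.beforeEach(async () => {
  // Reset the application between tests instead of reloading the page.
  await page.evaluate(() => (window as any).__resetApp());
});

test.afterAll(async () => {
  await page.close();
});

test('the campaigns list is displayed', async () => {
  await expect(page.getByRole('heading', { name: 'Campaigns' })).toBeVisible();
});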

Performance comparison

Given a page/application load of 1000ms, an average animation time of 500ms (i.e. 2 animations at 250ms) and an average test time of 300ms, consider the following breakdown:

Say we have 500 tests running in sequence with no performance optimizations:
500 * (1000 + 500 + 300) = 15 minutes

If we remove the animations from the equation:
500 * (1000 + 300) = 10 minutes and 50 seconds

If we load the application only once:
500 * 300 + 1000 = 2 minutes and 31 seconds

If we divide this across 4 workers:
125 * 300 + 1000 = 38.5 seconds

As we keep adding tests, we only pay the cost of executing the test steps themselves, and the gap between the optimized and unoptimized suites gets wider and wider.

If we run the suites in parallel workers, and we have, say, four workers available, we can dramatically decrease the time needed to execute all suites.

As soon as a single suite contains too many tests, it may be better to pay the page load cost a second time and split the suite in two, so that the halves can run in parallel. You can emit a warning when a suite exceeds a predefined maximum duration threshold.

As the suite grows, we can take advantage of sharding to use multiple machines to execute the tests even more quickly.

Why Playwright and not Puppeteer/Cypress/Selenium etc?

I would like to take a moment to extol the virtues of this modern testing library. Despite its relatively young age, it is backed by a strong team at Microsoft and already feels like a mature tool.

Its feature set allows for almost endless possibilities, giving you control over every aspect of loading and manipulating web resources, and the browser itself (iframes, multi-tab).

It has excellent cross-browser (Chromium, Firefox and WebKit) and cross-language support, and drives the browser using modern protocols rather than the legacy WebDriver protocol.

The other automation tools are solid, but we find each one lacking in some way or another. Cypress provides a great developer experience, but is more limited in flexibility and feature set. Puppeteer is solid for browser automation, but lacks testing-specific features. Selenium is very mature, with a large ecosystem of plugins to achieve almost anything, but can be frustrating in terms of developer experience.

I would strongly recommend considering Playwright for your next project.

Conclusion

This approach is not presented as the holy grail of testing solutions, but can be an interesting approach given the right conditions. We currently test several production projects this way!

Fast tests that are easy to run mean a faster feedback cycle for developers. E2E tests can also account for a significant part of the time it takes to perform a CI run. A second win is that because the tests are so fast, we end up writing more of them, increasing confidence in the application.

Because they run without a server, they are very easy to configure and run, and running them in watch mode can give very fast feedback during development.

As with any code base, maintaining a clean, readable and extendable testing suite that is strongly typed will pay off in the long run.
