Vercel, our static website hosting service, makes it easy to set up page redirects; you can have up to 1024 of them. It’s all controlled from a single JSON file. Many things can go wrong in that file, the stakes are quite high (starting with SEO), and redirect mistakes can be hard to spot.
Here’s our automated checking setup.
2022 Autumn update — we’re now using Remix and doing redirects both server-side and client-side using its API.
Some background first
Web page redirects can help in many cases. For example,
- to salvage the traffic after changing URLs (for instance, after renaming a blog post’s title),
- as a defence against mistyped URLs (for example, we’ve got “whitespace” spelt with and without a hyphen, on different packages), and
- to cover the gaps in URL paths (for example, /articles/tag/all exists but /articles/tag does not).
There are two ways to perform page redirects: client-side and server-side.
Client-side, it’s done via <meta> redirect tags, for example <meta http-equiv="refresh" content="0;URL='http://thetudors.example.com/'" />. But that’s not optimal: when people land on a “wrong” page, the server responds with a 200 OK code (which is not “OK”), then <meta> redirects them to the “correct” page; the browser issues a GET request to the new URL, which (hopefully) responds with another 200 OK. That’s not good.
The proper way is to do it server-side: make the server respond with the code 308 Permanent Redirect when a visitor lands on the “wrong” page. The response carries the new location; the browser GET-requests that and (hopefully) gets 200 OK.
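To make the server-side flow concrete, here is a minimal sketch of the decision a server makes for each request. It is not how Vercel is implemented; the respond function and the in-memory redirects table are hypothetical names used only for illustration, and the sketch assumes that every path without a redirect is an existing page.

```javascript
// Hypothetical redirect table, shaped like the entries in vercel.json.
const redirects = [
  { source: "/articles/tag", destination: "/articles/tag/all", statusCode: 308 },
];

// Decide what a server would answer for a requested path.
// Assumption: any path without a redirect entry is a real page (200).
function respond(requestPath, table = redirects) {
  const hit = table.find((entry) => entry.source === requestPath);
  if (hit) {
    // The browser reads the Location header and requests that URL next.
    return { status: hit.statusCode, headers: { Location: hit.destination } };
  }
  return { status: 200, headers: {} };
}

console.log(respond("/articles/tag"));
console.log(respond("/articles/tag/all"));
```

The first call describes the redirect response for the “wrong” URL; the second describes the follow-up request the browser makes to the new location.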
Our static site generator of choice is Eleventy, but Vercel redirects apply to any front-end solution.
Redirects in Vercel
Vercel server-side redirects are set in a config file, vercel.json, which is placed in the root folder of the website and looks like this:
{
  "redirects": [
    {
      "source": "/articles/tag",
      "destination": "/articles/tag/all",
      "statusCode": 308
    },
    {
      "source": "/os/array-include-with-glob",
      "destination": "/os/array-includes-with-glob",
      "statusCode": 308
    }
  ]
}
We are going to check the following things:
- read, check, sort and overwrite the vercel.json with redirects sorted by the source key
- validate the source URL, to ensure it does not exist (otherwise, what’s the point of the redirect?)
- validate the destination URL, to ensure it exists (so that we’re redirecting to a real page)
- the JSON schema should be validated, at least rudimentarily, to make sure only three keys (source, destination and statusCode) are in each object
- source should not equal destination (but that’s covered by the validations that destination exists and source does not; if both hold, they can’t be the same thing)
- ensure the redirect count does not exceed the limit of 1024
- there is a 4K line limit too, but such a check is irrelevant; our URLs won’t get that long
- ensure URLs are relative; there should be no domain (or even an http substring anywhere)
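The checks that need no access to the build (key names, relative URLs, source versus destination, redirect count) can be sketched roughly as below. This is an illustration written for this article, not the live validator; validateRedirects, ALLOWED_KEYS and MAX_REDIRECTS are hypothetical names, and collecting messages into an array instead of throwing straight away is a choice made for the sketch.

```javascript
// The only keys we allow in each redirect object, in sorted order.
const ALLOWED_KEYS = ["destination", "source", "statusCode"];
// Limit mentioned at the top of the article: up to 1024 redirects.
const MAX_REDIRECTS = 1024;

// Returns a list of human-readable problems; an empty list means "all good".
function validateRedirects(redirects) {
  if (!Array.isArray(redirects)) {
    return ['"redirects" must be an array'];
  }
  const errors = [];
  if (redirects.length > MAX_REDIRECTS) {
    errors.push(`too many redirects: ${redirects.length} > ${MAX_REDIRECTS}`);
  }
  redirects.forEach((entry, i) => {
    if (typeof entry !== "object" || entry === null || Array.isArray(entry)) {
      errors.push(`#${i}: not an object`);
      return;
    }
    // Rudimentary schema check: exactly the three expected keys.
    const keys = Object.keys(entry).sort();
    if (keys.join() !== ALLOWED_KEYS.join()) {
      errors.push(`#${i}: expected keys ${ALLOWED_KEYS.join(", ")}; got ${keys.join(", ")}`);
      return;
    }
    // URLs must be relative: start with "/" and contain no "http" substring.
    for (const key of ["source", "destination"]) {
      const url = entry[key];
      if (typeof url !== "string" || !url.startsWith("/") || url.includes("http")) {
        errors.push(`#${i}: ${key} must be a relative URL starting with "/"`);
      }
    }
    if (entry.source === entry.destination) {
      errors.push(`#${i}: source equals destination`);
    }
  });
  return errors;
}
```

A caller would typically throw if the returned array is non-empty, which is what makes the build fail.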
Implementation
The check script relies on having access to the production build of the website because we check whether files exist in the dist/ folder. We don’t publish first, risk Googlebot crawl errors and only then check. We check right after the build, using npm scripts, and intend to run the checks on GitLab CI (failing the publish if any errors are found).
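A rough sketch of the existence checks follows. How a URL maps to a file depends on the build; this sketch assumes a URL such as /articles/tag/all is emitted either as dist/articles/tag/all/index.html or as dist/articles/tag/all.html, which may not match every setup. The names pageExists and checkAgainstBuild are hypothetical.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Does a built page exist for this URL path?
// Assumption: pages are written as <url>/index.html or <url>.html.
function pageExists(distDir, urlPath) {
  const clean = urlPath.replace(/^\/+|\/+$/g, ""); // trim leading/trailing slashes
  const candidates = [
    path.join(distDir, clean, "index.html"),
    path.join(distDir, `${clean}.html`),
  ];
  return candidates.some(
    (file) => fs.existsSync(file) && fs.statSync(file).isFile()
  );
}

// source must NOT exist in the build; destination MUST exist.
function checkAgainstBuild(redirects, distDir) {
  const errors = [];
  for (const { source, destination } of redirects) {
    if (pageExists(distDir, source)) {
      errors.push(`source is a real page: ${source}`);
    }
    if (!pageExists(distDir, destination)) {
      errors.push(`destination does not exist: ${destination}`);
    }
  }
  return errors;
}
```

Because both functions take the build folder as a parameter, the same code can be pointed at dist/ locally and in CI.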
This checking functionality could be part of the Vercel CLI, but to do URL-validation checks, the CLI would need some information: where the dist build folder is. Even then, the CLI would be triggered from npm scripts, from package.json, so we’re back to square one. You can’t escape npm scripts.
Some checks could be part of the Vercel CLI though, such as checks for redirect URL length or misspelt keys.
We will create an internal Node script file, triggered by package.json during builds. It will be run during CI builds too.
Here’s how we set it up: we create the utils/scripts/validateVercelJson.js file, put the shebang line #!/usr/bin/env node at the top, set the permissions with chmod +x utils/scripts/validateVercelJson.js and wire up a new script in package.json: "test:vercel": "node utils/scripts/validateVercelJson.js". The node ... call simply runs Node as a CLI program, in a terminal, executing another JS file. If all checks pass, nothing happens; if a check fails, an error is thrown and the build fails (we also want this to happen on GitLab CI pipelines).
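The read, sort and overwrite step of such a script could look roughly like this. It is a simplified sketch, not the contents of the real validateVercelJson.js; sortRedirects and run are hypothetical names, and the shebang line is left out of the listing.

```javascript
const fs = require("node:fs");

// Return a copy of the redirects, ordered by their "source" key.
// Plain string comparison keeps the order independent of locale.
function sortRedirects(redirects) {
  return [...redirects].sort((a, b) => {
    if (a.source < b.source) return -1;
    if (a.source > b.source) return 1;
    return 0;
  });
}

// Read the config, fail loudly if it is malformed, write it back sorted.
function run(configPath) {
  // JSON.parse throws on invalid JSON, which already fails the build.
  const config = JSON.parse(fs.readFileSync(configPath, "utf8"));
  if (!Array.isArray(config.redirects)) {
    throw new Error(`${configPath}: "redirects" must be an array`);
  }
  config.redirects = sortRedirects(config.redirects);
  fs.writeFileSync(configPath, `${JSON.stringify(config, null, 2)}\n`);
}
```

A script built this way would end with a call such as run("vercel.json"). An uncaught error makes Node exit with a non-zero code, so the npm script fails and so does the CI job that runs it.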
It’s hard to explain all eight checks in writing because of the sheer amount of code, plus the code will likely change over time, so head to GitLab and see the current, live validator script validateVercelJson.js and the live vercel.json. They’re public and open-source!