Character encoding issue in static pages in Next.js
Last modified on Wed 18 May 2022


Although the character encoding issue comes from React rather than Next.js, you can run into it if your static pages contain characters that are rendered as HTML entities in the page content, for example an ampersand (&) in query parameters.

By default, these characters stay in their entity form in the generated markup (for example, & appears as &amp;), which results in incorrect content. This is mostly an issue for crawlers.

Ampersand is encoded incorrectly
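
To make the symptom concrete, here is a minimal sketch of how an ampersand in a URL ends up as &amp; once markup is serialized to a string. The escapeAttributeValue helper is hypothetical and only approximates this kind of escaping; it is not a React or Next.js API, and the URL is made up.

```javascript
// Hypothetical helper approximating the escaping applied to attribute values
// when markup is serialized to a string; it is not React's actual code.
function escapeAttributeValue(value) {
    return value
        .replace(/&/g, '&amp;')
        .replace(/"/g, '&quot;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Made-up URL with two query parameters.
const url = '/search?q=next&page=2';
const markup = `<a href="${escapeAttributeValue(url)}">Results</a>`;

console.log(markup);
// prints: <a href="/search?q=next&amp;page=2">Results</a>
```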

This can be fixed by adding a custom decode method in the _document file.


Note from the Next.js documentation: to prepare for React 18, we recommend avoiding customizing getInitialProps and renderPage, if possible.


If you do not have a _document file in the pages folder, create one with the default content (copy everything except getInitialProps from the default _document) and add the following:

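import { decode } from 'html-entities';

static async getInitialProps(ctx) {
    const initialProps = await Document.getInitialProps(ctx);
    // based on
    return {
        ...initialProps, // keep the other props returned by the default implementation
        html: initialProps.html.replace(
            // Assumed pattern: group 1 is the attribute name, group 2 its quoted value.
            /(href|src|srcSet)="([^"]*)"/gi,
            (match, attribute, value) => `${attribute}="${decode(value)}"`
        ),
    };
}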

The code above finds every href, src and srcSet attribute in the rendered HTML of each page and replaces its value with the decoded characters.
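
The replacement step can be tried outside Next.js with a small standalone sketch. It assumes a pattern with two capture groups (attribute name and quoted value) and uses decodeBasicEntities, a hypothetical stand-in for decode from html-entities that handles only a few entities so the example runs without dependencies. The sample markup is made up.

```javascript
// Hypothetical stand-in for `decode` from 'html-entities'; it handles only a
// few entities so that this sketch runs without any dependency.
const ENTITIES = { '&amp;': '&', '&lt;': '<', '&gt;': '>', '&#x27;': "'" };

function decodeBasicEntities(text) {
    return text.replace(/&(?:amp|lt|gt|#x27);/g, (entity) => ENTITIES[entity]);
}

// Assumed pattern: attribute name in group 1, quoted value in group 2.
// The `i` flag lets it match the lowercase `srcset` spelling as well.
const ATTRIBUTE_PATTERN = /(href|src|srcSet)="([^"]*)"/gi;

function decodeUrlAttributes(html) {
    return html.replace(
        ATTRIBUTE_PATTERN,
        (match, attribute, value) => `${attribute}="${decodeBasicEntities(value)}"`
    );
}

// Made-up sample markup with an escaped ampersand in a query string.
const sampleHtml = '<a href="/search?q=next&amp;page=2" title="Q&amp;A">Results</a>';

console.log(decodeUrlAttributes(sampleHtml));
// prints: <a href="/search?q=next&page=2" title="Q&amp;A">Results</a>
```

Only the listed attributes are touched: the title attribute in the sample keeps its &amp;, so entities elsewhere in the page stay escaped.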

Ampersand is now encoded correctly


If you need the HTML entities in these attributes decoded, the proposed solution works well. Otherwise, there is no need to add it by default.