How to avoid SEO penalties when using Netlify

The Problem

I've been really happy with my Netlify hosting. It's fast, free, and deploys my site on a global CDN. Even better, Netlify has all sorts of advanced deploy previews and other features that I'm only starting to play with.

All that said, today I realized that one consequence of how Netlify does things is that sites could end up penalized by Google and other search engines.
Specifically, because Netlify makes multiple versions of your site available, your site could be penalized for having "duplicate content"—the same penalty that search engines apply to content mills that steal other people's work and repost it as their own.

What is the problem, exactly? Well, with default settings Netlify makes every page available as a page on your domain and as a page in a subdomain inside Netlify.com. So, for example, the page you are reading right now would be available by default at both www.codesections.com/blog/netlify and at codesections.netlify.com/blog/netlify; since it shows up at both locations, it would be counted as duplicate content.

In fact, the problem is even worse than that: Netlify may also (depending on your settings) publish different branches of your site to different URLS (even with the same content) and will create "deploy-previews" that allow you to test live deploys before publishing them to your primary domain. These features are really great, and I make use of both of them. (In fact, branch deploys are what let me easily have passgen.codesections.com as a subdomain in my site). But they mean that you could end up with more than just two copies of each page on your site—way more, in fact.

The Solution

Fortunately, the solution is very simple. You to entirely avoid this issue, you need to take two steps.

First, tell Netlify to redirect traffic from your Netlify subdomain to your primary domain. This is a simple matter of setting the appropriate _redirects file in the root of your site. What you need to do is to tell Netlify to redirect all traffic from the .netlify subdomain back to your site. Here's what mine looks like:

# Redirect default Netlify subdomain to primary domain
https://codesections.netlify.com/* https://www.codesections.com/:splat 301!

Just replace codesections with your base url, and save that file as _redirects in your site root. Once you've done that, you'll have solved the bigger half of the problem.

To solve the second half, you'll need to add rel="canonical" tags to all the pages for your site. Depending on how you build your site, that could be easy or painful. I use Gutenberg, which makes this process incredibly easy. (Exactly as I'd expect from such a powerful static site generator.)

All I need to do is to create a new variable in my config.toml file that's equal to my base url, and then add it to my templates.

Specifically, I add this line to the [extra] section of my config.toml file:

live_base_url = "https://www.codesections.com"

and then add this line to each of my templates inside the <head> section:

<link rel="canonical" href="{{ config.extra.live_base_url }}{{ current_path }}">

(Why are we using our own live_base_url instead of Gutenberg's built-in base_url? Because you may want to change the base_url if you use Netlify's deploy previews, and we don't want to limit ourselves from using that very helpful feature.)

And that's it! Your site now has no duplicate content, and won't be unfairly lumped in with sites that have scraped content.

If you have any questions, feel free to reach out to me in any of the ways listed on my contact page—I'd be happy to help.