Skip to content

Checking Links with GitHub Actions

Updated: at 04:47 AM

Automation is Good, Actually

GitHub Actions give you superpowers. The first and most obvious applications I’ve found for it are the normal CI/CD processes which are the bread and butter of any good software project: builds, unit testing, etc. But I’ve also found it useful for automating other tasks, like checking links.

With this site, I am not only writing a blog, I am collecting a set of best practices for static, internet-connected content sites. I love automation, as I am but one person who is always juggling a dozen different priorities. I want to automate as much as I can, and help others do the same.

Keeping bad and broken links off your site is a critical part of maintaining a good user experience. Broken links make your website look broken. They also make it harder for search engines to index your site, and can hurt your search ranking. It’s just a bad experience we can all relate to.

Enter Lychee and GitHub Actions

After looking around for a bit, this appears to be one of the most popular ways to check links on GitHub. I tested it for a while, and it works great for me. I am ready to comfortably let it run and wait for it to send me emails on failure. I’ve tested it by deliberately putting bad links in my Markdown and it promptly yelled at me. Side note, remember to test your automations!

This action is a wrapper around the lychee link checker, another amazing tool written in Rust. The action will spin up a container and load lychee into it, then run it with the arguments you provide. It will then save the output to a file which you can review. The link checker runs daily at 6pm UTC, and will also run when you manually trigger it.

Customizing for the Blog

I keep all of my images in the assets folder, which allows Astro to automatically do some incredible image compression and optimization magic when it builds and deploys my blog (which is also handled by an automation on my hosting provider). Because they reside in the assets folder, link checkers are not able to resolve them, either by alias or absolute path, and will report them as broken links.

To get around this, I use the --exclude flag to tell the link checker to ignore any links that contain the string @assets. This is an alias that exists for all of my image links. It would be nice if I could check my image links, too. However, that would only be possible on the running site, at which point the files have been transformed and no longer exist in the same form as they do in the repository.

I also added the formats I use on this site: mdx and mdoc.

Code

/.github/workflows/links.yml

name: Links

on:
  repository_dispatch:
  workflow_dispatch:
  schedule:
    - cron: "00 18 * * *"

jobs:
  linkChecker:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Link Checker
        id: lychee
        uses: lycheeverse/lychee-action@v1.8.0
        with:
          args: --exclude '.*@assets.*' --verbose --no-progress './**/*.md' './**/*.mdx' './**/*.mdoc' './**/*.html'

      - name: Create Issue From File
        if: env.lychee_exit_code != 0
        uses: peter-evans/create-issue-from-file@v4
        with:
          title: Link Checker Report
          content-filepath: ./lychee/out.md
          labels: report, automated issue

Conclusion

Short article this time, folks. I just wanted to share this GitHub action and encourage my fellow static site builders to automate your link checking. This is one of many methods to accomplish this essential task.

Reference