
Publishing GitHub Pages from Azure Pipelines

Do you work on or maintain a project for technical users? A key part of attracting users, especially to an open source project, is publishing great documentation. However, keeping it up to date as your APIs and concepts change can be challenging or just time-consuming.

A popular way to maintain great docs is to keep them in your project’s repo. Often they’re built from some kind of easy-to-edit source format (like Markdown) and rendered as HTML. Once you’ve built the HTML, where do you publish it? For open source projects on GitHub, a seemingly obvious choice is GitHub Pages.

GitHub Pages will automatically handle building Jekyll content for you. In my case, however, I want to generate my own HTML. First, I’ll show you what I set up in my own GitHub repo. At the end, I’ll walk you through building and publishing your own GitHub Pages using Azure Pipelines.

What we’ll need

We’re going to need:

  • A system for transforming content into HTML
  • Some source content
  • A repo on GitHub to hold this stuff

And we’ll build a pipeline for automating our publishing step.

Our content system

Markdown is an extremely popular source format for documentation, and so is reStructuredText (at least if you’re into Python). This isn’t a post about Markdown or rST, though. To keep things generic, I’m going to invent the world’s silliest documentation system: all it knows how to do is take a directory of HTML files and replace the token “{{ NOW }}” with the current time. It’s a shell script like this:

#!/usr/bin/env bash

# docs.sh

# resolve the directory this script lives in (quoted so paths with spaces work)
ROOT=$(cd "$(dirname "$0")" && pwd)
SRC_DIR=$ROOT/src
DEST_DIR=$ROOT

NOW=$(date)

# with nullglob, the loop body never runs if there are no HTML files
shopt -s nullglob
for f in "$SRC_DIR"/*.html
do
    echo "Processing $f"
    DEST_FILE=$DEST_DIR/$(basename "$f")
    # replace "{{ NOW }}" with the time this script started
    sed "s/{{ NOW }}/$NOW/g" <"$f" >"$DEST_FILE"
done
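To try the substitution on its own, without touching a real repo, here is a minimal sketch. The `render_token` helper is hypothetical (it is not part of docs.sh), and it uses `|` as the sed delimiter on the assumption that the replacement text might contain a `/`.

```shell
#!/usr/bin/env bash

# render_token is a hypothetical helper, not part of docs.sh.
# It prints the file named by $1 with every "{{ NOW }}" replaced by $2.
render_token() {
    # "|" as the delimiter keeps a "/" in the replacement from ending the expression early
    sed "s|{{ NOW }}|$2|g" < "$1"
}

# Demo in a throwaway directory
WORK_DIR=$(mktemp -d)
printf '<p>This page was generated {{ NOW }}.</p>\n' > "$WORK_DIR/index.html"
render_token "$WORK_DIR/index.html" "$(date)"
rm -rf "$WORK_DIR"
```

A replacement containing `|`, `&`, or a backslash would still confuse sed, which is fine for the output of `date` but worth remembering if you swap in other values.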

And as for source content, we’ll start with just an index.html file:

<!DOCTYPE html>
<!-- src/index.html -->
<html>
  <head>
    <meta charset="UTF-8">
    <title>Hello World!</title>
  </head>
  <body>
    <h1>Hello World!</h1>
    <p>Adding another line!</p>
    <p>This page was generated {{ NOW }}.</p>
  </body>
</html>

 

Here’s what that looks like, side by side:

[Image: GitHub Pages tutorial - doc language]

Our GitHub Pages repo

If you aren’t familiar with it, GitHub Pages lets you push HTML content to a Git repo and have it automatically show up on an HTTP server. You can make Pages for a project, for yourself, or for an organization (with slightly different capabilities on each). I followed GitHub’s great tutorial on Pages from the command line to get started. My username is vtbassmatt, so I decided to make a user page for myself. My repo is called vtbassmatt/vtbassmatt.github.io.


Because I’m publishing a user page, GitHub will publish whatever is on master. I also chose to leave the source of my content in master. This gives me a neat side effect: the source content for my page is accessible on the web (at /src) alongside the “rendered” HTML.


The pipeline

The heart of the system is this Azure Pipelines YAML file:

# Publish GitHub Pages
# azure-pipelines.yml

trigger:
- master

pool:
  vmImage: 'ubuntu-latest'

steps:
- script: |
    ./docs.sh
    git config --local user.name "Azure Pipelines"
    git config --local user.email "azuredevops@microsoft.com"
    git add .
    git commit -m "Publishing GitHub Pages  ***NO_CI***"
  displayName: 'Build and commit pages'

- task: DownloadSecureFile@1
  inputs:
    secureFile: deploy_key
  displayName: 'Get the deploy key'

- script: |
    mkdir -p ~/.ssh && mv "$DOWNLOADSECUREFILE_SECUREFILEPATH" ~/.ssh/id_rsa
    chmod 700 ~/.ssh && chmod 600 ~/.ssh/id_rsa
    ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts
    git remote set-url --push origin git@github.com:vtbassmatt/vtbassmatt.github.io.git
    git push origin HEAD:master
  displayName: 'Publish GitHub Pages'
  condition: |
    and(not(eq(variables['Build.Reason'], 'PullRequest')),
        eq(variables['Build.SourceBranch'], 'refs/heads/master'))

This pipeline will trigger whenever I push to master and will run on the hosted Ubuntu agent pool. The first script step will run my silly doc generator, then check in the generated docs.
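One wrinkle with that first step: `git commit` exits non-zero when there is nothing to commit, which would fail the step if the generated output happened not to change. A minimal sketch of a guard follows. The `commit_if_changed` helper is hypothetical and not part of the original pipeline; the demo runs in a throwaway repo with a made-up identity.

```shell
#!/usr/bin/env bash

# commit_if_changed is a hypothetical helper, not part of the original pipeline.
# It stages everything, then commits with message $1 only if something is staged.
commit_if_changed() {
    git add .
    # "git diff --cached --quiet" exits 0 when the index matches the last commit
    if git diff --cached --quiet; then
        echo "Nothing to publish"
    else
        git commit -q -m "$1"
        echo "Committed"
    fi
}

# Demo in a throwaway repo (the subshell keeps the cd from leaking out)
(
    cd "$(mktemp -d)" || exit 1
    git init -q
    git config --local user.name "Example"
    git config --local user.email "example@example.com"
    echo "hello" > index.html
    commit_if_changed "Publishing GitHub Pages  ***NO_CI***"
    commit_if_changed "Publishing GitHub Pages  ***NO_CI***"
)
```

The second call finds nothing staged and skips the commit instead of failing.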

What’s that ***NO_CI*** token for? We’re eventually going to push this commit back to master. But recall that this pipeline triggers on pushes to master… which would lead to an infinite loop of pipelines running. The ***NO_CI*** statement tells Azure Pipelines not to trigger on this commit. (Azure Pipelines also understands a few other ways to skip CI for a commit.)
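If you want to sanity-check a commit message before pushing, a rough sketch like the one below can help. The `has_skip_token` helper is hypothetical: it only inspects text, knows about just two of the tokens (`***NO_CI***` and `[skip ci]`), and says nothing authoritative about how Azure Pipelines itself parses messages.

```shell
#!/usr/bin/env bash

# has_skip_token is a hypothetical helper; it only looks at the string in $1.
# It succeeds if the message contains ***NO_CI*** or [skip ci].
has_skip_token() {
    case "$1" in
        # the quoted parts are matched literally; the bare * on each side matches anything
        *'***NO_CI***'*|*'[skip ci]'*) return 0 ;;
        *) return 1 ;;
    esac
}

msg='Publishing GitHub Pages  ***NO_CI***'
if has_skip_token "$msg"; then
    echo "This commit should not trigger CI"
else
    echo "This commit would trigger CI"
fi
```

In a real repo you could feed it the latest message with `has_skip_token "$(git log -1 --pretty=%B)"`.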

The next step is a task which downloads a file that’s been securely stored. That file is the private key of a GitHub deploy key. When my build agent presents the private key, GitHub will let it authenticate and push changes to the repo.


Finally, the last script step pushes the commit back to GitHub. SSH is picky about file locations, directory permissions, and connecting to a host it has never seen before. The first two lines put the private key in the right place with the right permissions, and the third records GitHub’s host key in known_hosts.
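The key-placement part can be sketched as a small function. `install_deploy_key` is a hypothetical helper, and the demo deliberately points it at a temporary directory with a fake key file rather than at your real `~/.ssh`.

```shell
#!/usr/bin/env bash

# install_deploy_key is a hypothetical helper.
# $1 is the downloaded private key file, $2 is the SSH directory to install into.
install_deploy_key() {
    local key_file=$1 ssh_dir=$2
    mkdir -p "$ssh_dir"              # -p: no error if the directory already exists
    cp "$key_file" "$ssh_dir/id_rsa"
    chmod 700 "$ssh_dir"             # only the owner may enter the directory
    chmod 600 "$ssh_dir/id_rsa"      # only the owner may read or write the key
}

# Demo with a fake key in a throwaway location
DEMO_DIR=$(mktemp -d)
echo "not a real key" > "$DEMO_DIR/downloaded_key"
install_deploy_key "$DEMO_DIR/downloaded_key" "$DEMO_DIR/ssh"
ls -l "$DEMO_DIR/ssh"
rm -rf "$DEMO_DIR"
```

SSH refuses to use a private key that other users can read, which is why the `chmod` calls matter as much as the file location.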

It’s worth noting: Azure Pipelines has a native InstallSSHKey task. That would have handled downloading the secure file and adding the known_hosts entry. I opted to do this manually with shell scripts, mostly as a learning exercise.

The fourth line changes our push URL from HTTPS to SSH (the git@github.com:owner/repo.git form), which will tell Git to present the SSH key. You’ll obviously want to change the values to match your repo.
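If you’d rather derive the SSH form than type it by hand, here is a minimal sketch. The `https_to_ssh` helper is hypothetical and assumes the input is a plain `https://github.com/owner/repo` URL, with or without a trailing `.git`; it does not validate anything else.

```shell
#!/usr/bin/env bash

# https_to_ssh is a hypothetical helper.
# It turns https://github.com/owner/repo(.git) into git@github.com:owner/repo.git
https_to_ssh() {
    local path=${1#https://github.com/}   # drop the scheme and host prefix
    path=${path%.git}                     # drop a trailing .git if present
    echo "git@github.com:${path}.git"
}

https_to_ssh "https://github.com/vtbassmatt/vtbassmatt.github.io"
```

Inside a checkout, the result could be passed straight to `git remote set-url --push origin`.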

Because of the way Azure Pipelines optimizes fetching Git repos, from Git’s perspective, we aren’t actually on the master branch. That’s why we have to use the refspec HEAD:master on the final line which calls git push.
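You can see this state for yourself with `git symbolic-ref`, which fails when HEAD points at a commit instead of a branch. The sketch below uses a hypothetical `describe_head` helper and a throwaway repo with a made-up identity; it is only meant to show why a bare `git push origin master` would not do what we want on the agent.

```shell
#!/usr/bin/env bash

# describe_head is a hypothetical helper.
# It reports whether HEAD is attached to a branch or detached at a commit.
describe_head() {
    local branch
    # symbolic-ref -q exits non-zero (quietly) when HEAD is detached
    if branch=$(git symbolic-ref --short -q HEAD); then
        echo "on branch $branch"
    else
        echo "detached at $(git rev-parse --short HEAD)"
    fi
}

# Demo in a throwaway repo (the subshell keeps the cd from leaking out)
(
    cd "$(mktemp -d)" || exit 1
    git init -q
    git -c user.name=Example -c user.email=example@example.com \
        commit -q --allow-empty -m "initial commit"
    describe_head
    git checkout -q --detach   # roughly the state the build agent leaves us in
    describe_head
)
```

With HEAD detached there is no local master to push, so `HEAD:master` spells out both sides: push whatever commit we are on to the remote’s master.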

That condition is a little wild as well. You can read it like a prefix-notation functional language (or an Excel formula, if you prefer): “Run this step only if the variable Build.Reason is NOT ‘PullRequest’ and the variable Build.SourceBranch is ‘refs/heads/master’.”
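If prefix notation isn’t your thing, the same logic reads like this in shell. The `should_publish` helper is hypothetical; the sketch relies on Azure Pipelines exposing these variables to scripts as `BUILD_REASON` and `BUILD_SOURCEBRANCH`, and falls back to made-up defaults when run outside a pipeline. Note that this string comparison is case-sensitive, which may be stricter than the pipeline’s own `eq`.

```shell
#!/usr/bin/env bash

# should_publish is a hypothetical helper mirroring the step's condition.
# $1 is the build reason, $2 is the full source branch ref.
should_publish() {
    [ "$1" != "PullRequest" ] && [ "$2" = "refs/heads/master" ]
}

# Outside a pipeline these variables are unset, so assumed defaults are used
reason=${BUILD_REASON:-Manual}
branch=${BUILD_SOURCEBRANCH:-refs/heads/master}

if should_publish "$reason" "$branch"; then
    echo "Would publish ($reason on $branch)"
else
    echo "Would skip publishing ($reason on $branch)"
fi
```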

Following along at home

Now we have all the pieces in place. To replicate what I’ve done:

  1. Set up your GitHub repo with the shell script, the source content, and the azure-pipelines.yml file. Make sure to edit the pipeline to apply to your GitHub repo. (Hint: your GitHub username is not vtbassmatt!)
  2. Install the Azure Pipelines app and go through the setup experience. Your first build will fail because you don’t have the secure file in place – that’s OK.
  3. Generate your deploy key and give the public half to GitHub.
  4. Give the private half of the deploy key to Azure Pipelines: Go to the Library on your Azure Pipelines organization and create a secure file called “deploy_key”. You’ll also want to click Edit on the secure file and check the “Authorize for use in all pipelines” box.
  5. Go back to GitHub and use the web editor to change files in the /src folder. Start a PR. The pipeline will run, but it will skip the step to push the built content to GitHub.
  6. Complete the PR. The pipeline will run again, this time as a continuous integration trigger to master. The resulting content will be automatically pushed back to master and ultimately deployed on GitHub Pages!
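For step 3, generating the key pair can be sketched with `ssh-keygen`. The `generate_deploy_key` helper, the file name, and the comment string are all assumptions for illustration; the demo writes into a temporary directory so it cannot clobber an existing key.

```shell
#!/usr/bin/env bash

# generate_deploy_key is a hypothetical helper.
# $1 is the path for the private key (the public key lands at $1.pub), $2 is a comment.
generate_deploy_key() {
    # -N "" sets an empty passphrase so the pipeline can use the key unattended
    ssh-keygen -q -t rsa -b 4096 -N "" -C "$2" -f "$1"
}

# Demo in a throwaway directory
KEY_DIR=$(mktemp -d)
generate_deploy_key "$KEY_DIR/deploy_key" "azure-pipelines-deploy"
echo "Private key: $KEY_DIR/deploy_key (upload to Azure Pipelines as the secure file)"
echo "Public key:  $KEY_DIR/deploy_key.pub (add to the GitHub repo as a deploy key)"
```

When you add the public half on GitHub, the deploy key needs write access, since the pipeline pushes commits back to the repo.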

 

Questions or feedback? Let us know in the comments.