
Mono-Repo CD via Git + Docker

~ #blog #dev

Since 2020, I’ve been continuously deploying my personal website and coding projects from a single git repository using Docker containers and docker-compose.

Today I’d like to share how this works by writing a blog post that will go through the same continuous deployment, from markdown file all the way to your browser.

The system brings together a few technologies, namely:

  • Git: a version-control system.
  • Docker: a tool for running programs in isolated, repeatable environments.
  • Docker Compose: a tool for quickly managing multiple Docker containers.

Firstly, I’ll cover the Git part.

“Ch-ch-changes…”

Whenever I want to make a change to a project, I open up my editor and edit some files, testing my changes locally by running the relevant commands.

$ yarn start
...

Once I’m satisfied with my changes, I make a new commit and push it to my git server.

$ git commit -m'stuff'
$ git push

The server receives the push and runs the post-receive hook. This is a script saved under the home directory of the git user that I added to my server.
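
For context, here’s a minimal sketch of how such a bare repo and hook could be set up on the server. The repository name site.git and the remote name deploy are stand-ins for illustration, not necessarily what I use:

$ # on the server, as the git user
$ git init --bare ~/site.git
$ $EDITOR ~/site.git/hooks/post-receive   # paste in the hook below
$ chmod +x ~/site.git/hooks/post-receive

$ # on my machine
$ git remote add deploy git@my-server:site.git

With that in place, the hook itself looks like this: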

#!/bin/sh
# post-receive

GIT_WORK_TREE=/home/git/wd git checkout -f

while read old new ref
do
  # Handle created or deleted branches.
  echo $old | grep -qsE '^0+$' && old=$(git hash-object -t tree /dev/null)
  echo $new | grep -qsE '^0+$' && new=$(git hash-object -t tree /dev/null)

  projects_to_build=$(git diff-tree --no-commit-id --name-only -r "$old" "$new" | grep -o 'projects/[^/]*' | sort -u)

  echo "Rebuilding projects: $projects_to_build"

  for i in $projects_to_build; do (cd "/home/git/wd/$i" && . ./build.sh); done
done

Let’s break this script down.

The first part causes git to check out a working copy of the repo into the /home/git/wd directory. “wd” stands for working directory here.

GIT_WORK_TREE=/home/git/wd git checkout -f
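
Because git runs the hook from inside the bare repository (with GIT_DIR pointing at it), setting GIT_WORK_TREE is all that’s needed. It’s equivalent to the more explicit form below, where the bare repo path is a hypothetical stand-in:

git --git-dir=/home/git/site.git --work-tree=/home/git/wd checkout -f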

Then the while loop is executed. Each iteration runs the read command, which takes one line from stdin and splits it into the variables $old, $new and $ref. When the git server runs the post-receive hook, it writes one such line to stdin for every ref that was pushed; the read command receives that data.

while read old new ref
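
Concretely, each line git feeds the hook has the form “<old-sha> <new-sha> <ref-name>”. A push to a branch might deliver something like this (made-up, shortened hashes):

9f3c2b1d... 4a7e8c5f... refs/heads/master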

After that, the script normalizes $old and $new into object IDs that git can diff. Without going into detail, recall that everything in git is stored as an object in the object database of the repository. Groups of files, including directories and sub-directories, are also stored as objects known as tree objects.

The post-receive hook receives the old and new revisions (commit hashes) that represent the previous position of the branch being updated and its new position. There’s one wrinkle: when a branch is created or deleted, git passes an all-zeroes revision, which is not a real object. The script detects this and substitutes the empty tree object, so that either way it ends up with two objects it can compare in the next step: figuring out which files changed.

But then why not do a simple git diff HEAD^? Well, doing so would not handle merge commits or a push containing several commits. It’s best to use the information provided and check for all potential changes.

(Note: I think my source for this solution is this SO answer but I’m not 100% certain.)

# Handle created or deleted branches.
echo $old | grep -qsE '^0+$' && old=$(git hash-object -t tree /dev/null)
echo $new | grep -qsE '^0+$' && new=$(git hash-object -t tree /dev/null)
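
That git hash-object invocation hashes an empty tree, which (in SHA-1 repositories) always yields git’s well-known empty-tree object ID, so branch creations and deletions are simply diffed against an empty tree:

$ git hash-object -t tree /dev/null
4b825dc642cb6eb9a060e54bf8d69288fbee4904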

Next up, the script figures out which files have changed and which project directories those files belong to. git diff-tree compares the two objects identified earlier. The grep keeps only the part of each path that identifies the project, and sort -u makes sure there’s only one line per project.

projects_to_build=$(git diff-tree --no-commit-id --name-only -r "$old" "$new" | grep -o 'projects/[^/]*' | sort -u)
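
To illustrate with made-up data: if a push touched files in two project directories, the pipeline would reduce the changed paths to one entry per project.

$ git diff-tree --no-commit-id --name-only -r "$old" "$new"
projects/blog/content/post.md
projects/blog/config.toml
projects/app/src/main.js

$ git diff-tree --no-commit-id --name-only -r "$old" "$new" | grep -o 'projects/[^/]*' | sort -u
projects/app
projects/blog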

Now the script has a list of projects to build and we get to the fun part.

For each project, the script will run a build script in the project directory.

The great thing about this is that the script can contain anything.

  for i in $projects_to_build; do (cd "/home/git/wd/$i" && . ./build.sh); done

For my purposes, I wanted to build all of my apps under the same system – Docker. So now we move to the second part of the journey.

“It’s Docker layers, all the way down”

Each project I build has a Dockerfile. I won’t go into the details, but essentially each project can be built in a repeatable, deterministic way that produces a Docker image. That image then runs in the same environment on any machine.
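
To give a flavour of it, here’s a minimal sketch of what such a Dockerfile could look like for a Node-based project – illustrative only, not one of my actual files:

# Hypothetical Dockerfile for a Node-based project.
FROM node:18-alpine
WORKDIR /app
# Install dependencies first so this layer is cached between builds.
COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile
# Copy the source and build it.
COPY . .
RUN yarn build
CMD ["yarn", "start"]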

Basically, this means that when one developer says “it runs on my machine”, it is highly likely that it really does. And it probably runs on other machines too.

And now, the Dockers must be composed together

Each project in the ./projects directory has a build.sh script that will cause the corresponding app to build a new docker image and restart the associated docker container via docker-compose.

In fact, every build.sh is almost identical:

cd ../dc
docker-compose up [name-of-the-service]

This script will set the current directory to the one containing the docker-compose.yml and start or restart the corresponding service.

The docker-compose.yml looks something like this:

version: "3.3"
volumes:
  vl-blog:
  # ...

services:
  nginx:
    image: nginx
    volumes:
      - ../nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - vl-blog:/srv/www/blog
      # ...
    ports:
      - "127.0.0.1:8000:80"
  blog:
    image: klakegg/hugo
    working_dir: /site
    command:
      - -d
      - /build
    volumes:
      - ../blog:/site
      - vl-blog:/build
  # ...

The actual file contains many more services, but for the purposes of this blog I’ve removed them.

I chose to show the nginx service because it plays the special role of reverse proxy, routing each request to the corresponding service based on its HTTP path. This requires a small amount of configuration, but the nginx service is not special in itself – it is like any other docker service run by the system.
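
As an illustration, a stripped-down nginx.conf doing that path-based routing could look like this; the location paths and the proxied service name are made up:

events {}

http {
  server {
    listen 80;

    # Static output from the blog service, mounted via the vl-blog volume.
    location /blog/ {
      alias /srv/www/blog/;
    }

    # A long-running service can be reverse-proxied by path instead.
    location /some-app/ {
      proxy_pass http://some-app:3000/;
    }
  }
}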

Most of my services actually only consist of a build phase and don’t continue to run after building. The artifacts that result from the build phase are made available to the nginx service via the volumes: configuration. Typically each service takes a single host directory as input and writes its output to a named volume. This named volume (e.g. vl-blog) is mounted at a directory in the nginx container, which the nginx HTTP server then serves.
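
To make that concrete, here’s roughly what one deployment cycle boils down to for the blog service, assuming the directory layout above and the /blog/ location from the nginx sketch:

$ # re-run the blog's build container; hugo writes the site into vl-blog
$ cd /home/git/wd/dc
$ docker-compose up blog

$ # nginx then serves the volume's contents on the mapped port
$ curl -s http://127.0.0.1:8000/blog/ | head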

This is really cool.

The really cool thing about all this is that any and all updates are continuously deployed. With this, I can deploy entire apps, together with the nginx configuration required to make them publicly available, in a single push.

This level of automation has made it so much easier for me to get projects out quickly. And not just projects either, but also blog content… Not that I ever write blogs 😅

Thanks

Well, this has been a fun deep-dive into my ridiculously simple mono-repo-app continuous deployment system. Or RiSiMoRACDepS for short. The name needs work, admittedly.

Thanks for reading!