As part of one of my side projects I wanted to have a full CI/CD setup working. I got the CI running a while ago with Docker and GitLab CI using the build stage below.

.gitlab-ci.yml

    build:
      stage: build
      image:
        name: docker/compose:latest
        entrypoint: ["/bin/sh", "-c"]
      services:
        - docker:19.03.0-dind
      before_script:
        - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
        - docker pull $CI_REGISTRY_IMAGE:latest || true
      script:
        - docker build 
          --cache-from $CI_REGISTRY_IMAGE:latest 
          --tag $CI_REGISTRY_IMAGE:latest
          --file Dockerfile
          "."
        - docker push $CI_REGISTRY_IMAGE:latest

As you can see, I’m pulling the latest image from the registry, using it as a cache while building an updated image, and then pushing that back to the registry.

The Dockerfile is as follows:

    FROM python:3.8-slim-buster

    ENV PYTHONUNBUFFERED 1

    RUN apt-get update \
      && apt-get install -y build-essential libpq-dev gettext curl \
      && apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false \
      && rm -rf /var/lib/apt/lists/*

    RUN addgroup --system django \
        && adduser --system --ingroup django django

    WORKDIR /app

    ########## Frontend build
    # Install nodejs
    RUN curl -sL https://deb.nodesource.com/setup_12.x | bash -
    RUN apt-get install -y nodejs
    RUN nodejs -v && npm -v

    COPY ./assets /app/assets
    COPY ./package.json /app/package.json
    COPY ./npm-shrinkwrap.json /app/npm-shrinkwrap.json
    COPY ./webpack.config.js /app/webpack.config.js

    # Build JS, CSS etc
    RUN npm install --production \
      && npm run build \
      && npm cache clean --force
    ########## /Frontend build

    ########## Python build
    COPY ./requirements /requirements
    RUN pip install --no-cache-dir -r /requirements/production.txt \
        && rm -rf /requirements
    ########## /Python build

    COPY ./compose/production/django/entrypoint /entrypoint
    RUN chmod +x /entrypoint
    RUN chown django /entrypoint

    COPY ./compose/production/django/start /start
    RUN chmod +x /start
    RUN chown django /start

    COPY --chown=django:django . /app

    RUN python /app/manage.py collectstatic --noinput

    USER django

    ENTRYPOINT ["/entrypoint"]

Separating the build

After reading about multi-stage Docker builds I decided to break the build up into three parts. They all live in the same Dockerfile, but I’ve split them out here:

  1. Python build

    FROM python:3.8-slim-buster AS build-python
      
    COPY ./requirements /requirements
    RUN pip wheel --no-cache-dir --no-deps --wheel-dir /wheels \
       -r /requirements/production.txt
    
  2. Node build

    FROM python:3.8-slim-buster AS build-node
      
    RUN apt-get update && apt-get -y install curl
    RUN curl -sL https://deb.nodesource.com/setup_12.x | bash -
    RUN apt-get install -y nodejs
    RUN nodejs -v && npm -v
       
    WORKDIR /app
      
    COPY ./assets /app/assets
    COPY ./package.json /app/package.json
    COPY ./npm-shrinkwrap.json /app/npm-shrinkwrap.json
    COPY ./webpack.config.js /app/webpack.config.js
    RUN npm install --production
    RUN npm run build
    
  3. Final image build

    FROM python:3.8-slim-buster
    ENV PYTHONUNBUFFERED 1
       
    RUN apt-get update \
      && apt-get install -y build-essential libpq-dev gettext \
      && apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false \
      && rm -rf /var/lib/apt/lists/*
       
    RUN addgroup --system django \
        && adduser --system --ingroup django django
       
    # Copy pre-built artifacts from the previous build stages
    COPY --from=build-node /app/static /app/static
    COPY --from=build-python /wheels /wheels
       
    RUN pip install --no-cache-dir /wheels/* \
        && rm -rf /wheels \
        && rm -rf /root/.cache/pip/*
       
    WORKDIR /app
       
    COPY ./compose/production/django/entrypoint /entrypoint
    RUN chmod +x /entrypoint
    RUN chown django /entrypoint
       
    COPY ./compose/production/django/start /start
    RUN chmod +x /start
    RUN chown django /start
       
    COPY --chown=django:django . /app
       
    RUN python /app/manage.py collectstatic --noinput
       
    USER django
       
    ENTRYPOINT ["/entrypoint"]
    

The primary benefit of the multi-stage build is that the final image should be smaller than before: Node.js, npm, node_modules and the intermediate build layers all stay behind in the earlier stages, and only the built static files and Python wheels are copied into the final image.
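If you want to verify the size difference yourself, Docker can report it directly. Something like the following should list each tag alongside its size, assuming the images are present locally:

    >> docker image ls $CI_REGISTRY_IMAGE --format "table {{.Tag}}\t{{.Size}}"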

There is a secondary benefit too: the build is faster, because the stages are (almost) isolated from each other and can be rebuilt independently. Changing a Python dependency should not require rebuilding the node layers, and vice versa.
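This independence is easiest to see with a toy multi-stage Dockerfile (the stage names and commands here are illustrative, not part of my project):

    FROM python:3.8-slim-buster AS stage-a
    RUN echo "built A" > /a.txt

    FROM python:3.8-slim-buster AS stage-b
    RUN echo "built B" > /b.txt

    FROM python:3.8-slim-buster
    COPY --from=stage-a /a.txt /a.txt
    COPY --from=stage-b /b.txt /b.txt

Editing the RUN line in stage-b invalidates only stage-b’s layer cache; the layers for stage-a are reused untouched, and `docker build --target stage-a .` never executes stage-b at all.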

You can see this from the following timing tests I ran:

    >> time docker build --target $TARGET -f Dockerfile . [--no-cache]

No cache timings

    Target        Cached stages   Time
    build-python  None            28s
    build-node    None            3m 11s

Changing a JS dependency

Changing a JS dependency or a JS file should re-run `npm install`, but if there are no changes to the Python dependencies then the build-python stage should not be affected.

Since the node stage comes after the python stage in the Dockerfile, it should reuse the cached Python layers and only re-run the commands that build the JavaScript.

    Target        Cached stages   Time
    build-node    python          1m 5s

I’m not going to paste in all the build output but the timing data is enough to see that it is possible to rebuild the build-node stage without also rebuilding the build-python stage.

Updating the build

Now that the Dockerfile has been updated, the build configuration for GitLab looks like this (added lines marked with +):

    build:
      stage: build
      image:
        name: docker/compose:latest
        entrypoint: ["/bin/sh", "-c"]
      services:
        - docker:19.03.0-dind
      before_script:
        - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
+       - docker pull $CI_REGISTRY_IMAGE:build-python || true
+       - docker pull $CI_REGISTRY_IMAGE:build-node || true
        - docker pull $CI_REGISTRY_IMAGE:latest || true
      script:
        - docker build
+         --target build-python
+         --cache-from $CI_REGISTRY_IMAGE:build-python
+         --tag $CI_REGISTRY_IMAGE:build-python
+         --file Dockerfile
+         "."
+       - docker build
+         --target build-node
+         --cache-from $CI_REGISTRY_IMAGE:build-python
+         --cache-from $CI_REGISTRY_IMAGE:build-node
+         --tag $CI_REGISTRY_IMAGE:build-node
+         --file Dockerfile
+         "."
+       - docker build
+         --cache-from $CI_REGISTRY_IMAGE:build-python
+         --cache-from $CI_REGISTRY_IMAGE:build-node
+         --cache-from $CI_REGISTRY_IMAGE:latest
          --tag $CI_REGISTRY_IMAGE:latest
          --file Dockerfile
          "."
+       - docker push $CI_REGISTRY_IMAGE:build-python
+       - docker push $CI_REGISTRY_IMAGE:build-node
        - docker push $CI_REGISTRY_IMAGE:latest

Technically you should not need to provide the --cache-from arguments (I don’t need them locally), but in CI I could not get the caching to work without them. Each CI job runs against a fresh docker-in-docker daemon with an empty layer cache, so the only cache available is whatever is explicitly pulled and referenced. You’ll also note that I now have to push each stage to the registry so that it is available on the next build run.