Continuous Integration in Common Lisp with GitHub Actions

edit 2020/09/13: switched back to upstream install-for-ci.sh since it merged my patches.

This is the first part of a series of posts about how I set up CI for CL code using GitHub Actions.

The first thing we need to do is add an action. Either click the Actions tab in the GitHub UI, choose "Set up this workflow" on the Simple workflow, and change the file name to CI.yml, or just manually create .github/workflows/CI.yml.

For this example, we will use Roswell to install and run Lisp implementations, and test 64-bit SBCL and CCL on Linux, Windows, and macOS.

The first thing the workflow needs to specify is when to run the tests. We will run on pushes to any branch, and on pull requests to master. (Lots of other options are available; see the GitHub Actions documentation on workflow triggers for details.)


on:
  push:
  pull_request:
    branches: [ master ]

Next we need to specify what combinations of OS and lisp implementations to test:

lisp here can be any implementation name Roswell recognizes, like sbcl-bin, sbcl, ccl, ccl32, ecl, clisp, allegro, cmucl, abcl.

os can be any of the workflow labels GitHub lists for its hosted runners (and possibly Ubuntu 16.04 and Windows Server 2016).

We will test on sbcl-bin, the latest released SBCL binary, and ccl, the latest CCL release, 64-bit in both cases. Both will be tested on Ubuntu, macOS, and Windows. (More complex setups will be shown in a later part.)

Note that GitHub Actions comes with a limited amount of free CPU time for running actions, and Windows and macOS cost 2x and 10x as much CPU time, respectively, as Linux. If your tests are slow, you might want to limit those, and possibly disable them while debugging the initial actions setup and test suite. Full details are in GitHub's billing documentation.

jobs:
  test:
    name: ${{ matrix.lisp }} on ${{ matrix.os }}
    strategy:
      matrix:
        lisp: [sbcl-bin, ccl]
        os: [windows-latest, ubuntu-latest, macos-latest]


    runs-on: ${{ matrix.os }}

Optionally we can specify that we want the action to let all jobs finish, even if some fail. For this example we will let it kill unfinished jobs if any fail, but this option is useful when we are explicitly testing portability and want to see which implementations can or cannot run the code, rather than just that some can't. The setting goes under strategy:, next to matrix:.

#      fail-fast: false

Next we specify the steps needed to run the job:

    steps:

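First we turn off CRLF conversion on Windows, since that might confuse SBCL. We also change where Roswell installs its binary and add that directory to the path, since the ros binary isn't found otherwise. (GitHub has since disabled the set-env and add-path commands used here; newer workflows append to the files named by the GITHUB_ENV and GITHUB_PATH variables instead.)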

    - name: windows specific settings
      if: matrix.os == 'windows-latest'
      run: |
        git config --global core.autocrlf false
        echo "::set-env name=ROSWELL_INSTALL_DIR::~/ros"
        echo "::add-path::~/ros/bin"

Then we check out the repository:

    - uses: actions/checkout@v2

To save time if we run tests frequently, we cache the .roswell dir if possible. The cache is keyed on the OS, the implementation, and a hash of all .asd files. If there isn't an exact match, it will try restoring a match on just OS + lisp or just OS, and then save a cache under the full key.


    - name: cache .roswell
      id: cache-dot-roswell
      uses: actions/cache@v1
      with:
        path: ~/.roswell
        key: ${{ runner.os }}-dot-roswell-${{ matrix.lisp }}-${{ hashFiles('**/*.asd') }}
        restore-keys: |
          ${{ runner.os }}-dot-roswell-${{ matrix.lisp }}-
          ${{ runner.os }}-dot-roswell-

We still run the Roswell install even if the install was cached, since it also makes some global changes, like installing system packages if needed. matrix.lisp is the value from the matrix defined above for this particular instance of the job, so it is passed to the Roswell CI script in the LISP environment variable to specify what it should install.

    - name: install roswell
      shell: bash
      env:
        LISP: ${{ matrix.lisp }}
      run: curl -L https://raw.githubusercontent.com/roswell/roswell/master/scripts/install-for-ci.sh | sh

Once Roswell is installed, we run some commands to print out info about the install, which is useful when trying to match the setup if CI finds a problem that doesn't show up on developer machines.

continue-on-error indicates that failures here shouldn't fail the entire run (though probably something is too broken for the real tests to pass).


    - name: run lisp
      continue-on-error: true
      shell: bash
      run: |
        ros -e '(format t "~a:~a on ~a~%...~%~%" (lisp-implementation-type) (lisp-implementation-version) (machine-type))'
        ros -e '(format t " fixnum bits:~a~%" (integer-length most-positive-fixnum))'
        ros -e "(ql:quickload 'trivial-features)" -e '(format t "features = ~s~%" *features*)'

Next we update any existing QL dist stored in the cached .roswell dir:


    - name: update ql dist if we have one cached
      shell: bash
      run: ros -e "(ql:update-all-dists :prompt nil)"

Finally we load the system and run the tests.

In order for test results to show up as pass/fail in CI, we need to ensure we exit and return an appropriate value. For that we wrap loading the tests in a handler-case that prints the error and then exits the Lisp with a non-zero status.

Additionally, on implementations with recent ASDF, we might have problems with warnings about bad system names, so we muffle those.

    - name: load code and run tests
      shell: bash
      run: |
        ros -e '(handler-bind (#+asdf3.2(asdf:bad-SYSTEM-NAME (function MUFFLE-WARNING))) (handler-case (ql:quickload :ci-example.test) (error (a) (format t "caught error ~s~%~a~%" a a) (uiop:quit 123))))' -e '(ci-example.test:run-tests-for-ci)'

Readable version of the first Lisp form above:


(handler-bind (#+asdf3.2 (asdf:bad-system-name #'muffle-warning))
  (handler-case (ql:quickload :ci-example.test)
    (error (a)
      (format t "caught error ~s~%~a~%" a a)
      (uiop:quit 123))))
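To see why the form uses handler-bind for the warning but handler-case for errors, here is a small self-contained sketch that needs neither ASDF nor Quicklisp. The names noisy-warning and call-with-ci-handlers are made up for illustration and are not part of the workflow; the handler returns :failed where the CI form would call uiop:quit.

```lisp
;; NOISY-WARNING is a hypothetical stand-in for a harmless warning
;; such as asdf:bad-system-name.
(define-condition noisy-warning (warning) ())

(defun call-with-ci-handlers (thunk)
  "Call THUNK. Muffle NOISY-WARNING and keep going; on any ERROR, print it,
unwind, and return :FAILED (the CI form exits the Lisp at this point instead)."
  (handler-bind ((noisy-warning #'muffle-warning))
    (handler-case (funcall thunk)
      (error (e)
        (format t "caught error ~s~%~a~%" e e)
        :failed))))

;; The warning is muffled without unwinding, so the thunk runs to completion.
(call-with-ci-handlers (lambda () (warn 'noisy-warning) :loaded))

;; The error unwinds out of the thunk, so the handler's value is returned.
(call-with-ci-handlers (lambda () (error "boom") :loaded))
```

handler-bind runs its handler while the signaling code is still on the stack, which is what lets muffle-warning resume the load; handler-case always unwinds first, which is what we want for a fatal error.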

Once we commit and push the .yml file, GitHub will try to run the action, and it will probably fail since we haven't defined a test system yet (or because YAML is annoying and there are typos). In that case GitHub will send an email with a link to the failing action and its details. (When doing a lot of testing of the CI itself, you can 'ignore' the repo with the 'unwatch' button in the GitHub UI to avoid the mails, but don't forget to watch it again when you are done and want to see the results.)

While the action runs, you can watch its status and output from the Actions tab in the GitHub UI.

If we add code to define the package ci-example.test and the function ci-example.test:run-tests-for-ci that exits with zero on success (or non-zero otherwise), it should pass the CI and we can add banners to the README like


![CI](https://github.com/3b/ci-example/workflows/CI/badge.svg?branch=master)

which renders as a badge showing the current status of the CI workflow.

If we then push some bad changes to a branch, the commit will show up with failed tests, and similarly a pull request will show "Some checks were not successful".
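For reference, the ci-example.test package and run-tests-for-ci function mentioned above could be as small as the following sketch. Only the package name and the function name run-tests-for-ci come from the workflow; *tests*, run-tests, and the two placeholder tests are made up here, and a real project would more likely call into a test framework. It also assumes UIOP is already loaded, which is the case when the system is loaded through ASDF or Quicklisp.

```lisp
;;; Hypothetical contents of a test file for the ci-example.test system.
;;; Assumes UIOP is already loaded (true when loaded via ASDF/Quicklisp).
(defpackage #:ci-example.test
  (:use #:cl)
  (:export #:run-tests #:run-tests-for-ci))
(in-package #:ci-example.test)

(defparameter *tests*
  (list (cons "addition" (lambda () (= 4 (+ 2 2))))
        (cons "string-upcase" (lambda () (string= "CI" (string-upcase "ci")))))
  "Placeholder tests: an alist of (name . thunk); a thunk returns true on success.")

(defun run-tests ()
  "Run every test in *TESTS*, print one line per test, and return true
only if all of them passed. A test that signals an error counts as failed."
  (let ((ok t))
    (loop for (name . thunk) in *tests*
          for passed = (ignore-errors (funcall thunk))
          do (format t "~a: ~a~%" (if passed "PASS" "FAIL") name)
             (unless passed (setf ok nil)))
    ok))

(defun run-tests-for-ci ()
  "Exit the Lisp with status 0 if all tests pass, 1 otherwise."
  (uiop:quit (if (run-tests) 0 1)))
```

Keeping run-tests separate from run-tests-for-ci means the tests can still be run from a REPL without killing the Lisp image; only the CI entry point calls uiop:quit.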