Photo credit: Chris Ried

It’s been almost a month since the latest Jupiter Dev Log installment. I managed to hack on it a bit, and two “minor” releases have happened since: 0.1.0, which essentially introduced the new CLI and the Notion-centric workflows that’ll come to the rest of the system in due time, and 0.2.0, which added public documentation and a bunch of minor UX and feature improvements.

But in between these two releases, the biggest change has been adding support for linting. It’s been an interesting journey, and I’ll cover it in the rest of the post.

As a refresher, “linting” is a lightweight form of source code analysis, meant to spot syntactic and stylistic errors. It’s a fuzzy definition, and many linters do some sort of static analysis too. It’s the first piece of quality control work I’ve done on Jupiter, but not the last. Type annotations and testing will surely come in the future.

The first question I had to answer was “what to lint?” The natural answer here would be “the sources, duh”, meaning the src directory, and the test directory when it comes. But of course, that is not all that’s found in the repository. So my answer here was to try to lint “all the things” - documentation, GitHub workflow configs, the Dockerfile, etc. As the title says!

The second and third questions I had to answer were “what tools to use?” and “how to integrate them into the system?” As an example, for the Python code in src I started out with pylint, which is the canonical Python linter. I added the ./scripts/lint-sources.sh script with the following form:

#!/bin/bash

set -ex

pylint --jobs=8 --rcfile=./scripts/lint/pylint ./src

You can check out the final pylint rcfile. It’s quite big and was autogenerated by pylint itself with default values. I iterated on it while addressing the many linter errors and got it down to something suitable for the project. Finally, I added the ./scripts/lint.sh script, which simply called out to lint-sources.sh and, in turn, to the other linters I added.
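A minimal sketch of how such an aggregating script could be structured follows. This is not the project's actual lint.sh: the run_all helper and every script name other than lint-sources.sh are hypothetical. It also deliberately differs from the `set -ex` style shown above by running every linter script even after one fails, so a single run surfaces all the problems at once.

```shell
#!/bin/bash

# Run each given linter script in turn. Unlike `set -e`, keep going after a
# failure so one run reports problems from every linter.
run_all() {
  local failed=0 script
  for script in "$@"; do
    echo "==> ${script}"
    if ! "${script}"; then
      echo "FAILED: ${script}" >&2
      failed=1
    fi
  done
  # Non-zero when at least one script failed, so make and CI notice.
  return "${failed}"
}

# Hypothetical usage inside ./scripts/lint.sh. Only lint-sources.sh is named
# in the post; the other script names are made up for illustration.
#   run_all ./scripts/lint-sources.sh ./scripts/lint-docs.sh ./scripts/lint-scripts.sh
```

Whether to stop at the first failure or collect all of them is a matter of taste. Stopping early is simpler, while collecting everything saves round trips when several linters complain at once.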

The next step was to integrate this basic script functionality into “the system” - the Makefile for local interactions and GitHub workflows for CI work. The Makefile integration was easy - I only needed to add a .PHONY rule called lint which calls ./scripts/lint.sh. The GitHub one was a bit trickier. I added a new workflow for regular pushes of branches, which would run the linter. It looked like so:

---
name: Develop

"on":
  push:
    branches:
      - 'develop'
      - 'feature/**'
      - 'bugfix/**'

jobs:
  linter:
    name: Linter
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v2
      - name: Prepare for CI
        run: sudo ./scripts/setup-for-ci.sh
      - name: Lint scripts
        run: ./scripts/lint.sh

So from the start there was the need for this new workflow, and to limit it to just the develop, feature and bugfix branches (you’ll see why in a minute). But there was also the need to define a CI environment, via the setup-for-ci.sh script. Previously the CI system just invoked Docker build commands, so the environment setup happened inside the image build, and it was a “production” environment at that. With this script there’s a reproducible way of building the dev environment on a remote machine. There’s also a setup-for-dev.sh script, which is supposed to work on a developer’s machine. Inside it you’ll find the usual pip install -r requirements.txt and other assorted commands.

As far as I know, besides Actions, there isn’t another way of sharing “steps” between workflows. But Actions seem very heavyweight for a single-repo case. Therefore I just copy-pasted the Linter step into the Release workflow. Part of good software engineering is knowing when it’s alright to repeat yourself!

With the above accomplished, there was a nice process set up - a basic linter hooked into make, the IDE and the CI system (which triggered it on every push). So it was time to extend it “horizontally” by adding more and varied linters:

  • The main focus was of course on the Python sources.
    • Here I used the following linters:
      • Pylint as the main linter.
      • Pyflakes as a secondary and somewhat simpler linter.
      • Bandit for detecting common security issues. It actually found one!
      • Pydocstyle for linting the inline docs.
      • Vulture for detecting unused code.
    • The above found a gazillion issues - mostly stylistic, but also a couple of small bugs. A lot of edits and file movements occurred on account of this, including renaming the commands directory to command, because apparently there was a deprecated ancient Python system module named commands which still tripped the linters up and made them require a different sorting order for the imports.
    • My main gripe would be that there are so many linting and analysis tools. There are about as many that I didn’t include for one reason or another.
  • For the Dockerfile I used hadolint. This I actually had to use directly as a Docker image / container. As a side note, it’s becoming more common to see CLI apps packaged as Docker images and run via docker run, where previously you’d see them packaged as debs, casks or whatever. I can’t say I blame them. docker push is much simpler than the hoops many of these systems make you jump through to publish on them.
  • For the docs I used markdownlint. This came packaged as a Ruby gem. But yeah, the documentation itself is also linted!
  • For the ./scripts directory I used shellcheck. Yup, there are linters for shells (which aren’t just /bin/true). It’s pretty neat, and it managed to find a bunch of style issues, one bug, and one place where plain bash was more efficient than piping to sed.
  • For the YAML files I used yamllint. I hooked in both the top-level files (mkdocs.yml and .readthedocs.yml) and the workflows under .github. Again, a bunch of small stylistic issues were found.
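The shellcheck item above mentions replacing a pipe to sed with plain bash. The post doesn't show the actual line, so the snippet below is only a guess at the kind of change involved: for simple substitutions, shellcheck's SC2001 check suggests `${variable//search/replace}`. The branch variable and its value are hypothetical.

```shell
#!/bin/bash

# Hypothetical input, not taken from the Jupiter scripts.
branch="feature/add-linting"

# Before: spawns a subshell and a sed process for a simple substitution.
slug_sed=$(echo "${branch}" | sed 's/\//-/g')

# After: bash parameter expansion does the same replacement in-process.
slug_bash="${branch//\//-}"

echo "${slug_sed}"   # feature-add-linting
echo "${slug_bash}"  # feature-add-linting
```

The parameter expansion form is bash-specific, so it only applies to scripts that already run under bash rather than a plain POSIX sh.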

As you can see, it’s quite a bunch of tools, and quite a bunch of infrastructure. I think it’s well worth the cost though, and perhaps something which I should have done earlier.
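To make the tool list a bit more concrete, here is a rough dry-run sketch of how each linter might be invoked. Only the pylint line comes from the script shown earlier. Every other flag and path (./docs, ./Dockerfile, the lint_commands helper itself) is an assumption based on each tool's usual command-line usage, not the project's real configuration, and the markdownlint gem is assumed to be the one that installs the mdl command.

```shell
#!/bin/bash

# Print one linter invocation per line. Only the pylint line appears in the
# post; all other flags and paths are assumptions made for illustration.
lint_commands() {
  cat <<'EOF'
pylint --jobs=8 --rcfile=./scripts/lint/pylint ./src
pyflakes ./src
bandit -r ./src
pydocstyle ./src
vulture ./src
docker run --rm -i hadolint/hadolint < ./Dockerfile
mdl ./docs
shellcheck ./scripts/*.sh
yamllint mkdocs.yml .readthedocs.yml .github
EOF
}

# Dry run: show what would be executed without needing any tool installed.
lint_commands | while IFS= read -r cmd; do
  echo "would run: ${cmd}"
done
```

In a real setup each of these lines would more likely live in its own small script next to lint-sources.sh, with tool-specific config files alongside the pylint rcfile.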

The only thing I was not able to lint was the Makefile. There are some solutions, but they’re either incomplete, unmaintained, or simply don’t do that much. If you know of any good ones, let me know in the comments below, because I’d for sure be interested in covering this last aspect.

That’s about it for this time. See you next time!