Data stores in docker, app code locally

Around 2016, I got really excited about Docker Compose for development, and I wasn’t alone. Hacker News kept talking about it. I remember walking around Madison Square Park and overhearing “docker” from every other tech worker.

This enthusiasm came from a reasonable place: we could finally get consistent, reproducible, multi-language development environments without wrangling entire VMs. A docker compose file seemed like a breath of fresh air.

Now that I’ve been on several dev teams using big Docker Compose files, and talked to friends doing the same, I think it’s not worth it. Here’s why.

The Pattern

You usually have a web app with some database dependencies, so you write a Docker Compose file:

version: '2'
services:
  web:
    build: .
    ports:
      - "5000:5000"
    env_file: .env
    volumes:
      - ./webapp:/opt/webapp
  postgres:
    image: postgres:14.5
  redis:
    image: redis:7.0-alpine

A couple of data stores and a web app. We have a host volume mount so the files on our host get mounted into the running container. As we edit files on our host, they become available “immediately” in the container without rebuilding the image.

There are lots of articles talking about this pattern. Here’s one example from Heroku.

Problem

The problem comes from that web container.

Performance

Docker on macOS has come a long way, but it’s still crazy slow, especially the file system. Things that touch the file system a lot, like npm install or next build, are so much faster running on my host than in a Docker container. I love being able to fry eggs on my MacBook, but it does make a mess of things.

Increased mental overhead

The extra networking and file system layers of Docker add a mental burden that makes development and debugging harder.

When things get funny you need your dev environment to be simple and predictable. You don’t want to be thinking:

Am I running the latest version of this code?
Are the dependencies coming from the container or the host?
I updated this file, why isn’t it updated in the container?
How do I access this network service?

That slows you down when you’re trying to build new features and debug tricky issues.

Debugging

Interactive debuggers are awesome. Many developers over-use print statements and under-use breakpoints and step debugging. While you can get interactive debuggers set up for code running in a Docker container with most IDEs, the extra layers make it harder. You don’t get around to it, or it’s broken when you need it, so you just muscle through with a bunch of console.logs.

Other Confusion

Usually your setup gets more complicated than the example. You’ve got several application containers: a backend and a frontend, plus a test container. You’ve got multiple docker-compose files, and they override each other: docker-compose.base.yml, docker-compose.dev.yml, docker-compose.test.yml.

I’ve had environment values fail to update within the containers. Then you have to figure out what docker command to run to really, really update everything. Which values change when I run versus build?

Ultimately you’re left with the worst feeling while debugging:

Is my code buggy or is my environment buggy?

The Alternative

Keep the data stores in docker-compose but run application code on the host.

For those services (Redis, Postgres, Elasticsearch) that you access over the network, Docker provides a consistent and easy way to get a development environment set up. You don’t encounter any of the downsides: you don’t need to interact with these containers very much in the course of development. Start them and leave them running.
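Here is a minimal sketch of what the trimmed-down compose file might look like, reusing the images from the earlier example. The published ports, the loopback-only binding, and the placeholder password are assumptions for illustration; adjust them to your project.

```yaml
version: '2'
services:
  postgres:
    image: postgres:14.5
    ports:
      # Publish only on the loopback interface so host code can reach it.
      - "127.0.0.1:5432:5432"
    environment:
      # Dev-only placeholder; the official image needs a password
      # (or an explicit auth method) before it will start.
      POSTGRES_PASSWORD: postgres
  redis:
    image: redis:7.0-alpine
    ports:
      - "127.0.0.1:6379:6379"
```

With something like this, `docker compose up -d` starts both stores in the background, and code running on the host would connect to localhost:5432 and localhost:6379.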

Then run all application code natively on the host along with your editor or IDE. You’ll get the best performance, IDE setup is simple, and the debugger just works.

To succeed at this, use your language-specific tools to enforce consistent environments between developers. Things like pyenv or the engines field in package.json let you pin language and package manager versions. Lock files let you pin the specific versions of your code dependencies. If you need to, write a script that checks the overall developer environment.
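As a rough sketch of such a check script, the Python below verifies the interpreter version, looks for required tools on the PATH, and confirms the data stores accept connections. The tool list, the ports, and the minimum version are hypothetical placeholders, not requirements from any real project.

```python
import shutil
import socket
import sys

# Hypothetical requirements for illustration; adjust to your project.
REQUIRED_TOOLS = ("docker", "node")
REQUIRED_PORTS = {"postgres": 5432, "redis": 6379}
MIN_PYTHON = (3, 8)


def port_open(host, port, timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_environment(tools=REQUIRED_TOOLS, ports=REQUIRED_PORTS,
                      min_python=MIN_PYTHON):
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < tuple(min_python):
        found = f"{sys.version_info[0]}.{sys.version_info[1]}"
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, found {found}"
        )
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"missing tool on PATH: {tool}")
    for name, port in ports.items():
        if not port_open("127.0.0.1", port):
            problems.append(f"{name} is not reachable on localhost:{port}")
    return problems


def main():
    """Print each problem and return a shell-style exit code."""
    problems = check_environment()
    for problem in problems:
        print(f"FAIL: {problem}")
    return 1 if problems else 0
```

To use it as a command-line check, you could end the file with `sys.exit(main())` and have new developers run it after starting the data stores.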

You’ll spend less time debugging your environment and more time shipping.

Plus you can focus on writing and maintaining a good on-boarding doc for new developers.