Seamlessly Run Composer On HHVM Inside Docker: Introducing make-docker-command

François Zaninotto
September 10, 2014
#devops #shell #tutorial

Single purpose containers for running isolated composer, bower, compass, or capistrano commands are possible, but hard to manipulate. Read on to see how we use GNU make to seamlessly run any command line tool inside Docker, or skip to the conclusion if you just want a copy/paste experience.

The Problem

In order to install, update, build or deploy a codebase, developers use more and more command line tools. For instance, a single PHP web project may need Composer, Bower, compass, and Capistrano (not to mention phpunit, php-cs-fixer, browserify, etc).

Each of these tools requires a specific tech stack:

  • Composer needs HHVM (faster than PHP)
  • Bower needs Node.js and npm
  • Compass and Capistrano need Ruby and Rubygems

Developing such a typical PHP project means installing at least 3 different programming languages. Sometimes, two distinct projects need two specific versions of a given utility. Soon you're using rvm or nvm and it becomes a nightmare.

Why not let the project define exactly the utilities required for the build phases, and use Docker containers to isolate them?

The Idea: Seamless Dockerized Commands

What if, instead of calling the local installation of composer:

$ composer install

A developer could call a command looking almost exactly the same:

$ [magic] composer install

And this command would:

  1. download a pre-configured docker image with HHVM and composer
  2. run the container, and give it access to the local code, composer cache directory, and SSH configuration
  3. inside the container, execute the composer install command

With 3 prerequisites:

  1. The [magic] tool should not modify the host environment
  2. The [magic] tool should allow for different versions of the same utility on different projects
  3. The [magic] tool should not force developers to use a new syntax for using composer

Single purpose Docker images are easy to set up (we'll get there shortly). And it turns out the [magic] command already exists: it's called make. The GNU make utility is perfect for this job; it's what developers use to automate build commands anyway, and a Makefile can be committed together with the project code. That way, code and build tools are always in sync, and a new developer can run all build commands in minutes.

But there are lots of little problems to solve before providing such a frictionless user experience. Let's go through all of them, before giving the final solution.

Packaging A Tool Into A Single Purpose Container

First of all, creating a basic single purpose Docker container is quite easy. Here is an example for a HHVM + Composer image:

FROM ubuntu:14.04

ENV HOME /root

RUN apt-get update -qq && apt-get install -y -qq git curl wget

# Install HHVM
RUN wget -O - http://dl.hhvm.com/conf/hhvm.gpg.key | apt-key add -
RUN echo deb http://dl.hhvm.com/ubuntu trusty main | tee /etc/apt/sources.list.d/hhvm.list
RUN apt-get update -qq && apt-get install -y -qq hhvm

# Install composer
RUN bash -c "wget http://getcomposer.org/composer.phar && mv composer.phar /usr/local/bin/composer"

WORKDIR /srv

ENTRYPOINT ["hhvm", "/usr/local/bin/composer"]

Nothing unusual here (refer to the Dockerfile documentation if some of these commands are new to you). To build the corresponding image, save the code into a new file named Dockerfile in a directory of your choice, then call the docker build command with the path to that directory (docker build expects the directory containing the Dockerfile, not the file itself):

$ docker build -t="docker-composer" /path/to/directory

This downloads Ubuntu, git, curl, HHVM, and composer, and it may take a few minutes the first time you run it. Subsequent builds use the Docker cache, so they are much faster.

Now that the docker-composer image is built, run a container from it by calling docker run from a project directory relying on Composer:

$ cd /path/to/myproject
$ ls
... ... composer.json ... ...
$ sudo docker run -ti \
    -v `pwd`:/srv \
    docker-composer install
Loading composer repositories with package information
Installing dependencies (including require-dev)
...

The volume mounting (-v option) maps the current host directory (result of the pwd execution) with the container WORKDIR. The docker run command will therefore execute composer install in the current directory - but within the container. Encouraging, but that's only the first step.
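To make the mapping explicit, here is a minimal sketch of a shell helper that prints the docker run command instead of executing it. The function name composer_run_command is hypothetical and only serves the illustration; the image name and the /srv mount point are the ones used above.

```shell
# Hypothetical helper: print (rather than execute) the docker run command
# that mounts a project directory on the container WORKDIR (/srv).
composer_run_command() {
    project_dir=$1
    shift
    # "$*" joins the remaining arguments (the composer command) with spaces
    printf 'docker run -ti -v %s:/srv docker-composer %s\n' "$project_dir" "$*"
}

# Show the command for the current directory
composer_run_command "$(pwd)" install
```

Printing the command first is a cheap way to check which host directory ends up mounted before actually starting a container.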

Using A Makefile To Simplify Docker Invocation

Composer does a lot of HTTP requests; it performs much better if it can keep a cache of past requests. Unless you commit changes, the modifications on a Docker container are lost between runs. To persist a composer cache between each 'docker-composer' call, you have to mount a host directory to the composer cache directory in the container:

$ mkdir -p /var/tmp/composer
# the second -v option is the new mount
$ sudo docker run -ti \
    -v `pwd`:/srv \
    -v /var/tmp/composer:/root/.composer \
    docker-composer install

Now, each call to docker run uses the local /var/tmp/composer directory as cache for composer.

This is already too much shell code to be copy/pasted each time you call composer install. Let's create a Makefile to make it more straightforward:

composer-install:
	@mkdir -p /var/tmp/composer
	@sudo docker run -ti \
		-v `pwd`:/srv \
		-v /var/tmp/composer:/root/.composer \
		docker-composer install

Now, to run the composer container, all it takes is a:

$ make composer-install

The problem is that install is only one command in a large list of possible Composer commands. Does that mean that you have to create one target for each command in the makefile (composer-install, composer-update, composer-require, etc)? That would be too much work. Instead, let's use the functions for transforming text supported by GNU make to create a catch-all target for all composer commands:

# If the first argument is "composer"...
ifeq (composer,$(firstword $(MAKECMDGOALS)))
  # use the rest as arguments for "composer"
  COMPOSER_ARGS := $(wordlist 2,$(words $(MAKECMDGOALS)),$(MAKECMDGOALS))
  # ...and turn them into do-nothing targets
  $(eval $(COMPOSER_ARGS):;@:)
endif

composer:
	@mkdir -p /var/tmp/composer
	@sudo docker run -ti \
		-v `pwd`:/srv \
		-v /var/tmp/composer:/root/.composer \
		docker-composer $(COMPOSER_ARGS)

The initial part is used to detect arguments following the "composer" word in the make invocation. For instance, calling make composer install populates the COMPOSER_ARGS variable with the string "install". So now you can call:

$ make composer install

And any other composer command. Just one limitation: If you need to pass options (like --prefer-source) to the command, make will interpret these as make command line options. You have to prefix your call with --, which means "everything after this is an argument, not an option":

$ make -- composer install --prefer-source

Note that you could achieve the same thing with an alias instead of a makefile, but that would defeat the purpose: an alias is global to the user, while a makefile is specific to a project.
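If the make text functions used in the catch-all target look cryptic, the following shell analogue may help. It is only a sketch of the same logic (the composer_args function is hypothetical and not part of the Makefile): it receives the list of goals and prints every word following "composer".

```shell
# Shell analogue of the make logic: given the full goal list, print the
# words that follow "composer" (prints nothing if the first goal differs).
composer_args() {
    if [ "${1:-}" = "composer" ]; then
        shift                 # drop the "composer" word, like wordlist 2,...
        printf '%s\n' "$*"    # the rest becomes the composer arguments
    fi
}

composer_args composer install --prefer-source
```

In the Makefile, firstword plays the role of the test on the first argument, and wordlist plays the role of shift.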

Linux-Specific Improvements and Dealing With Root

On Linux, you may see a "WARNING: No swap limit support" message when executing the command. It shouldn't prevent the command from running, but the message means that your system needs an additional LXC configuration. See the official Docker documentation for the solution.

Another problem is that Docker requires root privileges (hence the sudo in the Makefile), so calling make composer install asks for the developer's password. The recommended way to skip this step is to add your user to the docker group, then log out and log in again:

$ sudo groupadd docker
$ sudo gpasswd -a `whoami` docker
$ sudo service docker restart

You can now remove the sudo in the Makefile. But you're not done yet. Docker runs the container as the root user, so all the files generated by the container belong to the root:root user. Take a look at the vendor/ directory generated by the make composer install command:

$ ls -al
drwxr-xr-x 10 francois staff  340 sep  5 08:01 ./
drwxr-xr-x 73 francois staff 2,5K sep  4 18:42 ../
-rw-rw-r--  1 francois staff  684 sep  5 08:01 Makefile
-rw-rw-r--  1 francois staff  305 sep  4 21:53 composer.json
-rw-r--r--  1 root     root  7,4K sep  5 06:42 composer.lock
drwxr-xr-x  7 root     root   238 sep  4 23:27 vendor/

That's bad if you're setting up dependencies for a web server for instance. This is a well known Docker limitation, which will hopefully be addressed soon.

In the meantime, the only way to bypass the obstacle is to create a user within the container with the same group id and user id as the current host user, and execute commands with this new user. So for instance if my user id is 1000 and my group id is 20, the container should run the following commands:

# create dummy group with id 20
$ groupadd -f -g 20 dummy
# create dummy user with id 1000, and add it to the dummy group
$ useradd -u 1000 -g dummy dummy
# make sure the dummy user owns his home directory
$ mkdir --parent /home/dummy
$ chown -R dummy:dummy /home/dummy
# run HHVM+composer with the dummy user
$ sudo -u dummy hhvm /usr/local/bin/composer $(COMPOSER_ARGS)

That sounds scary if you have to do it by hand every time you run the container, but fortunately the Makefile can automate that:

# read the current user id and group id
GROUP_ID = $(shell id -g)
USER_ID  = $(shell id -u)
CONTAINER_USERNAME  = dummy
CONTAINER_GROUPNAME = dummy
HOMEDIR = /home/$(CONTAINER_USERNAME)
# forge a command to be executed inside the container
CREATE_USER_COMMAND = \
  groupadd -f -g $(GROUP_ID) $(CONTAINER_GROUPNAME) && \
  useradd -u $(USER_ID) -g $(CONTAINER_GROUPNAME) $(CONTAINER_USERNAME) && \
  mkdir --parent $(HOMEDIR) && \
  chown -R $(CONTAINER_USERNAME):$(CONTAINER_GROUPNAME) $(HOMEDIR) && \
  sudo -u $(CONTAINER_USERNAME)

# ...

composer:
	@mkdir --parent /var/tmp/composer
	@docker run -ti \
		-v `pwd`:/srv \
		-v /var/tmp/composer:$(HOMEDIR)/.composer \
		docker-composer bash -c \
			'$(CREATE_USER_COMMAND) hhvm /usr/local/bin/composer $(COMPOSER_ARGS)'
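To check what make actually forges, it can help to build the same string in plain shell. The sketch below is an illustration under assumptions: the create_user_command function is hypothetical, and it uses the short -p flag of mkdir where the Makefile uses the long form.

```shell
# Hypothetical helper: print the command prefix that creates a user
# matching the given ids inside the container. Nothing is executed.
create_user_command() {
    user_id=$1
    group_id=$2
    name=${3:-dummy}    # user and group share the same name, as in the Makefile
    printf '%s && %s && %s && %s && %s\n' \
        "groupadd -f -g $group_id $name" \
        "useradd -u $user_id -g $name $name" \
        "mkdir -p /home/$name" \
        "chown -R $name:$name /home/$name" \
        "sudo -u $name"
}

# Forge the prefix for the current host user
create_user_command "$(id -u)" "$(id -g)"
```

The printed prefix ends with sudo -u and the user name, so whatever is appended to it runs as the newly created user.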

The call to docker run includes the CREATE_USER_COMMAND, which means that the container can't use the composer entry point anymore. The Dockerfile must be modified to remove the final ENTRYPOINT line:

FROM ubuntu:14.04

ENV HOME /root

RUN apt-get update -qq && apt-get install -y -qq git curl wget

# Install HHVM
RUN wget -O - http://dl.hhvm.com/conf/hhvm.gpg.key | apt-key add -
RUN echo deb http://dl.hhvm.com/ubuntu trusty main | tee /etc/apt/sources.list.d/hhvm.list
RUN apt-get update -qq && apt-get install -y -qq hhvm

# Install composer
RUN bash -c "wget http://getcomposer.org/composer.phar && mv composer.phar /usr/local/bin/composer"

WORKDIR /srv

Now calling make composer install doesn't ask for a password anymore, and generates a vendor directory owned by the current user.

$ make composer install
$ ls -al
drwxr-xr-x 10 francois staff  340 sep  5 08:01 ./
drwxr-xr-x 73 francois staff 2,5K sep  4 18:42 ../
-rw-rw-r--  1 francois staff  684 sep  5 08:01 Makefile
-rw-rw-r--  1 francois staff  305 sep  4 21:53 composer.json
-rw-r--r--  1 francois staff 7,4K sep  5 06:42 composer.lock
drwxr-xr-x  7 francois staff  238 sep  4 23:27 vendor/
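Rather than eyeballing the ls output, you can count the entries that the current user does not own. This is a minimal sketch; the count_foreign_files function is a hypothetical name, and it relies only on the standard find, wc, and tr utilities.

```shell
# Count the entries under a directory that are NOT owned by the current user.
# A result of 0 means the container did not leave root-owned files behind.
count_foreign_files() {
    find "$1" ! -user "$(id -u)" | wc -l | tr -d ' '
}
```

Calling count_foreign_files vendor after a make composer install should print 0 once the dummy user trick is in place.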

OS X Specific Improvements

If you're on OS X, you'll need a custom version of boot2docker, allowing volume mounting, to get this working.

Since boot2docker uses VirtualBox user mapping, you don't need to use sudo when calling the docker command. Additionally, the files generated in the mounted volume by the container already belong to the host user and group automatically. But the modified boot2docker only allows sharing files under the /Users/ directory, so the temporary composer cache needs to move there. Here is an OS X compatible Makefile:

# ...

composer:
	@mkdir -p ~/tmp/composer
	@docker run -ti \
		-v `pwd`:/srv \
		-v ~/tmp/composer:/root/.composer \
		docker-composer bash -c \
			'hhvm /usr/local/bin/composer $(COMPOSER_ARGS)'

The /Users/ directory limitation is a hard one: OS X users won't be able to run the dockerized composer from outside this directory. Let's hope that the core Docker team can find an alternative soon (more on that later).

One Makefile For All Platforms

The Makefile should work both on OS X and Linux platforms, so that it can be committed with the project. Let's add OS detection based on the result of the docker info command:

ifeq (Boot2Docker, $(findstring Boot2Docker, $(shell docker info)))
  PLATFORM := OSX
else
  PLATFORM := Linux
endif

ifeq ($(PLATFORM), OSX)
  COMPOSER_CACHE_DIR = ~/tmp/composer
  HOMEDIR = /root
  CREATE_USER_COMMAND =
else
  COMPOSER_CACHE_DIR = /var/tmp/composer
  CONTAINER_USERNAME = dummy
  CONTAINER_GROUPNAME = dummy
  HOMEDIR = /home/$(CONTAINER_USERNAME)
  GROUP_ID = $(shell id -g)
  USER_ID = $(shell id -u)
  CREATE_USER_COMMAND = \
    groupadd -f -g $(GROUP_ID) $(CONTAINER_GROUPNAME) && \
    useradd -u $(USER_ID) -g $(CONTAINER_GROUPNAME) $(CONTAINER_USERNAME) && \
    mkdir --parent $(HOMEDIR) && \
    chown -R $(CONTAINER_USERNAME):$(CONTAINER_GROUPNAME) $(HOMEDIR) && \
    sudo -u $(CONTAINER_USERNAME)
endif

# ...

composer:
	@mkdir -p $(COMPOSER_CACHE_DIR)
	@docker run -ti \
		-v `pwd`:/srv \
		-v $(COMPOSER_CACHE_DIR):$(HOMEDIR)/.composer \
		docker-composer bash -c \
			'$(CREATE_USER_COMMAND) hhvm /usr/local/bin/composer $(COMPOSER_ARGS)'
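The detection itself is a plain substring search, which the following shell sketch reproduces. The detect_platform function is hypothetical; it takes the text of docker info as its argument, so it can be tried without a running Docker daemon.

```shell
# Shell analogue of the findstring test: report OSX when the docker info
# output mentions Boot2Docker, Linux otherwise.
detect_platform() {
    case "$1" in
        *Boot2Docker*) echo OSX ;;
        *)             echo Linux ;;
    esac
}

# Typical call (requires docker on the host):
#   detect_platform "$(docker info 2>/dev/null)"
```

Note that any host where docker info does not mention Boot2Docker is treated as Linux, exactly like the else branch of the Makefile.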

Using The Docker Hub Registry

Instead of writing a Dockerfile, and waiting for the apt-get commands to execute the first time you build the image, you can use the image we have published on Docker Hub. It's called marmelab/composer-hhvm, and it works just as described above. To use it, just change the image name in the composer target of the Makefile:

composer:
	@mkdir -p $(COMPOSER_CACHE_DIR)
	@docker run -ti \
		-v `pwd`:/srv \
		-v $(COMPOSER_CACHE_DIR):$(HOMEDIR)/.composer \
		marmelab/composer-hhvm bash -c \
			'$(CREATE_USER_COMMAND) hhvm /usr/local/bin/composer $(COMPOSER_ARGS)'

Next time you run make composer install, docker will pull all the filesystem layers for this image from Docker Hub, and after 4 to 5 minutes of download you'll have a working composer container, with no installation step required.

Sharing Identities With Docker Commands

Composer, like many other commands, uses your SSH keys to authenticate to third-party services. For instance, a composer.json may refer to private GitHub repositories. Unless you give access to your private keys to the container, the dockerized Composer can't fetch these private repositories.

The solution is to share the SSH identity and known hosts files between the host and the container. There is one pitfall: these files must be readable by the dummy user, so a simple file mount via the docker -v command isn't enough. Once again, the Makefile comes to the rescue to copy the mounted SSH keys (to avoid modifying the original on the host), and alter the file rights:

# location of SSH identity on the host
DOCKER_SSH_IDENTITY ?= ~/.ssh/id_rsa
DOCKER_SSH_KNOWN_HOSTS ?= ~/.ssh/known_hosts
# copy mounted SSH files to the dummy user
ADD_SSH_ACCESS_COMMAND = \
  mkdir --parent $(HOMEDIR)/.ssh && \
  test -e /var/tmp/id && cp /var/tmp/id $(HOMEDIR)/.ssh/id_rsa ; \
  test -e /var/tmp/known_hosts && cp /var/tmp/known_hosts $(HOMEDIR)/.ssh/known_hosts ; \
  test -e $(HOMEDIR)/.ssh/id_rsa && chmod 600 $(HOMEDIR)/.ssh/id_rsa ;

# utility commands
AUTHORIZE_HOME_DIR_COMMAND = chown -R $(CONTAINER_USERNAME):$(CONTAINER_GROUPNAME) $(HOMEDIR) &&
EXECUTE_AS = sudo -u $(CONTAINER_USERNAME)

# ...

# the two mounts under /var/tmp are new
composer:
	@mkdir -p $(COMPOSER_CACHE_DIR)
	@docker run -ti \
		-v `pwd`:/srv \
		-v $(COMPOSER_CACHE_DIR):$(HOMEDIR)/.composer \
		-v $(DOCKER_SSH_IDENTITY):/var/tmp/id \
		-v $(DOCKER_SSH_KNOWN_HOSTS):/var/tmp/known_hosts \
		marmelab/composer-hhvm bash -c '\
			$(CREATE_USER_COMMAND) \
			$(ADD_SSH_ACCESS_COMMAND) \
			$(AUTHORIZE_HOME_DIR_COMMAND) \
			$(EXECUTE_AS) hhvm /usr/local/bin/composer $(COMPOSER_ARGS)'

Now, running make composer install uses the SSH identity of the local ~/.ssh/id_rsa file, and the known hosts from ~/.ssh/known_hosts. This allows the dockerized composer to connect to private GitHub repositories.
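The copy-then-restrict step of ADD_SSH_ACCESS_COMMAND can be tried outside Docker. Here is a sketch under assumptions: install_identity is a hypothetical function, and it takes the mounted key path and the target home directory as arguments instead of reading make variables.

```shell
# Copy a mounted identity file into <home>/.ssh and restrict its rights.
# The source file is left untouched; a missing source is silently ignored.
install_identity() {
    src=$1
    home=$2
    mkdir -p "$home/.ssh"
    if [ -e "$src" ]; then
        cp "$src" "$home/.ssh/id_rsa"
        chmod 600 "$home/.ssh/id_rsa"   # ssh refuses keys readable by others
    fi
}
```

Working on a copy is what keeps the permissions of the original key on the host unchanged.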

One small glitch remains: when connecting to GitHub with the SSH key copied for the dummy user, the container SSH client asks the user for the key passphrase. This is normal, because the container doesn't have access to the host OS keychain, which usually deals with key passphrases silently.

The solution is to create a new identity file with no passphrase, and use this key instead of the default SSH key.

Here is how to create a new SSH key pair without passphrase:

$ cd ~/.ssh
$ mkdir docker_identity && cd docker_identity
$ ssh-keygen -t rsa -f id_rsa -N ''
$ ssh-keyscan -t rsa github.com > known_hosts
$ ls
id_rsa  id_rsa.pub  known_hosts

Add the content of the id_rsa.pub as a new SSH key on GitHub, then set the DOCKER_SSH_IDENTITY and DOCKER_SSH_KNOWN_HOSTS environment variables on the host to point to these new files:

$ export DOCKER_SSH_IDENTITY="~/.ssh/docker_identity/id_rsa"
$ export DOCKER_SSH_KNOWN_HOSTS="~/.ssh/docker_identity/known_hosts"

Thanks to the ?= magic, these paths will replace the default locations defined in the Makefile.

Now you can run composer install without ever having to enter your passphrase anymore.
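For readers more familiar with shell than with make, the ?= operator behaves much like the ${VAR-default} expansion: the default applies only when the variable is not defined at all. The sketch below illustrates that analogy; resolve_identity is a hypothetical function, not something the Makefile contains.

```shell
# Shell analogue of "DOCKER_SSH_IDENTITY ?= ~/.ssh/id_rsa": keep the value
# coming from the environment, fall back to the default when it is unset.
resolve_identity() {
    # no colon in the expansion: a defined-but-empty value is kept as is
    printf '%s\n' "${DOCKER_SSH_IDENTITY-$HOME/.ssh/id_rsa}"
}
```

With the export commands above in effect, resolve_identity prints the path of the passphrase-less key; in a fresh shell it prints the default location.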

Performance

What's the overhead of Docker and the make logic versus a local execution? It really depends on the nature of the task. The main difference is between tasks making a lot of disk I/Os (like a composer install with cache), and tasks with few I/Os, or mostly network I/Os (like a composer update with no composer.lock).

On Linux, the overhead of the make logic and LXCs is low - it may add up to a few seconds to the execution of the command. Disk I/O intensive commands don't suffer too much, and there can be a big benefit in using HHVM instead of PHP on large projects.

$ git clone git@github.com:fabpot/Goutte.git && cd Goutte
$ rm -Rf ~/.composer/cache/

# reference measurements on the host
# Few Disk I/Os
$ time composer update --prefer-dist
  11.36 real         0.02 user         0.02 sys
$ rm -Rf vendor
# Many Disk I/Os
$ time composer install --prefer-dist
  3.40 real          0.02 user         0.02 sys

# dockerized command
# Few Disk I/Os
$ time make -- composer update --prefer-dist
  16.79 real         0.02 user         0.02 sys
$ rm -Rf vendor
# Many Disk I/Os
$ time make -- composer install --prefer-dist
  4.62 real          0.02 user         0.04 sys

On OS X, the poor performance of VirtualBox shared folders makes the dockerized composer slower than the local one. The soon-to-be-released FUSE support in docker may reduce this overhead drastically.

$ git clone git@github.com:fabpot/Goutte.git && cd Goutte
$ rm -Rf ~/.composer/cache/

# reference measurements on the host
# Few Disk I/Os
$ time composer update --prefer-dist
  22.05 real         6.12 user         0.49 sys
$ rm -Rf vendor
# Many Disk I/Os
$ time composer install --prefer-dist
  0.83 real         0.41 user         0.20 sys

# dockerized command
# Few Disk I/Os
$ time make -- composer update --prefer-dist
  23.82 real         0.01 user         0.02 sys
$ rm -Rf vendor
# Many Disk I/Os
$ time make -- composer install --prefer-dist
  8.74 real         0.01 user         0.01 sys

The install command does much more disk I/O than the update command, so the slow shared folders penalize it the most: on OS X, the dockerized install is about ten times slower than the local one.
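To compare the measurements above, the relative overhead can be computed from the "real" times. This is a small sketch using awk; the overhead_pct function is hypothetical, and the figures passed to it are the ones from the listings above.

```shell
# Relative overhead of the dockerized run, in percent of the host run time.
overhead_pct() {
    LC_ALL=C awk -v host="$1" -v docker="$2" \
        'BEGIN { printf "%.1f\n", (docker - host) / host * 100 }'
}

overhead_pct 11.36 16.79   # Linux, composer update:  47.8
overhead_pct 3.40 4.62     # Linux, composer install: 35.9
overhead_pct 22.05 23.82   # OS X,  composer update:  8.0
overhead_pct 0.83 8.74     # OS X,  composer install: 953.0
```

These are single runs, so the percentages only give an order of magnitude, but they make the OS X shared folder penalty on the install command obvious.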

And of course, the overhead is much larger the first time you run the command, since docker must download the marmelab/composer-hhvm image from Docker Hub. If you create similar docker images for other commands, docker will be smart enough to use its local cache to avoid downloading the same dependencies twice (provided commands appear in the same order in all Dockerfiles).

Conclusion

Running a dockerized composer command seamlessly is entirely possible. We've released the complete Makefile that allows it on a new project called make-docker-command. Using this Makefile, you can also run dockerized PHPUnit, Bower, and Compass. Pull requests to add more commands are welcome.

A dockerized command has many benefits (zero installation, same execution environment for all developers, sometimes even better performance) and goes in the direction of microservices. It is lighter than using one VM per project, and can even be extended to server commands.

Docker volume mapping still has rough edges, especially on OS X. It currently takes a lot of boilerplate code to overcome its limitations. But the docker team is hard at work improving this system with the FUSE volume mounting. It's probable that in the near future, a large part of the Makefile code shown here will become irrelevant because Docker will handle mounted directories and user rights correctly.

All in all, dockerized commands are a very promising technique, and they show how docker can help developers beyond just deployment. I hope authors of command line tools will publish Makefiles and Docker images for their tools to make them easier to use.

Many thanks to Brice Bernard for his ideas and support for writing this article.
