The Dockerfile

This article is part 3 in a series: Docker



Writing & building a dockerfile

If you have read the two earlier posts in this series, you should know, at least to some extent, what the difference between a container, an image and a docker-compose file is.
In this post I wish to talk about the Dockerfile. The Dockerfile is a file which describes the image; it basically tells the docker engine how to build it.
It contains a set of commands which the engine executes and puts into layers, which is what the final image consists of.

To start off, we can look at a very basic dockerfile.

FROM jitesoft/node-base:latest
LABEL maintainer="John Doe <[email protected]>"

ENV NODE_ENV="production"

ADD . /app

RUN npm install -g gulp \
    && cd /app/src \
    && npm install \
    && cd .. \
    && rm -r /app/src

WORKDIR /app/dist
EXPOSE 9000
ENTRYPOINT ["node"]
CMD ["index.js", "start"]

The file above uses the most vital parts of the dockerfile syntax.

FROM indicates which image the file is derived from. That means that you can extend another image, even one you have not written yourself. A FROM instruction is always there, there is no way to get away from it! You can use scratch though, if you wish to have a totally empty docker image to start off with.
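
As a rough sketch of what starting from scratch can look like: the hello binary below is a hypothetical, statically linked executable that is assumed to sit next to the Dockerfile. It is not part of the example above.

```dockerfile
# Start from a completely empty image; there is no shell,
# no package manager, nothing.
FROM scratch

# "hello" is a hypothetical statically linked binary in the build context.
COPY hello /hello

# Exec form is required here, since there is no shell to run a shell form.
ENTRYPOINT ["/hello"]
```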

LABEL attaches a label to the image. Image labels can later be viewed through the docker inspect command. In this case it is used to set a maintainer label, which consists of the name and email address of the person maintaining the image. This is not required, but it's a common practice.
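
A small sketch of setting more than one label and reading them back. The email address and the version label are placeholders made up for illustration, and myname/imagename is assumed to be the tag the image was built with.

```dockerfile
FROM jitesoft/node-base:latest

# Several labels can be set in one instruction.
# The address and the version value are placeholders.
LABEL maintainer="John Doe <john.doe@example.com>" \
      version="1.0.0"

# After building, the labels can be printed with:
#   docker inspect --format '{{ json .Config.Labels }}' myname/imagename
```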

ENV is used to add environment variables to the image. An environment variable that is defined in the file can be overridden with the -e flag when running the container, but will default to what it is set to in the file. It's also possible to use the defined env variable during build time.
There is another directive which is used much like ENV: ARG. ARG defines a build time variable which does not persist into the container.
It can be changed when the image is built by passing a new value in with a --build-arg flag.
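
To make the difference between ENV and ARG a bit more concrete, here is a minimal sketch. The APP_VERSION name and its default value are made up for illustration; the base image and NODE_ENV come from the example above.

```dockerfile
FROM jitesoft/node-base:latest

# Build-time only. Can be changed with:
#   docker build --build-arg APP_VERSION=2.0.0 .
# APP_VERSION is a hypothetical name used for illustration.
ARG APP_VERSION="1.0.0"

# Persists into the container. Can be overridden with:
#   docker run -e NODE_ENV=development ...
ENV NODE_ENV="production"

# Both are available to commands that run during the build.
RUN echo "Building version ${APP_VERSION} for ${NODE_ENV}"
```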

To add data from the computer building the image, the ADD or COPY directive is used. It takes the local data and copies it into the docker image at the specified location. In the example image, we copy the whole directory that the Dockerfile is in (the build context) and put it in the /app directory.
This way the data is baked into the image itself, so it is already there before any container is created from it.
This can be useful to move install files or scripts or basically anything over from the repository or computer that the author works from. It can also be used to build a 100% static image without any outside data connected to it, which is often a very good way to go if you do not need to persist the data in a local directory on the computer.
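
The two directives are close but not identical. A short sketch of the difference, where vendor.tar.gz is a hypothetical archive assumed to exist in the build context:

```dockerfile
FROM jitesoft/node-base:latest

# COPY only copies files and directories from the build context.
COPY . /app

# ADD does the same, but it also unpacks a local tar archive
# into the target directory. The archive name is hypothetical.
ADD vendor.tar.gz /app/vendor/
```

When plain copying is all you need, COPY is usually the more predictable choice.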

Next the RUN directive is used. RUN is a way for the file to define what should happen while the image is built. Everything written after RUN (until end of line) will be executed in the image's shell during build. This is used to install things, build things, well, basically anything you wish to do before the image is used to create a container.
As you might see, most of the lines in the RUN command end with a \. The \ indicates that the next line is part of the current RUN command. That way it's possible to split a long command over multiple lines instead of writing one long line of text, which quickly gets hard to manage when it grows. It also means that you do not need a whole lot of separate RUN commands, each of which would add another layer to the image, which is nice.
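
A small sketch of why chaining matters, assuming the same /app/src layout as in the example above:

```dockerfile
FROM jitesoft/node-base:latest
COPY . /app

# One RUN instruction, one layer: the backslashes join the lines
# and && stops the chain as soon as one command fails.
RUN cd /app/src \
    && npm install \
    && npm cache clean --force

# Split into separate instructions, each RUN would create its own layer
# and start in the working directory again, so the cd would be lost:
# RUN cd /app/src
# RUN npm install
```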

After RUN a WORKDIR is set.
It's possible to set a workdir multiple times in a dockerfile. Each time, it sets the directory that the following commands run in, and the last one decides where the container will start.
If WORKDIR /app/src is set at the end, the container will start up in /app/src. In the above example we set it to /app/dist, which is good, because in the command above we removed the /app/src directory!
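
As a minimal sketch of how several WORKDIR instructions interact (the directories are the ones from the example above, and WORKDIR creates them if they do not exist):

```dockerfile
FROM jitesoft/node-base:latest

WORKDIR /app/src
# Runs inside /app/src, no "cd" needed.
RUN pwd

WORKDIR /app/dist
# The last WORKDIR wins: any later RUN, the ENTRYPOINT and CMD,
# and the running container all start in /app/dist.
RUN pwd
```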

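EXPOSE declares which port the container listens on. It does not publish the port to the host by itself; that is done with the -p flag (or -P, which publishes all exposed ports) when running the container. Other containers on the same docker network can still reach the container while it is running, so only the ports you explicitly publish are reachable from the outside. Giving an extra layer of security!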
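
A short sketch of how EXPOSE and publishing fit together. The host port 8080 is an arbitrary choice for illustration, and myname/imagename is assumed to be the tag of the built image.

```dockerfile
FROM jitesoft/node-base:latest

# Documents that the process inside the container listens on port 9000.
EXPOSE 9000

# Publishing to the host happens at run time, for example:
#   docker run --rm -p 8080:9000 myname/imagename   (host 8080 -> container 9000)
#   docker run --rm -P myname/imagename             (all exposed ports, random host ports)
```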

ENTRYPOINT can be written in two forms. In shell form: command a b or as an array (aka exec form), which is the preferred way: ["command", "a", "b"]. It defines the command that the container runs; in exec form, anything the user of the image passes after the image name in the docker run command is appended to it as arguments.
If you set the entrypoint to ls and the container is started as docker run --rm container/name /app, the contents of the /app directory will be printed out in the console, as the final command will be ls /app. The entrypoint can be overridden with the --entrypoint flag when running the container, which could be useful to, for example, enter the container's shell instead of executing a command.
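
The ls case can be sketched like this. It assumes the base image ships ls and sh, and container/name is the placeholder image name used in the text.

```dockerfile
FROM jitesoft/node-base:latest

# Exec form: no shell is involved, and the arguments given to
# "docker run" are appended after "ls".
ENTRYPOINT ["ls"]

# docker run --rm container/name /app
#   runs: ls /app
# docker run --rm -it --entrypoint sh container/name
#   replaces the entrypoint and opens a shell instead
```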

CMD and ENTRYPOINT are often used together. In our case, we use the entrypoint to define which executable (node) we want to run and we use the CMD to define which default arguments to pass in to the executable (in this case index.js and start).
If using an entrypoint and a cmd together as above, the CMD should be written in exec form (["argument", "argument"]), but if acting on its own, the “shell form” (command) can be used (node index.js start).
If you define more than one CMD in the file, only the last one will be used. It will only be used if no command is sent in through the docker run command; if any command is passed, the container will use that instead of CMD. So in the above example, we could just as well pass in another file when we run the container: docker run --rm container/name notindex.js and it would run the node executable (due to the entrypoint) on the notindex.js file instead.
CMD could easily be mistaken for the RUN command, as cmd is a short form of command, but the RUN command is used to tell the image what to do during build, while the CMD command is used to tell the container what to do when it starts.
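
Put together, the combination can be sketched as below, with the resulting commands written out as comments. The file names come from the text above; myname/imagename is assumed to be the tag of the built image.

```dockerfile
FROM jitesoft/node-base:latest
WORKDIR /app/dist

# The executable that always runs.
ENTRYPOINT ["node"]

# Default arguments, used only when "docker run" passes none.
CMD ["index.js", "start"]

# docker run --rm myname/imagename
#   runs: node index.js start
# docker run --rm myname/imagename notindex.js
#   runs: node notindex.js   (the whole CMD is replaced)
```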

So, when we are ready to build the image, we just fire: docker build -t myname/imagename . from the directory containing the Dockerfile (the trailing dot is the build context). It will start the build and tag the image with myname/imagename, making it easy to run with docker run --rm myname/imagename.

Publishing docker images

When we have our docker image built and ready, we might wish to push it up to the docker hub registry (or maybe another one?). This is easily done by first logging in to docker hub through the terminal:

docker login [optional registry url, defaults to docker hub]

Enter your username and password. OBSERVE that if the login asks for an email, it actually expects the username!

When we are logged in, we can push the image by using the docker push command:

docker push myname/imagename:tag

Here myname is your docker hub username and imagename is the name of the image you are pushing. The tag is optional, like latest or something similar.

There are ways to automatically build images from repositories on docker hub, but I have decided not to go through that in this post. Just check the docs and it should be quite easy to figure out!