Dockerfile Instructions

In this post we explain the different Dockerfile instructions you can use when building a container image. These are the commands you put in the Dockerfile.

The syntax for instructions and their arguments in a Dockerfile is:

# Comment
INSTRUCTION arguments

Instructions are not case-sensitive, so they can be written in lowercase or uppercase. By convention they are written in uppercase to distinguish them from their arguments.

Dockerfile example

So before we get into each instruction, let me show you a Dockerfile example so we know what we are talking about:

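FROM ubuntu
# Install Python so the CMD below has an interpreter to run
RUN apt-get update && apt-get -y install python3
RUN mkdir -p /data/myscript
WORKDIR /data/myscript
# Copy the script from the build context (app.py is assumed to exist there)
COPY app.py .
CMD ["python3", "app.py"]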

There are a variety of instructions we can put in our Dockerfile. These include FROM, RUN, WORKDIR, COPY, ADD, VOLUME, CMD, ENTRYPOINT, USER, ONBUILD, LABEL, ARG, SHELL, HEALTHCHECK, EXPOSE and ENV. The full list of available instructions is in the official Dockerfile reference. We are going to describe the main ones below.

FROM

The FROM instruction initializes a new build stage and sets the Base Image for subsequent instructions. As such, a valid Dockerfile must start with a FROM instruction.

Syntax:

FROM <image> [AS <name>]
FROM <image>[:<tag>] [AS <name>]
FROM <image>[@<digest>] [AS <name>]
  • ARG is the only instruction that may precede FROM in the Dockerfile.
  • FROM can appear multiple times within a single Dockerfile to create multiple images or use one build stage as a dependency for another. Simply make a note of the last image ID output by the commit before each new FROM instruction. Each FROM instruction clears any state created by previous instructions.
  • Optionally a name can be given to a new build stage by adding AS name to the FROM instruction. The name can be used in subsequent FROM and COPY --from=<name|index> instructions to refer to the image built in this stage.
  • The tag or digest values are optional. If you omit either of them, the builder assumes a latest tag by default. The builder returns an error if it cannot find the tag value.
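
As a minimal sketch of the points above, the following Dockerfile declares an ARG before FROM, names the first build stage, and copies a file out of it with COPY --from. The stage name builder, the BASE_TAG argument and the artifact path are illustrative assumptions, not part of the original example.

```dockerfile
# ARG is the only instruction allowed before the first FROM
ARG BASE_TAG=22.04

# First stage, named "builder" (hypothetical name)
FROM ubuntu:${BASE_TAG} AS builder
RUN mkdir -p /out && echo "built artifact" > /out/artifact.txt

# Second stage starts from a clean base; state from "builder" does not carry over
FROM ubuntu:${BASE_TAG}
# Pull a single file out of the earlier stage by its name
COPY --from=builder /out/artifact.txt /opt/artifact.txt
CMD ["cat", "/opt/artifact.txt"]
```

Only the last stage ends up in the final image, which is a common way to keep build-only tooling out of what you ship.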

RUN

RUN has 2 forms:

  • RUN <command> (shell form, the command is run in a shell, which by default is /bin/sh -c on Linux or cmd /S /C on Windows)
  • RUN ["executable", "param1", "param2"] (exec form)

The RUN instruction will execute any commands in a new layer on top of the current image and commit the results. The resulting committed image will be used for the next step in the Dockerfile.

Layering RUN instructions and generating commits conforms to the core concepts of Docker where commits are cheap and containers can be created from any point in an image’s history, much like source control.

The exec form makes it possible to avoid shell string munging, and to RUN commands using a base image that does not contain the specified shell executable.
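
To make the two forms concrete, here is a small sketch that uses both in one Dockerfile. The package (curl) and the directory are only examples chosen for illustration.

```dockerfile
FROM ubuntu:22.04

# Shell form: executed as /bin/sh -c "...", so && and shell expansion work
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*

# Exec form: the JSON array is executed directly, with no shell in between
RUN ["mkdir", "-p", "/data/myscript"]
```

Each RUN produces its own layer, so chaining related commands with && in one shell-form RUN is a common way to keep the number of layers down.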

CMD

The CMD instruction specifies the command to run when a container is launched. It is similar to the RUN instruction, but instead of running the command while the image is being built, it sets the command to run when the container starts. This is much like passing a command to docker run, for example:

docker run -i -t ubuntu /bin/bash

This would be expressed in the Dockerfile as:

CMD ["/bin/bash"]

You can also specify parameters to the command, like so:

CMD ["/bin/bash", "-l"]

Here we’re passing the -l flag to the /bin/bash command.

You’ll note that the command is contained in an array. This tells Docker to run the command as-is, without a shell. You can also specify the CMD instruction without an array, in which case Docker will prepend /bin/sh -c to the command. This may result in unexpected behavior when the command is executed, for example because the shell, rather than your program, becomes the container’s main process and signals may not reach the program. As a result, it is recommended that you always use the array syntax.

Lastly, it’s important to understand that we can override the CMD instruction using the docker run command. If we specify a CMD in our Dockerfile and a command on the docker run command line, the command line overrides the Dockerfile’s CMD instruction.
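
The override behavior can be tried with a sketch like the one below. The image name cmd-demo used in the comments is a hypothetical tag chosen for this example.

```dockerfile
FROM ubuntu:22.04
# Exec form: runs echo directly, without /bin/sh -c
CMD ["echo", "hello from CMD"]

# Build and run (the image name "cmd-demo" is hypothetical):
#   docker build -t cmd-demo .
#   docker run --rm cmd-demo                  # runs the CMD above
#   docker run --rm cmd-demo echo overridden  # the arguments replace CMD
```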

ENTRYPOINT

ENTRYPOINT has two forms:

  • ENTRYPOINT ["executable", "param1", "param2"] (exec form, preferred)
  • ENTRYPOINT command param1 param2 (shell form)

An ENTRYPOINT allows you to configure a container that will run as an executable.

For example, the following will start nginx with its default content, listening on port 80:

docker run -i -t --rm -p 80:80 nginx

ENTRYPOINT looks similar to CMD, because it also allows you to specify a command with parameters. The difference is that the ENTRYPOINT command and its parameters are not ignored when the container is run with command-line arguments. Instead, with the exec form, those arguments are appended to the ENTRYPOINT and replace any CMD.
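
A common pattern is to combine the two: ENTRYPOINT fixes the executable and CMD supplies default arguments that the user may replace. The sketch below assumes a hypothetical image tag, entry-demo, in its comments.

```dockerfile
FROM ubuntu:22.04
# The executable and its fixed first argument
ENTRYPOINT ["echo", "Hello,"]
# Default argument, used only when docker run is given no arguments
CMD ["world"]

# Build and run (the image name "entry-demo" is hypothetical):
#   docker build -t entry-demo .
#   docker run --rm entry-demo          # prints: Hello, world
#   docker run --rm entry-demo Docker   # prints: Hello, Docker
```

If you do need to replace the ENTRYPOINT itself at run time, docker run accepts an --entrypoint flag.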

WORKDIR

WORKDIR /path/to/workdir

The WORKDIR instruction sets the working directory for any RUN, CMD, ENTRYPOINT, COPY and ADD instructions that follow it in the Dockerfile. If the WORKDIR doesn’t exist, it will be created even if it’s not used in any subsequent Dockerfile instruction.

The WORKDIR instruction can be used multiple times in a Dockerfile. If a relative path is provided, it will be relative to the path of the previous WORKDIR instruction. For example:

WORKDIR /a
WORKDIR b
WORKDIR c
RUN pwd

The output of the final pwd command in this Dockerfile would be /a/b/c.

USER

USER <user>[:<group>] or
USER <UID>[:<GID>]

The USER instruction sets the user name (or UID) and optionally the user group (or GID) to use when running the image and for any RUN, CMD and ENTRYPOINT instructions that follow it in the Dockerfile.
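
A typical use is to create an unprivileged account and switch to it so the container does not run as root. This is a sketch under the assumption of a Debian or Ubuntu base image; the account name appuser is hypothetical.

```dockerfile
FROM ubuntu:22.04
# Create an unprivileged group and user (the name "appuser" is hypothetical)
RUN groupadd --system appuser \
 && useradd --system --gid appuser --create-home appuser
# Everything below, and the running container, uses this user and group
USER appuser:appuser
# Executed as appuser, not root
RUN whoami
CMD ["id"]
```

Instructions placed before USER still run as the previous user (root by default), which is why the account is created first.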

ONBUILD

ONBUILD [INSTRUCTION]

The ONBUILD instruction adds to the image a trigger instruction to be executed at a later time, when the image is used as the base for another build. The trigger will be executed in the context of the downstream build, as if it had been inserted immediately after the FROM instruction in the downstream Dockerfile.

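Any build instruction can be registered as a trigger, with a few exceptions: ONBUILD cannot trigger FROM or MAINTAINER instructions, and chaining ONBUILD ONBUILD is not allowed.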

This is useful if you are building an image which will be used as a base to build other images, for example an application build environment or a daemon which may be customized with user-specific configuration.

For example, if your image is a reusable Python application builder, it will require application source code to be added in a particular directory, and it might require a build script to be called after that. An example for ONBUILD instruction is shown below:

ONBUILD ADD . /app/src
ONBUILD RUN /usr/local/bin/python-build --dir /app/src
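
To show where the triggers fire, here is a sketch of a downstream Dockerfile. It assumes the image containing the two ONBUILD lines above was built and tagged my-python-builder, which is a hypothetical name used only for illustration.

```dockerfile
# Downstream Dockerfile. "my-python-builder" is a hypothetical tag for the
# image whose Dockerfile registered the two ONBUILD triggers.
FROM my-python-builder
# Right after FROM, the triggers run as if these lines were written here:
#   ADD . /app/src
#   RUN /usr/local/bin/python-build --dir /app/src
```

The triggers run only in this direct child build; they are not inherited by images built from the child.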

LABEL

LABEL <key>=<value> <key>=<value> <key>=<value> ...

The LABEL Dockerfile instruction adds metadata to an image. A LABEL is a key-value pair. To include spaces within a LABEL value, use quotes and backslashes as you would in command-line parsing. A few usage examples:

LABEL "com.example.vendor"="ACME Incorporated"
LABEL com.example.label-with-value="foo"
LABEL version="1.0"

ARG

ARG <name>[=<default value>]

The ARG instruction defines a variable that users can pass at build-time to the builder with the docker build command using the --build-arg <varname>=<value> flag. A Dockerfile may include one or more ARG instructions.
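
As a small sketch, a Dockerfile with two ARG instructions could look like this. The variable names, the default value and the values in the build command are illustrative assumptions.

```dockerfile
FROM busybox
# No default: empty unless passed with --build-arg
ARG user1
# Default value, used when --build-arg buildno=... is not given
ARG buildno=1
# ARG values are available to later instructions during the build
RUN echo "user1=${user1} buildno=${buildno}"

# Example build command:
#   docker build --build-arg user1=alice --build-arg buildno=42 .
```

Unlike ENV, an ARG value is not kept in the environment of the running container.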

SHELL

SHELL ["executable", "parameters"]

The SHELL instruction allows the default shell used for the shell form of commands to be overridden. The default shell on Linux is ["/bin/sh", "-c"], and on Windows is ["cmd", "/S", "/C"]. The SHELL instruction must be written in JSON form in a Dockerfile.
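
For instance, on Linux you might switch the shell form to bash with the pipefail option. This is a sketch that assumes a base image that ships /bin/bash, as ubuntu does.

```dockerfile
FROM ubuntu:22.04
# From here on, shell-form instructions run as:
#   /bin/bash -o pipefail -c "<command>"
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
# With pipefail, a failure anywhere in the pipeline fails the build step
RUN echo "hello" | tr 'a-z' 'A-Z' > /greeting.txt
```

SHELL can appear more than once; each occurrence affects the shell-form instructions that follow it.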

HEALTHCHECK

The HEALTHCHECK instruction tells Docker how to test a container to check that it is still working. This can detect cases such as a web server that is stuck in an infinite loop and unable to handle new connections, even though the server process is still running. HEALTHCHECK is quite a powerful instruction, so we will go into further detail in our next post.
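
Until then, a minimal sketch gives the flavor of it. It assumes a web server answering on port 80 inside the container and installs curl explicitly rather than relying on the base image to provide it; the timing values are arbitrary.

```dockerfile
FROM nginx
# Install curl for the probe (assumes a Debian-based nginx image)
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
# Probe every 30s; mark the container unhealthy after 3 consecutive failures
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD curl -f http://localhost/ || exit 1
```

The command's exit status decides the result: 0 means healthy and 1 means unhealthy.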

EXPOSE

EXPOSE <port> [<port>/<protocol>...]

The EXPOSE instruction tells Docker that the container listens on the specified network ports at runtime. Default is TCP if the protocol is not specified.

The EXPOSE instruction does not actually publish the port. It functions as a type of documentation between the person who builds the image and the person who runs the container, about which ports are intended to be published. To actually publish the port when running the container, use the -p flag on docker run to publish and map one or more ports, or the -P flag to publish all exposed ports and map them to high-order ports.

By default, EXPOSE assumes TCP. You can also specify UDP:

EXPOSE 80/udp

To expose on both TCP and UDP, include two lines:

EXPOSE 80/tcp
EXPOSE 80/udp

In this case, if you use -P with docker run, the port will be exposed once for TCP and once for UDP. Remember that -P uses an ephemeral high-ordered host port on the host, so the port will not be the same for TCP and UDP.

Regardless of the EXPOSE settings, you can override them at runtime by using the -p flag. For example:

docker run -p 80:80/tcp -p 80:80/udp ...

ENV

ENV <key> <value>
ENV <key>=<value> ...

The ENV instruction sets the environment variable <key> to the value <value>. This value will be in the environment for all subsequent instructions in the build stage and can be substituted inline in many of them as well.

The ENV instruction has two forms. The first form, ENV <key> <value>, will set a single variable to a value. The entire string after the first space will be treated as the <value> – including whitespace characters. The value will be interpreted for other environment variables, so quote characters will be removed if they are not escaped.

The second form, ENV <key>=<value> ..., allows for multiple variables to be set at one time. Notice that the second form uses the equals sign (=) in the syntax, while the first form does not. Like command line parsing, quotes and backslashes can be used to include spaces within values.
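
Both forms side by side, as a short sketch. The variable names and values are made up for illustration.

```dockerfile
FROM busybox
# First form: everything after the first space is the value ("hello world")
ENV GREETING hello world
# Second form: several variables at once; quotes or backslashes protect spaces
ENV MY_NAME="John Doe" MY_DOG=Rex\ The\ Dog
# The variables are visible to later build steps and in the running container
RUN echo "$GREETING, $MY_NAME and $MY_DOG"
```

The second form is generally the safer choice, because the first form cannot set more than one variable per line and is easy to misread.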

COPY

COPY has two forms:

  • COPY [--chown=<user>:<group>] <src>... <dest>
  • COPY [--chown=<user>:<group>] ["<src>",... "<dest>"] (this form is required for paths containing whitespace)

The COPY instruction copies new files or directories from <src> and adds them to the filesystem of the container at the path <dest>.

Multiple <src> resources may be specified but the paths of files and directories will be interpreted as relative to the source of the context of the build.

The following example shows the COPY syntax:

# Adds all files starting with "hom"
COPY hom* /mydir/
# ? is replaced with any single character, e.g., "home.txt"
COPY hom?.txt /mydir/
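
The --chown flag and the JSON-array form can be sketched as follows. The file names and the numeric owner are hypothetical and assume matching files exist in the build context.

```dockerfile
FROM ubuntu:22.04
# Copy one file and set its owner to UID 1000 / GID 1000
COPY --chown=1000:1000 app.py /data/myscript/
# JSON-array form, needed here because the paths contain a space
COPY ["release notes.txt", "/data/release notes.txt"]
```

Note that a comment must sit on its own line: a # that appears after an instruction on the same line is treated as part of its arguments.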

ADD

ADD has two forms:

  • ADD [--chown=<user>:<group>] ["<src>",... "<dest>"] (this form is required for paths containing whitespace)
  • ADD [--chown=<user>:<group>] <src>... <dest>

The ADD instruction copies new files, directories or remote file URLs from <src> and adds them to the filesystem of the image at the path <dest>.

We can see an example below:

# Adds all files starting with "hom"
ADD hom* /mydir/
# ? is replaced with any single character, e.g., "home.txt"
ADD hom?.txt /mydir/

Multiple <src> resources may be specified but if they are files or directories, their paths are interpreted as relative to the source of the context of the build.

At first glance you may notice that COPY and ADD seem to perform the same operation. However, ADD also supports two additional kinds of source. First, you can use a URL instead of a local file or directory. Second, if the source is a local tar archive in a recognized compression format, ADD extracts it into the destination.

A valid use case for ADD is when you want to extract a local tar file into a specific directory in your Docker image.

If you’re copying in local files to your Docker image, always use COPY because it’s more explicit.
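
The difference is easiest to see side by side. This sketch assumes a hypothetical archive named rootfs.tar.gz in the build context.

```dockerfile
FROM ubuntu:22.04
# ADD unpacks a local tar archive into the destination directory
ADD rootfs.tar.gz /opt/rootfs/
# COPY places the archive in the image as-is, still compressed
COPY rootfs.tar.gz /tmp/
```

Because ADD changes behavior depending on what the source is, reserving it for the archive case keeps the Dockerfile easier to reason about.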

VOLUME

The VOLUME instruction creates a mount point with the specified name and marks it as holding externally mounted volumes from native host or other containers. The value can be a JSON array, VOLUME ["/var/log/"], or a plain string with multiple arguments, such as VOLUME /var/log or VOLUME /var/log /var/db.

A volume is a specially designated directory within one or more containers that bypasses the Union File System to provide several useful features for persistent or shared data:

  • Volumes can be shared and reused between containers.
  • A container doesn’t have to be running to share its volumes.
  • Changes to a volume are made directly.
  • Changes to a volume will not be included when you update an image.
  • Volumes persist independently of containers: removing a container does not delete its volumes unless you ask Docker to do so.

The syntax for the VOLUME instruction is quite simple:

VOLUME ["/data"]
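
In context, a sketch might look like this. The directory name and file contents are arbitrary examples.

```dockerfile
FROM ubuntu
RUN mkdir /myvol
# Data written before the VOLUME line is used to initialize the new volume
RUN echo "hello world" > /myvol/greeting
# Declare /myvol as a mount point
VOLUME /myvol
```

When a container is started from this image, docker run creates a new volume at /myvol and copies the greeting file into it. Build steps that change /myvol after the VOLUME line are discarded, so populate the directory first.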

VOLUME is the last Dockerfile instruction that we will describe. We covered the main ones, but there are more instructions that you can check in the official Dockerfile reference.