A Dockerfile is a script that includes a series of commands to automatically build a new Docker image from a base image. The Dockerfile is provided to the Docker daemon, which in turn executes the instructions inside the Dockerfile and creates the image.
One of the simplest use cases is customizing a Docker image pulled from Docker Hub, for example by adding new commands or changing the provided entrypoint scripts.
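As a minimal sketch of that use case, the shell snippet below writes a small Dockerfile that starts from a pulled image, installs one extra tool, and swaps in a custom entrypoint script. Everything named here (the ubuntu:22.04 base, start.sh, the curl package) is an illustrative assumption, not something taken from a particular project.

```shell
# Hypothetical example: the work directory, start.sh and the base image are illustrative.
workdir=$(mktemp -d)

# A replacement entrypoint script: print a banner, then run the given command.
cat > "$workdir/start.sh" <<'EOF'
#!/bin/sh
echo "custom entrypoint starting"
exec "$@"
EOF

# The Dockerfile: start from the pulled image, add a tool, swap the entrypoint.
cat > "$workdir/Dockerfile" <<'EOF'
FROM ubuntu:22.04
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*
COPY start.sh /usr/local/bin/start.sh
RUN chmod +x /usr/local/bin/start.sh
ENTRYPOINT ["/usr/local/bin/start.sh"]
CMD ["bash"]
EOF

echo "Dockerfile written to $workdir"
```

With a Docker daemon available, the result could then be built with docker build -t custom-ubuntu:demo "$workdir" (the tag is again just a placeholder). Because ENTRYPOINT uses the exec form, whatever is given as CMD or on the docker run command line is passed to start.sh as its arguments.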
A Dockerfile can also be useful for dynamic container provisioning. Imagine you work at a company that provides PaaS or FaaS. The service requests sent by your clients can be mapped to Dockerfiles. The Docker daemon then builds the images on demand, and the resulting containers are handed back to your clients.
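One way to picture the mapping from a request to a Dockerfile is a small template function. The sketch below assumes a request consists of nothing more than a base image name and a list of conda packages; render_dockerfile is a hypothetical helper, not part of Docker.

```shell
# render_dockerfile BASE_IMAGE PACKAGE...
# Hypothetical helper: prints a Dockerfile for a request that names a base
# image and the conda packages to add on top of it.
render_dockerfile() {
    base_image=$1
    shift
    printf 'FROM %s\n' "$base_image"
    for pkg in "$@"; do
        # Quote the package spec so that a version glob such as 0.2* is not
        # expanded by the shell that executes the RUN step.
        printf "RUN conda install --quiet --yes '%s'\n" "$pkg"
    done
}

# Example request: a PySpark notebook with one extra kernel package.
render_dockerfile jupyter/pyspark-notebook 'spylon-kernel=0.2*'
```

The generated text can be piped straight into the build, since docker build reads the Dockerfile from standard input when the path argument is a single dash (no build context is sent in that case), for example render_dockerfile ... | docker build -t some-tag -. A real service would of course need to validate the request fields before turning them into instructions.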
Instructions Used by Dockerfile
You may have already noticed that the Dockerfile syntax is rather simple. Each line is either a comment or an instruction followed by arguments, as shown below.
# Comment
INSTRUCTION arguments
We will now walk through a sample Dockerfile, taken from a Jupyter build, and explain the structure and commands step-by-step.
The # character starts a line comment. The FROM instruction indicates the base image to use. In this example, it uses jupyter/pyspark-notebook as the base image. If the base image isn't already on your host, the Docker daemon will try to pull it from Docker Hub.
# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.
FROM jupyter/pyspark-notebook
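Define the maintainer. Note that MAINTAINER is deprecated in current Docker releases; a LABEL instruction carrying the same information is the recommended replacement.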
MAINTAINER Jupyter Project <firstname.lastname@example.org>
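Define the user that runs the container. The USER instruction also applies to the RUN instructions that follow it during the build; the listing further below uses USER $NB_USER to switch to the notebook user before installing the R packages.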
The ENV instruction sets environment variables that can be accessed by the processes running inside the container. This is equivalent to running export VAR=arguments in a Linux shell.
# RSpark config
ENV R_LIBS_USER $SPARK_HOME/R/lib
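The comparison with export can be tried in any POSIX shell without Docker. The sketch below uses a made-up variable, R_LIBS_DEMO, and a made-up path to show that only an exported variable is inherited by child processes, which is how processes in a container see a variable set with ENV.

```shell
# Plain assignment: visible in this shell only, not in child processes.
R_LIBS_DEMO=/tmp/demo/R/lib
sh -c 'echo "child sees: ${R_LIBS_DEMO:-<unset>}"'   # prints: child sees: <unset>

# export: child processes inherit the variable, much like processes in a
# container built with: ENV R_LIBS_DEMO /tmp/demo/R/lib
export R_LIBS_DEMO
sh -c 'echo "child sees: ${R_LIBS_DEMO:-<unset>}"'   # prints: child sees: /tmp/demo/R/lib
```

One difference is worth keeping in mind: a variable set with ENV is stored in the image and stays in effect for all later build steps and for containers started from the image, whereas an export inside a single RUN instruction is gone once that instruction finishes.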
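The RUN instruction executes its arguments, in this case apt-get, during the build and commits the result as a new image layer. RUN takes effect only at build time; it is not executed again when a container is started from the image.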
# R pre-requisites
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    fonts-dejavu \
    gfortran \
    gcc && apt-get clean && \
    rm -rf /var/lib/apt/lists/*

USER $NB_USER

# R packages
RUN conda config --add channels r && \
    conda install --quiet --yes \
    'r-base=3.3.2' \
    'r-irkernel=0.7*' \
    'r-ggplot2=2.2*' \
    'r-rcurl=1.95*' && conda clean -tipsy

# Apache Toree kernel
RUN pip --no-cache-dir install https://dist.apache.org/repos/dist/dev/incubator/toree/0.2.0/snapshots/dev1/toree-pip/toree-0.2.0.dev1.tar.gz
RUN jupyter toree install --sys-prefix

# Spylon-kernel
RUN conda install --quiet --yes 'spylon-kernel=0.2*'
RUN python -m spylon_kernel install --sys-prefix
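To see the build-time scope of RUN in isolation, it can help to contrast it with CMD, which runs when a container starts. The snippet below is a small sketch that writes such a Dockerfile to a temporary directory; the base image and the file name /built-at.txt are assumptions chosen for the demonstration.

```shell
# Hypothetical demo directory; nothing here comes from the Jupyter build.
workdir=$(mktemp -d)

cat > "$workdir/Dockerfile" <<'EOF'
FROM ubuntu:22.04
# Build time: runs once during "docker build"; the file is stored in the image.
RUN date > /built-at.txt
# Run time: runs every time a container is started from the image.
CMD ["sh", "-c", "echo image built at: $(cat /built-at.txt); echo container started at: $(date)"]
EOF

echo "Dockerfile written to $workdir"
```

After docker build -t run-vs-cmd:demo "$workdir", running docker run --rm run-vs-cmd:demo several times should report the same build timestamp each time next to a fresh start timestamp, because the RUN step is not repeated.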
Build the Image
The following example shows how to build an image from the Dockerfile. It is recommended to run the build from the directory that contains the Dockerfile. Note the dot at the end of the line: it tells the build to use the current working directory as the build context.
## --rm removes the intermediate containers after a successful build
## -t sets the image name and tag, e.g., apache/toree:1.02. The default tag is latest
sudo docker build --rm -t repo:tag .
It is worth mentioning that Docker uses a cache to accelerate builds. If a new line is inserted into the Dockerfile, Docker reuses the cached image layers for everything before that line and rebuilds everything from that line to the end.
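The effect of an inserted line can be approximated without running a build by comparing two versions of a Dockerfile. The sketch below is only a line-based approximation: Docker itself compares instructions (and, for COPY and ADD, file checksums) rather than raw lines, and first_changed_line is a hypothetical helper. The two Dockerfiles reuse instructions from the sample above.

```shell
workdir=$(mktemp -d)

# Previous version of the Dockerfile.
cat > "$workdir/Dockerfile.old" <<'EOF'
FROM jupyter/pyspark-notebook
RUN conda install --quiet --yes 'r-base=3.3.2'
RUN conda install --quiet --yes 'spylon-kernel=0.2*'
EOF

# New version with one instruction inserted as line 3.
cat > "$workdir/Dockerfile.new" <<'EOF'
FROM jupyter/pyspark-notebook
RUN conda install --quiet --yes 'r-base=3.3.2'
RUN conda install --quiet --yes 'r-ggplot2=2.2*'
RUN conda install --quiet --yes 'spylon-kernel=0.2*'
EOF

# Print the number of the first line that differs between two files.
first_changed_line() {
    awk 'NR == FNR { old[NR] = $0; next }
         $0 != old[FNR] { print FNR; exit }' "$1" "$2"
}

line=$(first_changed_line "$workdir/Dockerfile.old" "$workdir/Dockerfile.new")
echo "cache reused for lines 1-$((line - 1)); rebuild starts at line $line"
# prints: cache reused for lines 1-2; rebuild starts at line 3
```

Note that the unchanged spylon-kernel step is rebuilt as well, because it now sits after the first cache miss. This is why instructions that change often are usually placed near the end of a Dockerfile. To bypass the cache entirely, docker build accepts the --no-cache flag.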