Containers and Images

If you're beginning your journey with containers, it's a good idea to start by studying Docker. We have complete material available here on our site that covers this subject in detail.

This study will help you become familiar with creating images and understanding how containers work.

The main objective of this material is to teach how to containerize an application, which involves creating an image for it.

Although containers existed in Linux long before Docker, this tool revolutionized the way we work with them, establishing a standard for image creation. When building a Dockerfile, we're actually creating an image that will run in a container runtime.

To maintain these standards, the Open Container Initiative (opencontainers.org) was created. A Dockerfile is simply a file that describes how to build an image, and the resulting image follows these standards.

It's not the developer's job to worry about which container runtime is installed on the cluster nodes where the application runs. They just need to create images following conventions and best practices.

As long as we follow the opencontainers.org standards, we can run the image in several different runtimes: containerd, CRI-O, Podman, rkt, and so on.

Although not mandatory, it's recommended that the file we'll use to declare the steps for image creation be named Dockerfile. The main advantage is that this eliminates the need to specify the file name as a parameter, as this is the standard used.

We can create images using several different tools, such as Docker, Podman, and Buildah. The only requirement is to have a container runtime installed on the machine; these tools simply interact with the runtime.

Installing Docker provides us with all these tools in a single installation, which is ideal for development environments. However, to run containers in a cluster, it's better to avoid installing all these tools and opt only for what's necessary, i.e., a container runtime. For this reason, containerd is currently the most recommended option.

So for study purposes, install Docker for your operating system.

Images

If you're running an application on your machine, what did you need to do?

  1. Have an operating system to run things
  2. Have the language compiler or interpreter. For example, Java, Go, Python, and so on.
  3. Install the application dependencies, i.e., libraries, packages, etc.
  4. Build the application to generate executables
  5. Run the executables with defined environment variables.

But in the end, what we really need is to have a system that can execute the generated files. We don't need to have the application source code inside the container, just the executables and their dependencies.
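Those steps map naturally onto Dockerfile instructions. A hypothetical sketch for a small Python app (file and package names here are illustrative, not from a real project):

```dockerfile
# 1. An operating system to run things
FROM ubuntu:22.04

# 2. The language interpreter
RUN apt-get update && apt-get install -y python3 python3-pip

# 3. The application dependencies
COPY requirements.txt /opt/myapp/requirements.txt
RUN pip3 install -r /opt/myapp/requirements.txt

# 4. For interpreted Python there is no compile step; just copy the code
COPY . /opt/myapp
WORKDIR /opt/myapp

# 5. Run with defined environment variables
ENV APP_ENV=production
ENTRYPOINT ["python3", "meuapp.py"]
```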

I'll name the file that declares these build steps Dockerfile.

It's always necessary to declare a base image with FROM:

FROM ubuntu

Let's build this one. Create a Dockerfile with this content.

docker build -t davidpuziol/ubuntu .
DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
Install the buildx component to build images with BuildKit:
https://docs.docker.com/go/buildx/

Sending build context to Docker daemon 317.2MB
Step 1/1 : FROM ubuntu
latest: Pulling from library/ubuntu
fdcaa7e87498: Pull complete
Digest: sha256:562456a05a0dbd62a671c1854868862a4687bf979a96d48ae8e766642cd911e8
Status: Downloaded newer image for ubuntu:latest
---> de52d803b224
Successfully built de52d803b224
Successfully tagged davidpuziol/ubuntu:latest

What happened? From the ubuntu base, Docker created an image called davidpuziol/ubuntu and nothing more. If we run this image, we get Ubuntu inside:

docker run -it davidpuziol/ubuntu:latest bash
root@e8e350471d95:/# cat /etc/os-release
PRETTY_NAME="Ubuntu 24.04 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo
# We don't have python here
root@e8e350471d95:/# whereis python
python:
root@e8e350471d95:/#

The davidpuziol/ubuntu image is exactly the same as the ubuntu image. Notice that it has the same size and the same IMAGE ID:


docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
davidpuziol/ubuntu latest de52d803b224 2 days ago 76.2MB
ubuntu latest de52d803b224 2 days ago 76.2MB
kindest/node <none> 09c50567d34e 2 months ago 956MB

This will be the initial image you'll use to start building what you need. Let's imagine I want to install Python on this Ubuntu. Each RUN instruction creates a new layer in the image.

We need to update the repositories and then install python.

FROM ubuntu:22.04

RUN apt-get update
RUN apt-get install python3 -y

Building this image with the same command as above, we'll get an image whose package index was updated and which then had Python installed. Note that this image uses ubuntu 22.04, while the first one used 24.04.

No command may require user interaction; remember that. That's why -y was passed to the install command.
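Some packages prompt even with -y (time zone configuration, for example). A common pattern to hedge against that, shown here as an optional extra rather than something from the steps above, is to set DEBIAN_FRONTEND for the install step:

```dockerfile
FROM ubuntu:22.04

# Combining update and install in one RUN keeps them in a single layer
# and avoids a stale, cached package index being reused.
# DEBIAN_FRONTEND=noninteractive suppresses any configuration prompts.
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y python3
```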

Building again, this time let's define a tag. When we don't define a tag, latest is used.

docker build -t davidpuziol/ubuntu:v1 .
# will show the entire installation process

# Running image
docker run -it davidpuziol/ubuntu:v1 bash
root@db6927a8b536:/# whereis python3
python3: /usr/bin/python3 /usr/lib/python3 /etc/python3 /usr/share/python3
root@db6927a8b536:/# python3 --version
Python 3.10.12

But we need this image to have environment variables. Let's put a variable called MYVAR=david

FROM ubuntu:22.04

RUN apt-get update
RUN apt-get install python3 -y

ENV MYVAR=david
# Let's change to v2
docker build -t davidpuziol/ubuntu:v2 .
DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
Install the buildx component to build images with BuildKit:
https://docs.docker.com/go/buildx/

Sending build context to Docker daemon 317.2MB
Step 1/4 : FROM ubuntu:22.04
---> 437ec753bef3
Step 2/4 : RUN apt-get update
---> Using cache
---> 36ece36a9a1c
Step 3/4 : RUN apt-get install python3 -y
---> Using cache # OBSERVE THAT IT USED THE IMAGE IT ALREADY HAD BEFORE
---> 243f9ff2aa56
# ADDED ONLY THE LAST STEP
Step 4/4 : ENV MYVAR=david
---> Running in 9200241f31f5
---> Removed intermediate container 9200241f31f5
---> ef48a65d545f
Successfully built ef48a65d545f
Successfully tagged davidpuziol/ubuntu:v2

docker run -it davidpuziol/ubuntu:v2 bash
root@ba9c48260bb4:/# env | grep MYVAR
MYVAR=david
root@ba9c48260bb4:/#

So far we've only installed tools and set environment variables, but we haven't defined anything to run yet.

When we run docker run -it davidpuziol/ubuntu:v2 bash, we force bash to be the main process of the container. A container only stays alive while its main process is running: if bash exits, the container dies. Every container needs an entrypoint, the process that keeps it running.
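As an aside not covered above, ENTRYPOINT has two forms, and the exec (JSON array) form is generally preferred because the process runs as PID 1 and receives signals directly:

```dockerfile
FROM ubuntu:22.04

# Exec form (preferred): bash itself is PID 1 and receives
# SIGTERM directly when you run docker stop.
ENTRYPOINT ["bash"]

# Shell form (avoid): writing `ENTRYPOINT bash` instead would run
# /bin/sh -c "bash", so the shell wrapper, not bash, would be PID 1.
```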

FROM ubuntu:22.04

RUN apt-get update
RUN apt-get install python3 -y

ENV MYVAR=david

ENTRYPOINT ["bash"]
 docker build -t davidpuziol/ubuntu:v3 .
DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
Install the buildx component to build images with BuildKit:
https://docs.docker.com/go/buildx/

Sending build context to Docker daemon 317.2MB
Step 1/5 : FROM ubuntu:22.04
---> 437ec753bef3
Step 2/5 : RUN apt-get update
---> Using cache
---> 36ece36a9a1c
Step 3/5 : RUN apt-get install python3 -y
---> Using cache
---> 243f9ff2aa56
Step 4/5 : ENV MYVAR=david
---> Using cache
---> ef48a65d545f
Step 5/5 : ENTRYPOINT ["bash"]
---> Running in 952493e489a0
---> Removed intermediate container 952493e489a0
---> a2a274001f9a
Successfully built a2a274001f9a
Successfully tagged davidpuziol/ubuntu:v3

# We didn't pass a command to execute; the ENTRYPOINT dropped us straight into bash
docker run -it davidpuziol/ubuntu:v3
root@e0ae401dc83b:/#

Let's observe how the image grew.

docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
davidpuziol/ubuntu v3 a2a274001f9a 2 minutes ago 158MB # entrypoint didn't change size, but changed ID
davidpuziol/ubuntu v2 ef48a65d545f 11 minutes ago 158MB # environment doesn't change size, but changed ID
davidpuziol/ubuntu v1 243f9ff2aa56 18 minutes ago 158MB # OS + python
davidpuziol/ubuntu latest de52d803b224 2 days ago 76.2MB # only the OS
ubuntu latest de52d803b224 2 days ago 76.2MB
ubuntu 22.04 437ec753bef3 8 days ago 77.9MB
kindest/node <none> 09c50567d34e 2 months ago 956MB

Note:

In a Kubernetes Pod container spec, ENTRYPOINT corresponds to command: and CMD corresponds to args:.
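A minimal sketch of that mapping in a Pod manifest (names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sleeper
spec:
  containers:
    - name: sleeper
      image: davidpuziol/ubuntu:v3
      command: ["sleep"]   # overrides the image's ENTRYPOINT
      args: ["60"]         # overrides the image's CMD
```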

If we want to create an image that will sleep for 60 seconds

FROM ubuntu:22.04

RUN apt-get update
RUN apt-get install python3 -y

ENV MYVAR=david

ENTRYPOINT ["sleep"]
CMD ["60"]

We also have COPY, which copies content from the machine doing the build into the image. This is how you put the application inside the container.

FROM ubuntu:22.04

RUN apt-get update
RUN apt-get install python3 -y
# RUN whatever other dependency commands you need, for Python for example

ENV MYVAR=david

# COPY <SOURCE ON LOCAL MACHINE> <DESTINATION IN CONTAINER>
COPY . /opt/myapp

# sets the working directory for the following instructions and for the
# running container (unlike RUN cd /opt/myapp, which would not persist)
WORKDIR /opt/myapp

# the container port you want to expose
EXPOSE 8080

ENTRYPOINT ["python3"]
CMD ["meuapp.py"]

These are the basics you need to know. The port is how the outside world communicates with the application. Strictly speaking, EXPOSE is documentation: it records which port the application listens on, and it's that listening port a Kubernetes Service will target when reaching the pod's container.

Of course, your application must also actually be listening on port 8080; that's why we're exposing this port.
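For completeness, here is a minimal illustrative stand-in for meuapp.py (not from the original material): an HTTP app that listens on 8080, matching the EXPOSE above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    """Tiny illustrative app: replies 200 to every GET."""

    def do_GET(self):
        body = b"hello from the container\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep container logs quiet

def make_server(host="0.0.0.0", port=8080):
    # Bind 0.0.0.0 so the port is reachable from outside the container.
    return HTTPServer((host, port), Handler)

# In the real meuapp.py this would run unconditionally:
#   make_server().serve_forever()
```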

There are many details involved in creating an image. The container above runs as the root user, which isn't ideal from a security standpoint. We'll see more details along the way.
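One common mitigation, sketched here as a general pattern rather than something from this guide, is to create and switch to an unprivileged user:

```dockerfile
FROM ubuntu:22.04

RUN apt-get update && apt-get install -y python3

# Create an unprivileged user and hand the app directory to it
RUN useradd --create-home appuser
COPY . /opt/myapp
RUN chown -R appuser /opt/myapp
WORKDIR /opt/myapp

# Everything from here on, including the running container, uses appuser
USER appuser

ENTRYPOINT ["python3"]
CMD ["meuapp.py"]
```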

Do a more in-depth study of Docker and everything will become clearer.

Check out more information about entrypoint and runtime

Be familiar with these commands:

docker images
docker inspect
docker run -p hostport:containerport image:tag
docker run -p hostport:containerport --entrypoint myentrypoint image:tag arg1 arg2 arg3
docker run -e APP=test image:tag
# commands mapping volumes are also necessary to understand containers and how things work inside pods