What is Docker?
Docker is an open platform that enables development, testing, and deployment through isolated units called containers. Docker separates applications from infrastructure, so you can manage your infrastructure the same way you manage your applications.
Why Docker?
Docker is useful throughout the development lifecycle. Docker uses read-only units called images to bundle both the code and the environment for running the code. Anyone who wants to run the code can just pull the image and create containers without having to set up the environment themselves.
The history of Docker
Docker Inc. was founded by Kamel Founadi, Solomon Hykes, and Sebastien Pahl during Y Combinator’s Summer 2010 batch and launched in 2011. The startup was also one of the 12 startups in the first cohort of Founder’s Den. Hykes started the Docker project in France as an internal project within dotCloud, a platform-as-a-service company.
Docker was first introduced to the public in 2013 at PyCon in Santa Clara. It was released as open source in March 2013. At the time, it used LXC as its default execution environment. A year later, with the release of version 0.9, Docker replaced LXC with its own component, libcontainer, written in the Go programming language.
In 2017, Docker created the Moby project for open research and development.
Advantages
Lightweight: Docker containers are lightweight because, unlike VMs, they do not have their own operating system and use the host operating system. Docker provides the ability to package and run the application in containers. Since these containers are lightweight and isolated, we can run multiple containers on the same machine, reducing infrastructure costs.
Code Sharing: Docker makes it easy to share code as there is no need to set up an environment to run the code. This allows for faster development and testing cycles.
Scaling: Creating multiple instances of the same application is as easy as creating containers. With container orchestration, it is very easy to manage multiple instances of the application.
Consistency: There are no environment-specific issues because the same image runs in all environments.
Portability: Docker containers can run on any system that supports Docker, regardless of the underlying infrastructure. This makes it easy to move applications between development, test, and production environments, as well as between on-premises and cloud infrastructures.
Version Control: Docker images can be versioned and stored in repositories, allowing for easy rollback and keeping a history of changes.
Ecosystem: Docker has a large and active ecosystem, with a wide range of pre-built images and tools available on Docker Hub and other registries. This makes it easier to get started and to find solutions to common problems.
Security: Containers are generally more secure than applications running directly on the host operating system. They provide process and file system isolation, reducing the attack surface.
Disadvantages:
Learning Curve: Docker has a learning curve, especially for those new to containerization and orchestration. Understanding concepts like images, containers, Dockerfiles, and orchestration can take some time.
Resource Overhead: Containers are more efficient than traditional virtualization, but there is still some overhead when running multiple containers on a host. This can impact performance in resource-constrained environments.
Complex networking: Managing networking between containers and connecting them to external networks can be complex, especially in larger deployments.
Limited Windows support: Docker was originally developed for Linux, and while Docker exists for Windows, it may not be as feature-rich or mature as the Linux version.
Security issues: Although containers provide isolation, there are still security concerns, especially when it comes to untrusted or misconfigured containers. Docker security best practices must be followed to minimize vulnerabilities.
Lack of State: Containers are typically stateless, which means they aren’t well suited for applications that require persistent data storage. Additional solutions such as Docker volumes or databases are needed to manage stateful data.
Orchestration Complexity: While Docker alone is suitable for managing a few containers, orchestrating and scaling containerized applications in production environments can be very complex. Tools like Kubernetes are often used for this, which can add another layer of complexity.
When should you use Docker?
Development and Testing: Developers can create consistent development and test environments.
Microservices: Containers are a popular choice for deploying and managing microservices.
CI/CD Pipelines: Containers simplify continuous integration and continuous delivery (CI/CD) processes.
Cloud deployments: Many cloud providers support Docker containers for easy deployment.
Scaling of web applications: Containers can easily scale to handle heavy traffic loads.
Data science: Data scientists use Docker to create reproducible environments for their work.
Architecture
Docker uses a client-server architecture. The Docker client communicates with the Docker daemon, which takes care of creating, running, and deploying your Docker containers.
Docker client
The Docker client (the docker command-line tool) is the primary way users interact with Docker. It sends commands to the Docker daemon, which executes them. A single client can communicate with more than one daemon.
Docker Daemon
The Docker daemon is the heart of Docker. It manages images, containers, networks, and volumes, and executes the commands it receives from the Docker client.
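Because the client and daemon are separate processes, the same CLI can target different daemons. A quick sketch (the remote host address is a placeholder):
# Point the client at a remote daemon over SSH for a single command
docker -H ssh://user@remote-host ps
# Or register the remote daemon as a named context and switch to it
docker context create remote --docker "host=ssh://user@remote-host"
docker context use remote
docker ps
docker context use default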
Registry
A Docker registry stores Docker images. Docker Hub is a public registry that anyone can use, and Docker searches for images on Docker Hub by default. You can even run your own private registry.
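For example, pulling a public image from Docker Hub and pushing it to a private registry looks like this (the registry address is a placeholder):
# Pull from Docker Hub, the default registry
docker pull nginx:latest
# Re-tag the image for a private registry and push it there
docker tag nginx:latest registry.example.com/myteam/nginx:latest
docker push registry.example.com/myteam/nginx:latest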
Docker Desktop
Docker Desktop is an easy-to-install application for your Mac, Windows, or Linux environment that lets you create and share containerized applications and microservices. Docker Desktop includes the Docker daemon, the Docker client, Docker Compose, Docker Content Trust, Kubernetes, and Credential Helper.
Docker Objects
Image
A Docker image is a read-only blueprint with layered instructions for creating containers. Most images are built on top of a base image.
- Image Layers
Docker images are built up in layers. Each instruction in the Dockerfile creates a layer. When an image is first built, all layers are created, but in subsequent builds only the layers that have changed (and the layers after them) are rebuilt. This caching significantly reduces build times.
- Multi-stage builds
Multi-stage builds are a technique for creating smaller and more efficient Docker images. They allow you to use multiple build stages within a single Dockerfile, with each stage performing a specific task in the build process. This is especially useful when building applications that require many dependencies and build tools but whose final image should be as small as possible.
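A minimal sketch of a multi-stage build for a Node.js app (the dist output directory and the build script are assumptions about the project):
# Stage 1: build stage with the full toolchain
FROM node AS build
WORKDIR /app
# Copy the manifests first so the dependency layer is cached
# across rebuilds when only source files change
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
# Stage 2: slim runtime image; only the build artifacts are copied over
FROM node:alpine
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=build /app/node_modules ./node_modules
CMD ["node", "dist/index.js"]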
Container
A container is an ephemeral instance of an image. A container contains both the application code and the environment for executing the code. A container is isolated from the host and other containers.
- Lifecycle of a container:
A Docker container goes through a lifecycle with various states and transitions as it is created, started, and eventually stopped or removed. Understanding the Docker container lifecycle is critical to effectively managing containers. Here is an overview of the typical Docker container lifecycle:
Created State: This is the initial state of a Docker container. A container is created from an image but is not yet running. At this stage, the container has a unique identifier (a container ID), but it does not consume any system resources.
Configuration: Before starting a container, you can configure settings such as environment variables, network settings, the container name, and storage volumes by specifying them in the docker run command or in a Docker Compose file.
Running State: A container enters the running state when it is started with the docker run command. In this state, the container’s process executes and uses system resources such as CPU and memory. The container is isolated from the host and other containers.
Paused State (optional): While a container is running, you can pause it with the docker pause command. Its processes are frozen and it stops consuming CPU. You can resume a paused container with the docker unpause command.
Restarting a container (optional): Containers can be restarted with the docker restart command. This stops and restarts the container, letting you apply changes or fix problems without recreating it.
Stopping a container: To stop a running container, use the docker stop command with the container ID or name. This sends a signal to the main container process, giving it a chance to clean up before shutting down. The container then enters a stopped state.
Stopped State: After a container is stopped, it enters the exited state. It no longer uses system resources but retains its configuration and data. You can restart an exited container with the docker start command.
Removing a container: You can remove a container with the docker rm command, specifying the container ID or name. This permanently deletes the container, including its configuration and data, so be careful: data may be lost.
Creating a new container (optional): You can create a new container from the same image and configuration, which is useful when you need to replace a container without changing its setup. This is done with the docker run command, using the same container name or a new one.
Container logs: Throughout the container lifecycle, the docker logs command shows a container’s standard output and standard error, which is useful for debugging and monitoring.
Container inspection: The docker inspect command shows a container’s configuration and metadata, providing detailed information about its current state and settings.
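A short CLI walkthrough of these states (the nginx image and the container name web are illustrative):
docker run -d --name web nginx      # created -> running
docker pause web                    # running -> paused
docker unpause web                  # paused -> running
docker stop web                     # running -> exited
docker start web                    # exited -> running again
docker logs web                     # view stdout/stderr
docker inspect web                  # detailed state and settings
docker stop web && docker rm web    # exited -> removed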
Network
Since Docker containers are isolated by default, networks enable communication between containers and the outside world.
- Network types
Containers can communicate in the following ways (see the sketch after this list):
- Containers and the outside world: outbound access is enabled by default; to let outside traffic reach a container, you publish its ports.
- Containers with other containers on the same network: containers on the same network can communicate using each other’s IP address or via built-in service discovery by container name.
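A minimal sketch (the network name, container names, and my-api-image are placeholders):
# Create a user-defined network
docker network create app-net
# Containers on the same network can reach each other by name
docker run -d --name db --network app-net -e POSTGRES_PASSWORD=example postgres
docker run -d --name api --network app-net -p 8080:80 my-api-image
# Inside the api container, the database is reachable at the hostname db;
# from the host machine, the api is reachable at localhost:8080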
Volume
Docker containers are ephemeral, meaning that if a container is lost, so is all of its data. This is a problem, especially when a crashed container is replaced by a new one. Volumes provide a way to persist data beyond the life of a container. There are three types of Docker volumes, as we will see in a moment.
- Volume types
- Anonymous volumes: volumes without a meaningful name. Docker maps a location in the container to an automatically generated directory on the host. They are typically used for temporary data that does not need to be explicitly named.
- Named volumes: these are explicitly created volumes with a specific name and are suitable for data that should persist beyond the lifetime of containers.
- Bind mounts
Bind mounts in Docker are a mechanism that allows you to mount a directory or file from the host machine to a Docker container. This allows containers to access and manipulate files and directories on the host system. Bind mounts are useful for sharing data between the host and containers, especially if you need to persist data or if you want to provide configuration or code to containers without having to recreate images.
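The three types in CLI form (image names and paths are illustrative):
# Anonymous volume: Docker generates the host location
docker run -d -v /var/lib/mysql -e MYSQL_ROOT_PASSWORD=example mysql
# Named volume: persists independently and can be reused across containers
docker volume create app-data
docker run -d -v app-data:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=example mysql
# Bind mount: a specific host directory mapped into the container
docker run -d -v "$(pwd)/config:/etc/myapp/config" my-image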
Dockerfile
A Dockerfile is a text file containing all the instructions for building an image. Each instruction in the Dockerfile forms a layer of the image. Most images are created from a base image, so the first line of a Dockerfile usually specifies one. A sample Dockerfile is shown below.
- Sample Dockerfile
FROM node
WORKDIR /app
COPY . /app
RUN npm install
CMD ["node", "index.js"]
“FROM” lets Docker know that node is being used as the base image. The base image forms the foundation of the environment in which the code runs.
“WORKDIR” creates (if necessary) and switches to the directory /app inside the container. This is where all the code will live.
“COPY” copies the code from the Docker host into the container’s working directory.
“RUN” executes the command that follows at image build time. In this case it installs all the npm packages the source code needs.
“CMD” specifies the command that runs not at image build time but when a container starts. In this case it starts the Node server.
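Building and running this image might look like the following (the tag my-node-app and port 3000 are assumptions; the port must match whatever index.js listens on):
# Build the image from the Dockerfile in the current directory
docker build -t my-node-app .
# Run a container, mapping host port 3000 to container port 3000
docker run -d -p 3000:3000 --name web my-node-app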
Docker comparison with Virtual Machines
Docker containers: Docker containers are ideal for microservices architectures, rapid application deployment, and development environments where lightweight, portable, and scalable solutions are needed. They are well suited for modern software development practices such as CI/CD.
Virtual Machines: Virtual machines are suitable for running legacy applications, applications with strict security and isolation requirements, or scenarios where multiple operating systems must coexist on the same physical hardware. They are also used to host multiple applications with different dependencies on the same infrastructure.
Key Docker commands
Container Management:
- docker run: Create and start a new container from an image.
docker run [options] image_name [command]
- docker ps: List running containers.
docker ps [options]
- docker ps -a: List all containers, including those that have exited or been stopped.
docker ps -a
- docker start: Start one or more stopped containers.
docker start [options] container_id/container_name
- docker stop: Stop one or more running containers gracefully.
docker stop [options] container_id/container_name
- docker rm: Remove one or more containers.
docker rm [options] container_id/container_name
Image Management:
- docker images: List all available images on your system.
docker images
- docker pull: Download an image from a Docker registry.
docker pull image_name[:tag]
- docker build: Build a Docker image from a Dockerfile.
docker build [options] path_to_Dockerfile
- docker rmi: Remove one or more images.
docker rmi [options] image_id/image_name
To get detailed information about any Docker command and its options, you can use the --help option with the command, e.g., docker run --help.
Docker compose
Often, an application requires multiple containers to be launched in a specific order, which can be done with Docker Compose.
What is Docker Compose
Docker Compose is a tool for defining and running Docker applications with multiple containers. It allows defining complex applications with a declarative, YAML-based syntax that facilitates managing the configuration and orchestration of multiple containers working together. Docker Compose is especially useful for developing, testing, and deploying applications that consist of multiple services or microservices.
Why Docker Compose
- Development environments with multiple microservices.
- Running multi-container applications with dependencies.
- Testing complex application stacks.
- Deploying applications with multiple services for local testing or staging.
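A minimal compose file for a web service backed by a database might look like this (service names, images, and ports are illustrative):
# docker-compose.yml
services:
  web:
    build: .
    ports:
      - "3000:3000"
    depends_on:
      - db
  db:
    image: postgres
    environment:
      POSTGRES_PASSWORD: example
    volumes:
      - db-data:/var/lib/postgresql/data
volumes:
  db-data:
The whole stack then starts with a single command: docker compose up -d.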
Docker Swarm
Docker Swarm is a container orchestration and clustering tool that allows you to create and manage a swarm of Docker nodes (Docker hosts) as a single virtual system. It’s part of the Docker ecosystem and provides built-in support for container clustering and orchestration. Docker Swarm lets you deploy and manage containerized applications across multiple nodes, making it a popular choice for deploying containers at scale.
Use cases
- Scaling and managing containerized applications in production.
- High availability and load balancing for containerized services.
- Deployment of microservices and distributed applications.
- Zero downtime updates and scaling of services.
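A sketch of basic Swarm usage (the service name, image, and replica counts are illustrative):
# Initialize a swarm on the current node (it becomes a manager)
docker swarm init
# Deploy a replicated service across the swarm
docker service create --name web --replicas 3 -p 80:80 nginx
# Scale the service up or down
docker service scale web=5
# Roll out a new image version with zero downtime
docker service update --image nginx:1.25 web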