
Docker Multistage Build: A Complete Guide
Introduction
In today’s fast-paced software development landscape, Docker has become a cornerstone technology for building, deploying and running applications in lightweight, portable containers. It simplifies deployment by encapsulating applications and their dependencies in isolated environments, ensuring consistency between development, test and production systems.
However, as applications become more complex, the size of Docker images can increase significantly, leading to slower builds, increased memory requirements and inefficient deployments. This is particularly problematic in resource-constrained environments such as CI/CD pipelines or cloud platforms, where optimization has a direct impact on performance and cost.
This is where Docker Multi-Stage Builds come into play. Introduced in Docker 17.05, multi-stage builds have revolutionized Dockerfile creation by allowing developers to create cleaner, smaller and more secure images without sacrificing build flexibility. By splitting the build process into multiple stages, unnecessary dependencies and artifacts can be removed from the final production image, resulting in optimized containers that are ready for immediate use.
In this blog post, we’ll dive deep into the concept of multi-stage builds, explore their benefits and go through practical examples and use cases. Whether you’re a beginner or an experienced developer, mastering multi-stage builds can drastically improve your Docker workflows and help you deploy production-ready applications more efficiently.
What are Docker Multistage Builds?
Docker Multi-Stage Builds are a powerful feature introduced in Docker 17.05 to streamline the process of building and optimizing container images. They allow developers to define multiple stages within a single Dockerfile, creating leaner and more efficient images by removing unnecessary dependencies from the final production image during build time.
How do Multistage Builds work?
Traditionally, Docker images were built in a one-step process where all build tools, dependencies and artifacts were combined into one package. This often resulted in bloated images that contained components that were only needed during the build process but were irrelevant to the runtime.
Multi-stage builds solve this problem by splitting the build and runtime environments into separate stages. Each stage in the Dockerfile starts with a `FROM` statement, and developers can copy artifacts such as compiled binaries, libraries or static files from one stage to another using `COPY --from`. This modular approach lets you refine the image incrementally, leaving unnecessary layers behind.
Example of a Multistage build
Here is a simple example of a multi-stage build for a Node.js application:
#Stage 1: Build Stage
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
#Stage 2: Production Stage
FROM node:18-alpine AS production
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/server.js"]
Explanation:
- Stage 1 (Builder):
- Installs the dependencies and compiles the application code.
- Source files and build tools stay behind in this stage once the build completes; they never reach the final image.
- Stage 2 (Production):
- Copies only the compiled output (`dist`) and the required `node_modules` from the previous stage.
- The result is a minimal image containing only the runtime environment needed to run the application.
Comparison with traditional builds
- Single-Stage Builds:
- Larger image size as build tools and dependencies are not removed.
- Potential security risks due to leftover development tools.
- Slower deployment times and higher storage costs.
- Multistage Builds:
- Smaller, cleaner and production-optimized end images.
- Better security by eliminating unnecessary components.
- Easier maintenance and improved CI/CD workflows.
By using multi-stage builds, developers can effectively separate the creation and operation of their applications, resulting in more manageable and powerful Docker images.
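To see the size difference yourself, build both variants and compare the reported sizes (a sketch; the `my-node-app` tag is an assumption from the example above):

```shell
# List local images with their sizes; the multi-stage image
# should be noticeably smaller than a single-stage build.
docker image ls my-node-app
```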
Why use Multistage builds?
Creating efficient and secure Docker images is a top priority for developers, especially as applications grow in size and complexity. Multi-stage builds offer a streamlined approach to creating production-ready images by removing some of the limitations of traditional single-stage builds. Here is a closer look at why multi-stage builds have become an indispensable tool in modern Docker workflows:
Smaller image sizes
One of the main motivations for introducing multi-stage builds is to reduce the size of Docker images. Traditional builds often contain unnecessary dependencies, libraries and tools that are only needed during the build process and remain in the final image, adding to its size.
Example problem:
A Node.js application might ship `npm` and development dependencies in the final image, even though only the compiled JavaScript files are required at runtime.
Multi-stage solution:
Multi-stage builds allow you to install dependencies, compile code and run tests in intermediate stages, copying only the most important results to the final stage. The result is smaller, production-ready images.
Improved security
Reducing the image size not only improves performance, but also minimizes the attack surface. Images with fewer packages and tools are less likely to contain vulnerabilities that attackers can exploit.
Example
Development tools such as compilers and debugging utilities are often unnecessary in production environments and can pose security risks if exposed. Multi-stage builds ensure that these tools are not included in the final image, leaving a minimal and hardened runtime environment.
Simplified build pipelines
Multi-stage builds integrate seamlessly into CI/CD pipelines and make it easier to automate builds, tests and deployments. By defining multiple stages in a single Dockerfile, developers can consolidate complex build processes without relying on external scripts or tools.
Example workflow:
- Stage 1: Build the application and run tests.
- Stage 2: Extract only the compiled artifacts.
- Stage 3: Package the runtime image for deployment.
This approach eliminates the need to maintain multiple Dockerfiles or custom shell scripts, simplifying build and deployment pipelines.
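The three stages above can be sketched in a single Dockerfile. This is a minimal sketch for a hypothetical Node.js project; the script names (`test`, `build`) and file paths are assumptions:

```dockerfile
# Stage 1: build the application and run tests
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm test && npm run build

# Stage 2: extract only the compiled artifacts
FROM node:18-alpine AS artifacts
WORKDIR /app
COPY --from=builder /app/dist ./dist

# Stage 3: package the runtime image for deployment
FROM node:18-alpine AS runtime
WORKDIR /app
COPY --from=artifacts /app/dist ./dist
CMD ["node", "dist/server.js"]
```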
Faster build and deployment times
Multi-stage builds effectively utilize Docker’s caching mechanisms. Since each stage of the build process is cached independently, only the layers that change need to be rebuilt.
Example benefit:
If code changes only affect a specific stage (e.g. the application source code), the dependency and base image layers remain cached, speeding up rebuild times during development and deployment.
Isolation of the environment
Multi-stage builds allow developers to isolate different parts of the build process in separate environments. This ensures that the tools and dependencies required to build the application do not conflict with the final runtime environment.
Example use case:
- Use a large image (e.g. Ubuntu) in the build phase to compile the code.
- Use a lightweight base image (e.g. Alpine Linux) in the production phase for minimal runtime requirements.
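As a sketch of this pattern, a C program could be compiled on a full Ubuntu image and shipped on Alpine. The file names are hypothetical, and static linking is assumed so the binary runs on Alpine’s musl libc:

```dockerfile
# Build stage: full Ubuntu environment with a compiler toolchain
FROM ubuntu:22.04 AS builder
RUN apt-get update && apt-get install -y gcc && rm -rf /var/lib/apt/lists/*
WORKDIR /src
COPY main.c .
# Static linking so the binary has no glibc dependency at runtime
RUN gcc -static -o app main.c

# Production stage: minimal Alpine runtime
FROM alpine:3.19
COPY --from=builder /src/app /usr/local/bin/app
CMD ["app"]
```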
Easier maintenance and updates
By making the build process modular, multi-stage builds make it easier to maintain Dockerfiles over time. Developers can update certain stages without impacting other stages, reducing the risk of breaking the build.
Example
Updating dependencies or changing build tools only affects the build phase and leaves the runtime phase unaffected.
Cost efficiency
Smaller and optimized Docker images reduce storage and data transfer costs, particularly in cloud environments where costs increase with resource usage. Multi-stage builds allow teams to create cost-efficient images without compromising functionality.
Real-World Use Cases
- Frontend applications: Build React or Angular applications in one phase and deliver static files with NGINX in the final phase.
- Backend applications: Compile Java or Go binaries in one build phase and copy them to a minimal runtime image.
- Machine learning models: Train models in the build phase and provide only the serialized models and dependencies needed for inference.
Key takeaways
Multi-stage builds are not only about reducing image size, but also about increasing security, simplifying pipelines and improving performance. Whether you’re developing monolithic applications or microservices, adopting multi-stage builds can significantly optimize your Docker workflow and deployment process.
Setting up a basic Multistage Build
Docker multi-stage builds simplify the creation of optimized and lightweight container images. In this section, we’ll walk through the steps to set up a basic multi-stage build and explain the key concepts required to create efficient Dockerfiles.
The key components of a Multistage build
A multi-stage build consists of multiple “FROM” statements within the same Dockerfile, with each “FROM” starting a new stage. You can name these stages and selectively copy artifacts between them.
The most important commands used in multi-stage builds:
- `FROM <image>` – specifies the base image for each stage.
- `AS <name>` – assigns a name to a stage so it can be referenced more easily later.
- `COPY --from=<stage>` – copies files or artifacts from a previous stage into the current stage.
Example: Node.js application
Let’s create a simple multi-stage Dockerfile for a Node.js application.
Dockerfile:
#Stage 1: Build stage
FROM node:18-alpine AS builder
WORKDIR /app
#Copy dependencies and install them
COPY package.json package-lock.json ./
RUN npm install
#Copy source code and build the app
COPY . .
RUN npm run build
#Stage 2: Production stage
FROM node:18-alpine AS production
WORKDIR /app
#Copy only the built files and node modules from the builder stage
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
#Define the command to run the app
CMD ["node", "dist/server.js"]
Step-by-step explanation
Stage 1 – Build stage:
FROM node:18-alpine AS builder
- Uses a lightweight Alpine Linux image with Node.js installed.
- Names this stage `builder` so it can be referenced in later stages.
WORKDIR /app
- Defines the working directory within the container.
COPY package.json package-lock.json ./
- Copies dependency-related files into the container.
RUN npm install
- Installs all dependencies required to develop and build the app.
COPY . .
- Copies the entire source code of the application into the container.
RUN npm run build
- Compiles the source code (e.g. transpiling TypeScript or bundling JavaScript).
Stage 2 – Production stage:
FROM node:18-alpine AS production
- Starts a new stage with the same lightweight Node.js base image.
WORKDIR /app
- Sets up the working directory for the production environment.
COPY --from=builder /app/dist ./dist
- Copies only the compiled `dist` folder from the `builder` stage.
COPY --from=builder /app/node_modules ./node_modules
- Copies only the required `node_modules` from the previous stage.
CMD ["node", "dist/server.js"]
- Defines the command to start the application when the container is running.
Executing the Multistage build
To build and run the Docker image:
- Build the image:
docker build -t my-node-app .
- Start the container:
docker run -p 3000:3000 my-node-app
- Check the application:
- Open `http://localhost:3000` in your browser to check that the application is running.
Observe the advantages
- Smaller image size:
The production image contains no development dependencies and no source files, making it significantly smaller.
- Faster deployment:
Lightweight images pull faster and can be deployed quickly in cloud environments.
- Simplified debugging:
Intermediate stages can be inspected during the build process:
docker build --target builder -t temp-build .
docker run -it temp-build sh
Common variations
- Use different base images:
Use a full-featured base image for the build stage and a minimal image (e.g. `scratch` or `alpine`) for production.
- Add tests in the build stage:
Integrate tests into the build stage and only continue if they pass:
RUN npm test
- Multi-platform builds:
Build images for different architectures with:
docker buildx build --platform linux/arm64,linux/amd64 -t my-image .
Optimization of production builds
Optimizing Docker production builds is critical to creating lightweight, secure, high-performance containers. Multi-stage builds provide a framework to achieve these goals by separating build dependencies from runtime requirements. This section introduces techniques to further refine production builds with multi-stage Dockerfiles.
Stripping development dependencies
Problem:
Development tools, compilers and libraries used during the build process often remain in the final image, increasing its size and attack surface.
Solution:
Use multi-stage builds to exclude these dependencies from the production phase.
Example for a Node.js application:
#Stage 1: Build stage
FROM node:18-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm install
COPY . .
RUN npm run build
#Stage 2: Production stage
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/server.js"]
Key optimizations:
- The `npm install` command runs in the builder stage, but only the `dist` folder and the required dependencies are copied to the production stage.
- Development dependencies are never included in the final image.
Minimize layers
Problem:
Docker images consist of multiple layers, and too many layers increase complexity and size.
Solution:
- Combine commands into fewer layers wherever possible.
- Use `&&` to chain commands and `\` line continuations for readability.
Example
RUN apt-get update && apt-get install -y \
curl \
git \
&& rm -rf /var/lib/apt/lists/*
Key optimizations:
- Reduces the number of layers by combining commands into one step.
- Cleans up temporary files (`rm -rf /var/lib/apt/lists/*`) to minimize leftover artifacts.
Use of .dockerignore
Problem:
Copying unnecessary files (e.g. logs, temporary files and local configurations) into the image leads to its enlargement.
Solution:
Use a `.dockerignore` file to exclude unwanted files and directories.
Example `.dockerignore`:
node_modules
npm-debug.log
Dockerfile
.dockerignore
.git
.idea
*.md
.env
Key optimizations:
- Ensures that sensitive files and unnecessary artifacts are excluded from the context of image creation.
- Reduces creation time and prevents accidental disclosure of credentials.
Selection of minimal base images
Problem:
Using large base images adds unnecessary overhead to the final container.
Solution:
Choose smaller, security-oriented base images such as `alpine` or `scratch`.
Example:
FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
RUN go build -o app
#Minimum runtime image
FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/app .
CMD ["./app"]
Key optimizations:
- Uses `golang` for the build and the lightweight `alpine` image for production.
- The result is a final image of less than 10 MB, compared to 800+ MB for a traditional single-stage image.
Multi-stage secret management
Problem:
Sensitive data such as API keys or SSH credentials can be exposed if they are included in the final image.
Solution:
Use multi-stage builds to handle secrets securely and prevent them from persisting in the final image.
Example:
#Stage 1: Build stage
FROM golang:1.20 AS builder
WORKDIR /app
ARG API_KEY
ENV API_KEY=$API_KEY
RUN echo $API_KEY > ./key.txt
RUN go build -o app
#Stage 2: Final stage
FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/app .
CMD ["./app"]
Key optimization:
- Passes secrets as build arguments (`ARG API_KEY`) only in the build stage.
- Secrets are excluded from the final image, since only the compiled binary is copied over.
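Note that build arguments are still visible in the metadata (`docker history`) of the builder stage, so for real credentials a BuildKit secret mount is generally safer. A hedged sketch; the secret id `api_key` and the file paths are assumptions:

```dockerfile
# syntax=docker/dockerfile:1
FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
# The secret is mounted only for this RUN step and is never
# written into any image layer or the build history.
RUN --mount=type=secret,id=api_key \
    API_KEY="$(cat /run/secrets/api_key)" go build -o app

FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/app .
CMD ["./app"]
```

The secret is supplied at build time, e.g. `docker build --secret id=api_key,src=./api_key.txt -t my-app .`.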
Multiplatform builds
Problem:
Applications must run on different architectures (e.g. x86, ARM).
Solution:
Use Docker BuildKit and multi-platform builds to support different architectures.
Example:
docker buildx build --platform linux/amd64,linux/arm64 -t my-app .
Key optimizations:
- Enables builds for multiple architectures without the need for separate Dockerfiles.
- Ensures compatibility between different environments.
Leveraging Build Cache
Problem:
Building Docker images from scratch can be slow, especially in CI/CD pipelines.
Solution:
Optimize caching by carefully structuring Dockerfile instructions.
Example:
#Copy the dependencies first to use the caching
COPY package.json package-lock.json ./
RUN npm install
#Copy the rest of the source code
COPY . .
RUN npm run build
Key optimizations:
- Separates dependencies and source code to maximize caching of layers.
- Accelerates incremental builds by reusing cached layers.
Add health checks
Problem:
If the application stops responding after deployment, the failure can go undetected without a health probe.
Solution:
Use Docker’s `HEALTHCHECK` instruction to define health probes.
Example:
HEALTHCHECK CMD curl --fail http://localhost:3000 || exit 1
Key optimization:
- Enables monitoring tools to check the state of the container and restart it if necessary.
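The `HEALTHCHECK` instruction also accepts tuning options. A sketch with explicit timings; the probed endpoint is an assumption:

```dockerfile
# Probe every 30s, fail a probe after 3s, allow 5s for startup,
# and mark the container unhealthy after 3 consecutive failures.
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD curl --fail http://localhost:3000/health || exit 1
```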
Testing in build stages
Problem:
Skipping tests during the build can lead to undetected problems in production.
Solution:
Run tests as part of the build phase to detect bugs early.
Example:
RUN npm test
Key optimizations:
- Ensures that only tested code gets into production and thus reduces runtime errors.
Final recommendations
- Use multi-stage builds to minimize image size and increase security.
- Optimize the layer structure and caching to speed up builds.
- Use `.dockerignore` to exclude unnecessary files.
- Choose minimal base images to reduce runtime overhead.
- Implement secrets management strategies to protect sensitive data.
- Integrate tests and health checks to ensure application reliability.
Multi-stage builds for language-specific projects
Multi-stage builds are very versatile and can be tailored to different programming languages and frameworks. Each language has its own requirements for compiling, building and packaging, which makes multi-stage builds an indispensable tool for creating optimized images. This section uses language-specific examples to show how multi-stage builds streamline the development and deployment process.
Node.js applications
Scenario:
A React or Angular frontend application that requires dependencies and a build process, but only needs static assets in production.
Dockerfile example:
#Stage 1: Build Stage
FROM node:18-alpine AS builder
WORKDIR /app
#Install the dependencies and build the application
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
#Stage 2: Production stage
FROM nginx:stable-alpine AS production
COPY --from=builder /app/build /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
Key Points:
- In the build phase, the JavaScript code is compiled into static files.
- In the production phase, an NGINX server is used to provide the static files.
- The final image contains only the necessary files for deployment and is therefore lightweight.
Python applications
Scenario:
A Flask or Django backend application that requires dependencies during development but should do without unnecessary tools in production.
Dockerfile example:
#Stage 1: Build Stage
FROM python:3.10-slim AS builder
WORKDIR /app
#Install the dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
#Stage 2: Production stage
FROM python:3.10-slim AS production
WORKDIR /app
#Copy only the necessary files
COPY --from=builder /usr/local/lib/python3.10/site-packages /usr/local/lib/python3.10/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin
COPY . .
EXPOSE 8000
CMD ["gunicorn", "-b", "0.0.0.0:8000", "app:app"]
Key Points:
- The dependencies are installed in the build phase and copied to the production phase.
- Uses `gunicorn` as the production server instead of Flask’s development server.
- The final image does not include development tooling, which reduces size and vulnerabilities.
Java applications
Scenario:
A Java application that uses Maven or Gradle for dependency management and packaging.
Dockerfile example:
#Stage 1: Build stage
FROM maven:3.8.6-openjdk-17 AS builder
WORKDIR /app
#Copy the source code and build
COPY pom.xml .
COPY src ./src
RUN mvn clean package
#Stage 2: Production stage
FROM openjdk:17-slim AS production
WORKDIR /app
#Copy the JAR file from the builder stage
COPY --from=builder /app/target/app.jar .
CMD ["java", "-jar", "app.jar"]
Key Points:
- The Maven build takes place in the first phase so that the dependencies remain isolated.
- The production phase contains only the JAR file and the JRE.
- Using `openjdk:17-slim` ensures a smaller image footprint.
Go applications
Scenario:
A Go application that can be compiled into a single binary file and executed without runtime dependencies.
Dockerfile example:
#Stage 1: Build Stage
FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
RUN go build -o app
#Stage 2: Production stage
FROM alpine:latest AS production
WORKDIR /app
#Copy the compiled binary file
COPY --from=builder /app/app .
CMD ["./app"]
Key Points:
- The build phase compiles the Go application into a binary file.
- For the final image, `alpine` is used, which is extremely lightweight.
- The resulting image size is often less than 10 MB.
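One caveat: with cgo enabled (the default on the `golang` image), the compiled binary may link against glibc and fail to start on Alpine. A common variation disables cgo and ships on `scratch`. This is a sketch, assuming the program needs no OS files beyond the binary itself (CA certificates, for instance, are omitted):

```dockerfile
FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
# CGO_ENABLED=0 produces a statically linked binary that runs
# without libc, so even an empty base image works.
RUN CGO_ENABLED=0 go build -o app

FROM scratch
COPY --from=builder /app/app /app
ENTRYPOINT ["/app"]
```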
PHP applications
Scenario:
A PHP application that uses Composer for dependency management.
Dockerfile example:
#Stage 1: Build Stage
FROM composer:2 AS builder
WORKDIR /app
#Install PHP dependencies
COPY composer.json composer.lock ./
RUN composer install --no-dev --optimize-autoloader
#Stage 2: Production stage
FROM php:8.2-fpm-alpine AS production
WORKDIR /app
#Copy the source code and dependencies
COPY --from=builder /app/vendor ./vendor
COPY . .
CMD ["php-fpm"]
Key points:
- Composer is used in the build phase to install dependencies without development tools.
- In the production phase, PHP-FPM is used for an optimized PHP runtime.
- The final image is minimalistic and focuses solely on deploying PHP code.
C++ applications
Scenario:
A C++ application that requires a compiler during the build but should not use it in production.
Dockerfile example:
#Stage 1: Build stage
FROM gcc:12 AS builder
WORKDIR /app
COPY . .
RUN g++ -o app main.cpp
#Stage 2: Production Stage
FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/app .
CMD ["./app"]
Key points:
- Uses GCC in the build stage for compilation.
- Production stage only contains the compiled binary, which reduces the size considerably.
Models for machine learning (Python)
Scenario:
A Python machine learning application that requires dependencies such as TensorFlow or PyTorch during training but not during inference.
Dockerfile example:
#Stage 1: Training Stage
FROM python:3.10 AS trainer
WORKDIR /app
COPY requirements-train.txt .
RUN pip install -r requirements-train.txt
COPY . .
RUN python train_model.py
#Stage 2: Inference Stage
FROM python:3.10-slim AS inference
WORKDIR /app
COPY requirements-infer.txt .
RUN pip install -r requirements-infer.txt
COPY --from=trainer /app/model.pkl ./model.pkl
CMD ["python", "predict.py"]
Key points:
- Keeps the training and inference environments separate.
- The last stage includes only the trained model and dependencies to minimize the size and dependencies.
Improving CI/CD pipelines with multistage builds
Continuous Integration and Continuous Deployment (CI/CD) pipelines have become an integral part of modern software development, enabling faster releases and higher software quality. Docker’s multi-stage builds complement CI/CD pipelines by simplifying the build, testing and deployment processes. In this section, you will learn how multi-stage builds can improve CI/CD workflows and make deployment more efficient.
Why multi-stage builds are ideal for CI/CD pipelines
Challenges in traditional CI/CD pipelines:
- Large image sizes slow down builds, tests and deployments.
- Dependencies for development and production are bundled, which increases security risks.
- Test phases often require separate scripts or tools, resulting in complex pipelines.
- Managing multiple Dockerfiles for different phases increases maintenance efforts.
Multi-stage build advantages in CI/CD pipelines:
- Single Dockerfile for all stages: Bundles builds, tests and deployments into a single file.
- Optimized images: Creates lightweight production images without build dependencies.
- Consistency across environments: Ensures that development, test and production environments are consistent.
- Simplified deployment workflow: Reduces reliance on external dependency management tools.
CI/CD workflow example with multistage builds
Let’s look at an example of a CI/CD pipeline for a Node.js application with GitHub Actions and Docker multi-stage builds.
Dockerfile:
#Stage 1: Building and testing
FROM node:18-alpine AS builder
WORKDIR /app
#Install dependencies
COPY package.json package-lock.json ./
RUN npm install
#Copy the source code and run the tests
COPY . .
RUN npm test
#Build the application
RUN npm run build
#Stage 2: Production
FROM node:18-alpine AS production
WORKDIR /app
#Copy only the created artifacts
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
#Set the default command
CMD ["node", "dist/server.js"]
GitHub Actions Workflow (`.github/workflows/docker-build.yml`):
name: CI/CD Pipeline

on:
  push:
    branches:
      - main

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Code
        uses: actions/checkout@v3

      - name: Set Up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Build Docker Image
        run: docker build -t my-app:latest .

      - name: Run Tests (builder stage)
        # Tests execute via RUN npm test in the builder stage;
        # the production image contains no test files.
        run: docker build --target builder -t my-app:test .

      - name: Push Image to Docker Hub
        run: |
          echo "${{ secrets.DOCKER_PASSWORD }}" | docker login -u "${{ secrets.DOCKER_USERNAME }}" --password-stdin
          docker tag my-app:latest my-dockerhub-username/my-app:latest
          docker push my-dockerhub-username/my-app:latest

      - name: Deploy to Server
        run: |
          ssh user@server "docker pull my-dockerhub-username/my-app:latest && docker-compose up -d"
Key features in the pipeline:
- Build and Test Stages: Ensures that only successful builds are deployed.
- Multi-stage Dockerfile integration: Creates lean, production-ready images.
- Automated deployment: Deploys the latest image directly to the server.
- Secret management: Uses GitHub secrets for secure authentication of the Docker registry.
CI/CD pipeline stages with multistage builds
Build Stage:
- Installs the dependencies.
- Compiles the code into a usable format (e.g. static files or binary files).
- Optimizes and prepares the code for production.
Test Stage:
- Runs unit tests, integration tests and linting tools.
- Ensures that only successful builds are deployed.
Packaging Stage:
- Creates optimized Docker images.
- Uses multi-stage builds to remove unnecessary files and dependencies.
Deployment Stage:
- Deploys the final production image to a server, Kubernetes cluster or cloud provider.
- Monitors the deployment process for errors.
Best practices for using multistage builds in CI/CD
- Use caching for faster builds:
Structure Dockerfiles to utilize Docker’s caching mechanism.
COPY package.json package-lock.json ./
RUN npm install
COPY . .
- Automate tests in Build Stages:
Include unit and integration tests directly in the build process.
RUN npm test
- Separate build and runtime dependencies:
Keep build dependencies in the first stage and exclude them from the final image to reduce size and security risks.
- Use environment variables for configuration:
Avoid hard-coding sensitive values; pass them in via build arguments or environment variables.
ARG NODE_ENV=production
ENV NODE_ENV=$NODE_ENV
- Optimize images for cloud deployments:
Use minimal base images like `alpine` or `scratch` to reduce costs for cloud deployments.
- Add health checks:
Make sure containers are working correctly after deployment.
HEALTHCHECK CMD curl --fail http://localhost:3000 || exit 1
- Test images locally before deployment:
Always test locally with the same Docker image to avoid inconsistencies between different environments.
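A quick local smoke test might look like this (the port and image tag are assumptions carried over from the earlier example):

```shell
# Build and run the exact image that will be deployed,
# then probe the endpoint it is expected to serve.
docker build -t my-app:latest .
docker run -d --rm -p 3000:3000 --name my-app-smoke my-app:latest
curl --fail http://localhost:3000
docker stop my-app-smoke
```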
Advantages of multistage builds in CI/CD pipelines
- Consistency across environments:
The same Dockerfile can be used for local development, testing and production deployment, reducing discrepancies.
- Faster builds and deployments:
Smaller image sizes reduce build times and speed up the transfer of images during deployment.
- Increased security:
The final images contain only the bare runtime essentials, minimizing vulnerabilities.
- Simplified maintenance:
Managing a single Dockerfile with multiple stages reduces complexity and simplifies updates.
- Cloud-ready deployments:
Multi-stage builds are well suited to modern cloud-native infrastructure such as Kubernetes and serverless platforms.
Debugging multi-stage builds
Debugging Docker multi-stage builds can be challenging, especially when dealing with multiple stages, dependencies and build artifacts. Errors can occur due to syntax issues, mismatched dependencies or incorrect file transfers between stages. This section presents a systematic approach to debugging multi-stage builds that identifies common errors and introduces tools to simplify the debugging process.
Common challenges with multistage builds
Dependency issues:
- Missing dependencies in the final image because they were not copied correctly from the build phase.
- Version mismatches between build and runtime dependencies.
File transfer problems:
- Incorrect paths when copying artifacts between stages with `COPY --from`.
- Missing files due to incorrect `.dockerignore` rules.
Build errors in intermediate stages:
- Errors when installing the build tools or when compiling the code.
- Missing environment variables required for builds.
Caching issues:
- Docker cache is not invalidated when source files change, resulting in outdated builds.
Runtime errors in the production stage:
- Configuration issues related to environment variables.
- The application cannot be started because important files or binaries are missing.
Troubleshooting in the intermediate stage
Inspecting intermediate stages
Use the `--target` flag to build up to a specific stage:
docker build --target builder -t debug-stage .
- This command stops the build after the specified stage (e.g. “builder”).
- You can then run the container interactively to check files:
docker run -it debug-stage sh
Key benefit:
Allows you to test the intermediate stages of the build before proceeding with the final image.
Saving intermediate images
To save an intermediate image for further analysis:
docker save debug-stage -o debug-stage.tar
Then load it later for verification:
docker load -i debug-stage.tar
Verify file transfers between stages
Check copies of artifacts
List the files in the intermediate container to check whether the artifacts were copied correctly:
docker run --rm debug-stage ls -l /app
Common problem:
Path mismatches between stages often lead to missing files.
Solution:
Make sure that the source and destination paths in the `COPY --from` instruction match exactly:
COPY --from=builder /app/dist ./dist
Debugging build caches
Problem:
Changes to the source files may not trigger a rebuild due to Docker’s caching mechanism.
Solution:
Use the `--no-cache` flag to force a rebuild of all layers:
docker build --no-cache -t my-app .
Alternatively, you can reorder the Dockerfile directives to maximize cache reuse for unmodified steps:
COPY package.json package-lock.json ./
RUN npm install
COPY . .
RUN npm run build
Tip:
Put commands that change frequently (e.g. `COPY . .`) near the end to avoid invalidating caches unnecessarily.
Debugging environment variables
Problem:
Environment variables used during the build may not be available at runtime.
Solution:
- Check the environment within a running container:
docker exec -it <container_id> env
- Check the variables passed during the build with the ARG and ENV statements:
ARG NODE_ENV=production
ENV NODE_ENV=$NODE_ENV
Pass the arguments when building:
docker build --build-arg NODE_ENV=production -t my-app .
Debugging application errors in the final stage
Problem:
The application cannot be started because required files or dependencies are missing in the production image.
Solution:
- Compare the content of the final container with the intermediate stage of the build:
docker run --rm production-image ls -l /app
docker run --rm builder-image ls -l /app
- Check the logs and errors:
docker logs <container_id>
Common corrections:
- Make sure that all necessary files are copied with COPY --from.
- Validate the runtime dependencies with:
ldd <binary_name>
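As a quick sketch of what ldd reports (shown here on an ordinary system binary; the exact libraries listed vary by distribution):

```shell
# Print the shared libraries this binary is linked against.
# After copying a binary between stages, "not found" entries here mean
# the runtime image is missing a required library (a common issue when
# a glibc-built binary lands in a musl-based Alpine image).
ldd /bin/ls
```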
Debugging multi-platform builds
Problem:
Multi-platform builds can fail due to incompatible binaries or architecture-specific issues.
Solution:
- Create a build for a specific platform:
docker buildx build --platform linux/amd64 -t my-app .
- Test the build on the desired platform with the Docker Desktop emulation or deploy it to a test server before production.
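As a sketch of that workflow (these commands require a Buildx-enabled Docker daemon with emulation set up; the image tag is illustrative), a single-platform build can be loaded locally and smoke-tested before deployment:

```shell
# Build for arm64 only and load the result into the local daemon
docker buildx build --platform linux/arm64 --load -t my-app:arm64 .

# Run it under emulation (Docker Desktop ships with binfmt/QEMU configured)
docker run --rm --platform linux/arm64 my-app:arm64
```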
Tools for debugging Docker builds
Dive
- Analyze the Docker images layer by layer.
- Check which files contribute the most to the size of the image.
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock wagoodman/dive:latest <image_name>
Hadolint
- Linter for Dockerfiles to detect syntax errors and violations of best practices.
docker run --rm -i hadolint/hadolint < Dockerfile
BuildKit
- Enhanced builder for Docker with improved debugging features.
DOCKER_BUILDKIT=1 docker build -t my-app .
Docker Squash
- Reduces the size of the image by merging layers, useful for debugging multi-stage builds.
docker-squash -t my-app:squashed <image_id>
Debugging logs and errors
- Check container logs:
docker logs <container_id>
- Run the container in interactive mode:
docker run -it --entrypoint sh my-app
- Monitor the build logs with verbose output:
docker build --progress=plain -t my-app .
Advanced techniques and use cases
Multi-stage builds in Docker are not just about reducing the size of the image — they also enable advanced workflows, optimizations and customizations for complex development and deployment scenarios. This section looks at advanced techniques and real-world use cases where multi-stage builds excel.
Combining multistage builds with caching strategies
Problem:
Docker builds can be slow, especially for applications with large dependencies or frequently changing codebases.
Solution:
Use Docker’s caching mechanisms to speed up builds:
Example for Node.js applications:
#Use cache-friendly steps for dependencies
FROM node:18-alpine AS builder
WORKDIR /app
#Cache dependencies
COPY package.json package-lock.json ./
RUN npm install
#Copy the source code and build
COPY . .
RUN npm run build
Key insights:
- The npm install step is cached as long as package.json and package-lock.json do not change.
- Only the source code is rebuilt when changes are made, which saves time.
Multistage test pipelines
Problem:
Tests are often skipped when building production images, which leads to runtime errors.
Solution:
Build tests into intermediate stages to ensure quality before deployment.
Example for Python applications:
#Stage 1: Install dependencies and run tests
FROM python:3.10 AS tester
WORKDIR /app
COPY requirements-dev.txt .
RUN pip install -r requirements-dev.txt
COPY . .
RUN pytest tests/
#Stage 2: Production image
FROM python:3.10-slim AS production
WORKDIR /app
#Install runtime dependencies (assumes a separate requirements.txt in the build context)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY --from=tester /app/src ./src
CMD ["python", "src/app.py"]
Key findings:
- In the first phase, the tests are executed and only successful builds go into production.
- Test dependencies are excluded from the final image to keep it lean.
Use Secrets Management
Problem:
Embedding sensitive credentials (e.g. API keys, SSH keys) in Docker images can lead to security vulnerabilities.
Solution:
Use the Docker BuildKit to securely manage secrets during build without storing them in the final image.
Example for Go applications:
#Enable BuildKit
#DOCKER_BUILDKIT=1 docker build --secret id=ssh_key,src=~/.ssh/id_rsa -t my-app .
#Stage 1: Build stage with secret
FROM golang:1.20 AS builder
WORKDIR /app
RUN --mount=type=secret,id=ssh_key mkdir -p ~/.ssh && cp /run/secrets/ssh_key ~/.ssh/id_rsa && chmod 600 ~/.ssh/id_rsa
COPY . .
RUN go build -o app
#Stage 2: Production
FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/app .
CMD ["./app"]
Key insights:
- Secrets are passed securely and not stored in the final image.
- Requires BuildKit-enabled builds for handling secrets.
Multiplatform builds
Problem:
Applications often need to support multiple architectures (e.g. amd64 for servers and arm64 for Raspberry Pi).
Solution:
Create multiplatform images with Docker Buildx.
Example for Java applications:
docker buildx create --use
docker buildx build --platform linux/amd64,linux/arm64 -t my-java-app .
Insights:
- Enables cross-platform builds with a single command.
- Ideal for IoT and hybrid infrastructures.
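Inside the Dockerfile itself, BuildKit exposes automatic platform build arguments (BUILDPLATFORM, TARGETOS, TARGETARCH) that a multi-stage build can use for cross-compilation. A minimal sketch for a hypothetical Go service:

```dockerfile
#Stage 1: Compile on the build host's native platform, targeting each requested platform
FROM --platform=$BUILDPLATFORM golang:1.20 AS builder
ARG TARGETOS TARGETARCH
WORKDIR /app
COPY . .
#CGO disabled so the binary is static and runs on musl-based Alpine
RUN CGO_ENABLED=0 GOOS=$TARGETOS GOARCH=$TARGETARCH go build -o app

#Stage 2: Minimal runtime image for each target platform
FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/app .
CMD ["./app"]
```

Built with docker buildx build --platform linux/amd64,linux/arm64, each platform variant reuses the same Dockerfile while compiling natively on the build host.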
On-the-fly compilation for performance optimization
Problem:
Applications with compiled languages (e.g. C++, Go) often contain compilers in the image, which increases its size.
Solution:
Separate compilation from runtime execution with multi-stage builds.
Example for C++ applications:
#Stage 1: Build
FROM gcc:12 AS builder
WORKDIR /app
COPY . .
#Link statically so the glibc-built binary also runs on musl-based Alpine
RUN g++ -static -o app main.cpp
#Stage 2: Production
FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/app .
CMD ["./app"]
Key findings:
- In the build phase, a large image is used for compilation.
- In the production phase, only the compiled binary file is used, which significantly reduces the size.
Use temporary debug stages
Problem:
Debugging production builds is difficult if intermediate files are lost during multi-stage builds.
Solution:
Add temporary debugging stages to the Dockerfile.
Example debug stage:
FROM node:18-alpine AS builder
WORKDIR /app
COPY . .
RUN npm install && npm run build
#Debug stage
FROM builder AS debug
CMD ["sh"]
#Final production stage
FROM node:18-alpine AS production
WORKDIR /app
COPY --from=builder /app/dist ./dist
CMD ["node", "dist/server.js"]
Key Insights:
- The debug stage allows developers to run shell commands for debugging.
- It can be skipped in production but kept for local testing.
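To use such a debug stage, build only up to that target and open its shell (the tag name is illustrative):

```shell
# Stop the build at the debug stage and drop into a shell there
docker build --target debug -t my-app:debug .
docker run --rm -it my-app:debug
```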
Run multiple services in one Dockerfile
Problem:
Applications with microservices often require multiple images for different services, which increases the maintenance effort.
Solution:
Use multi-stage builds to combine services in one Dockerfile.
Example for frontend and backend services:
#Backend
FROM python:3.10 AS backend
WORKDIR /app
COPY backend/ .
RUN pip install -r requirements.txt
#Frontend
FROM node:18-alpine AS frontend
WORKDIR /app
COPY frontend/ .
RUN npm install && npm run build
#Production
FROM nginx:stable-alpine AS production
COPY --from=frontend /app/build /usr/share/nginx/html
COPY --from=backend /app/api /usr/share/nginx/api
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
Key Insights:
- Combines multiple services (frontend and backend) in one build pipeline.
- Simplifies deployment for microservices architectures.
Dealing with static and dynamic assets
Problem:
Separating static assets (images, CSS, JavaScript) and server-side code can be a challenge.
Solution:
Use multi-stage builds to process and deploy static elements separately.
Example:
#Build static files
FROM node:18-alpine AS static
WORKDIR /app
COPY . .
RUN npm install && npm run build
#Deploy static files
FROM nginx:stable-alpine
COPY --from=static /app/build /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
Key Insights:
- Separates asset compilation from server-side processing.
- Ideal for static websites or single-page applications.
Conclusion and best practices
Multi-stage Docker builds have revolutionized the process of creating and deploying containerized applications. They enable developers to create lean, secure and production-ready images by separating the build and runtime environments. This section summarizes key learnings, provides best practices for writing efficient Dockerfiles, and offers guidance for further exploration of multi-stage builds.
Summary of the benefits of multi-stage builds
Optimized image size
Multi-stage builds significantly reduce the size of Docker images by including only the necessary runtime dependencies and omitting build tools and source code. Smaller images:
- Reduce storage requirements.
- Improve deployment speed.
- Reduce the attack surface and increase security.
Simplified build pipelines
With multiple stages in a single Dockerfile, developers can:
- Consolidate build, test and deployment processes.
- Eliminate the need for complex shell scripts or multiple Dockerfiles.
- Maintain consistency between development, test and production environments.
Increased security
Multi-stage builds minimize vulnerabilities by removing unnecessary binaries, tools and libraries. They also support the management of secrets during build time without exposing sensitive information in the final image.
Improved CI/CD integration
Multi-stage builds integrate seamlessly into CI/CD pipelines and make automated tests, builds and deployments more efficient. They promote reliability and scalability in cloud-based environments.
Best practices for multistage builds
Use minimal base images
- Choose lean base images like alpine or scratch for the production stages.
- Avoid large images with unnecessary utilities unless they are needed for the build process.
Example:
FROM node:18-alpine AS production
Optimize layer caching
- Utilize Docker’s caching mechanism by ordering instructions from the least frequently changing to the most frequently changing.
- Separate dependency installation from source code copying to maximize cache reuse.
Example:
COPY package.json package-lock.json ./
RUN npm install
COPY . .
RUN npm run build
Use multi-stage builds for testing
- Include unit and integration tests in the build stages to detect problems early.
- If the tests fail, the build is aborted so that only verified code is provided.
Example:
RUN npm test
Remove unnecessary files
- Use .dockerignore to exclude logs, temporary files and sensitive data from the build context.
Example .dockerignore:
node_modules
*.log
.env
.git
Secure handling of secrets
- Use Docker BuildKit to include secrets during the build without storing them in the final image.
- Never hard-code sensitive data in Dockerfiles.
Example (BuildKit):
DOCKER_BUILDKIT=1 docker build --secret id=ssh_key,src=~/.ssh/id_rsa -t app .
Add health checks
- Use the HEALTHCHECK instruction to monitor application health; an orchestrator can then restart containers that report as unhealthy.
Example:
HEALTHCHECK CMD curl --fail http://localhost:3000 || exit 1
Keep Dockerfiles readable and maintainable
- Use comments to document each step and its purpose.
- Name the build stages clearly with AS to improve readability.
Example:
#Build stage
FROM node:18-alpine AS builder
Build for multiple architectures
- Use Docker Buildx to target multiple platforms like amd64 and arm64 for broader compatibility.
Example:
docker buildx build --platform linux/amd64,linux/arm64 -t my-app .
Test locally before deployment
- Test each stage by building up to a specific target using the --target flag.
Example:
docker build --target builder -t debug-stage .
docker run -it debug-stage sh
Monitor image size
- Use tools like Dive to inspect Docker images layer by layer and find ways to reduce their size.
Command:
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock wagoodman/dive:latest <image_name>
Encouragement to experiment
Multi-stage builds provide endless opportunities for optimization and customization. Developers are encouraged to:
- Test new patterns for complex workflows, such as parallel builds for microservices.
- Experiment with advanced features like BuildKit and Buildx for caching, secret management and multi-platform support.
- Integrate tools like Jenkins, GitHub Actions and Kubernetes for scalable CI/CD pipelines.
Pitfalls to avoid
- Overcomplicated Dockerfiles: Avoid excessive stages and commands; simplify by grouping related tasks.
- Skipped cache optimization: Reinstalling unchanged dependencies wastes time; use caching effectively.
- Omitting tests in the build stages: Always run tests as part of the build to ensure stability in production.
- Forgetting to clean up temporary files: Remove build artifacts and unused dependencies before finalizing the image.
Whether you’re developing a small web application, a backend API or a machine learning pipeline, multi-stage builds simplify the process and improve scalability, performance and security. Start experimenting today and unlock the full potential of Docker multi-stage builds!

