
Docker Multistage Build: A Complete Guide
Introduction
In today’s fast-paced software development landscape, Docker has become a cornerstone technology for building, deploying and running applications in lightweight, portable containers. It simplifies deployment by encapsulating applications and their dependencies in isolated environments, ensuring consistency between development, test and production systems.
However, as applications become more complex, the size of Docker images can increase significantly, leading to slower builds, increased memory requirements and inefficient deployments. This is particularly problematic in resource-constrained environments such as CI/CD pipelines or cloud platforms, where optimization has a direct impact on performance and cost.
This is where Docker Multi-Stage Builds come into play. Introduced in Docker 17.05, multi-stage builds have revolutionized Dockerfile creation by allowing developers to create cleaner, smaller and more secure images without sacrificing build flexibility. By splitting the build process into multiple stages, unnecessary dependencies and artifacts can be removed from the final production image, resulting in optimized containers that are ready for immediate use.
In this blog post, we’ll dive deep into the concept of multi-stage builds, explore their benefits and go through practical examples and use cases. Whether you’re a beginner or an experienced developer, mastering multi-stage builds can drastically improve your Docker workflows and help you deploy production-ready applications more efficiently.
What are Docker Multistage Builds?
Docker Multi-Stage Builds are a powerful feature introduced in Docker 17.05 to streamline the process of building and optimizing container images. They allow developers to define multiple stages within a single Dockerfile, creating leaner and more efficient images by removing unnecessary dependencies from the final production image during build time.
How do Multistage Builds work?
Traditionally, Docker images were built in a one-step process where all build tools, dependencies and artifacts were combined into one package. This often resulted in bloated images that contained components that were only needed during the build process but were irrelevant to the runtime.
Multi-stage builds solve this problem by splitting the build and runtime environments into separate stages. Each stage in the Dockerfile starts with a `FROM` statement, and developers can copy artifacts such as compiled binaries, libraries or static files from one stage to another using `COPY --from`. This modular approach lets you refine the image incrementally, leaving unnecessary layers behind.
Example of a Multistage build
Here is a simple example of a multi-stage build for a Node.js application:
#Stage 1: Build Stage
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
#Stage 2: Production Stage
FROM node:18-alpine AS production
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/server.js"]
Explanation:
- Stage 1 (Builder):
- Installs the dependencies and compiles the application code.
- Source files and build tools stay behind in this stage once the build completes; they never reach the final image.
- Stage 2 (Production):
- Copies only the compiled output (`dist`) and the required `node_modules` from the previous stage.
- The result is a minimal image containing only the runtime environment needed to run the application.
Comparison with traditional builds
- Single-Stage Builds:
- Larger image size as build tools and dependencies are not removed.
- Potential security risks due to leftover development tools.
- Slower deployment times and higher storage costs.
- Multistage Builds:
- Smaller, cleaner and production-optimized end images.
- Better security by eliminating unnecessary components.
- Easier maintenance and improved CI/CD workflows.
By using multi-stage builds, developers can effectively separate the creation and operation of their applications, resulting in more manageable and powerful Docker images.
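To see the size difference yourself, build both variants and compare the reported sizes (a sketch; the `my-node-app` tag is an assumption from the example above):

```shell
# List local images with their sizes; the multi-stage image
# should be noticeably smaller than a single-stage build.
docker image ls my-node-app
```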
Why use Multistage builds?
Creating efficient and secure Docker images is a top priority for developers, especially as applications grow in size and complexity. Multi-stage builds offer a streamlined approach to creating production-ready images by removing some of the limitations of traditional single-stage builds. Here is a closer look at why multi-stage builds have become an indispensable tool in modern Docker workflows:
Smaller image sizes
One of the main motivations for introducing multi-stage builds is to reduce the size of Docker images. Traditional builds often contain unnecessary dependencies, libraries and tools that are only needed during the build process and remain in the final image, adding to its size.
Example problem:
A Node.js application might ship `npm` and development dependencies in the final image, even though only the compiled JavaScript files are required at runtime.
Multi-stage solution:
Multi-stage builds allow you to install dependencies, compile code and run tests in intermediate stages, copying only the most important results to the final stage. The result is smaller, production-ready images.
Improved security
Reducing the image size not only improves performance, but also minimizes the attack surface. Images with fewer packages and tools are less likely to contain vulnerabilities that attackers can exploit.
Example
Development tools such as compilers and debugging utilities are often unnecessary in production environments and can pose security risks if exposed. Multi-stage builds ensure that these tools are not included in the final image, leaving a minimal and hardened runtime environment.
Simplified build pipelines
Multi-stage builds integrate seamlessly into CI/CD pipelines and make it easier to automate builds, tests and deployments. By defining multiple stages in a single Dockerfile, developers can consolidate complex build processes without relying on external scripts or tools.
Example workflow:
- Stage 1: Build the application and run tests.
- Stage 2: Extract only the compiled artifacts.
- Stage 3: Package the runtime image for deployment.
This approach eliminates the need to maintain multiple Dockerfiles or custom shell scripts, simplifying build and deployment pipelines.
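The three stages above can be sketched in a single Dockerfile. This is a minimal sketch for a hypothetical Node.js project; the script names (`test`, `build`) and file paths are assumptions:

```dockerfile
# Stage 1: build the application and run tests
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm test && npm run build

# Stage 2: extract only the compiled artifacts
FROM node:18-alpine AS artifacts
WORKDIR /app
COPY --from=builder /app/dist ./dist

# Stage 3: package the runtime image for deployment
FROM node:18-alpine AS runtime
WORKDIR /app
COPY --from=artifacts /app/dist ./dist
CMD ["node", "dist/server.js"]
```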
Faster build and deployment times
Multi-stage builds effectively utilize Docker’s caching mechanisms. Since each stage of the build process is cached independently, only the layers that change need to be rebuilt.
Example benefit:
If code changes only affect a specific stage (e.g. the application source code), the dependency and base image layers remain cached, speeding up rebuild times during development and deployment.
Isolation of the environment
Multi-stage builds allow developers to isolate different parts of the build process in separate environments. This ensures that the tools and dependencies required to build the application do not conflict with the final runtime environment.
Example use case:
- Use a large image (e.g. Ubuntu) in the build phase to compile the code.
- Use a lightweight base image (e.g. Alpine Linux) in the production phase for minimal runtime requirements.
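As a sketch of this pattern, a C program could be compiled on a full Ubuntu image and shipped on Alpine. The file names are hypothetical, and static linking is assumed so the binary runs on Alpine’s musl libc:

```dockerfile
# Build stage: full Ubuntu environment with a compiler toolchain
FROM ubuntu:22.04 AS builder
RUN apt-get update && apt-get install -y gcc && rm -rf /var/lib/apt/lists/*
WORKDIR /src
COPY main.c .
# Static linking so the binary has no glibc dependency at runtime
RUN gcc -static -o app main.c

# Production stage: minimal Alpine runtime
FROM alpine:3.19
COPY --from=builder /src/app /usr/local/bin/app
CMD ["app"]
```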
Easier maintenance and updates
By making the build process modular, multi-stage builds make it easier to maintain Dockerfiles over time. Developers can update certain stages without impacting other stages, reducing the risk of breaking the build.
Example
Updating dependencies or changing build tools only affects the build phase and leaves the runtime phase unaffected.
Cost efficiency
Smaller and optimized Docker images reduce storage and data transfer costs, particularly in cloud environments where costs increase with resource usage. Multi-stage builds allow teams to create cost-efficient images without compromising functionality.
Real-World Use Cases
- Frontend applications: Build React or Angular applications in one phase and deliver static files with NGINX in the final phase.
- Backend applications: Compile Java or Go binaries in one build phase and copy them to a minimal runtime image.
- Machine learning models: Train models in the build phase and provide only the serialized models and dependencies needed for inference.
Key takeaways
Multi-stage builds are not only about reducing image size, but also about increasing security, simplifying pipelines and improving performance. Whether you’re developing monolithic applications or microservices, adopting multi-stage builds can significantly optimize your Docker workflow and deployment process.
Setting up a basic Multistage Build
Docker multi-stage builds simplify the creation of optimized and lightweight container images. In this section, we’ll walk through the steps to set up a basic multi-stage build and explain the key concepts required to create efficient Dockerfiles.
The key components of a Multistage build
A multi-stage build consists of multiple “FROM” statements within the same Dockerfile, with each “FROM” starting a new stage. You can name these stages and selectively copy artifacts between them.
The most important commands used in multi-stage builds:
- `FROM <image>` – specifies the base image for each stage.
- `AS <name>` – assigns a name to a stage so it can be referenced more easily later.
- `COPY --from=<stage>` – copies files or artifacts from a previous stage into the current stage.
Example: Node.js application
Let’s create a simple multi-stage Dockerfile for a Node.js application.
Dockerfile:
#Stage 1: Build stage
FROM node:18-alpine AS builder
WORKDIR /app
#Copy dependencies and install them
COPY package.json package-lock.json ./
RUN npm install
#Copy source code and build the app
COPY . .
RUN npm run build
#Stage 2: Production stage
FROM node:18-alpine AS production
WORKDIR /app
#Copy only the built files and node modules from the builder stage
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
#Define the command to run the app
CMD ["node", "dist/server.js"]
Step-by-step explanation
Stage 1 – Build stage:
FROM node:18-alpine AS builder
- Uses a lightweight Alpine Linux image with Node.js installed.
- Names this stage `builder` so it can be referenced in later stages.
WORKDIR /app
- Defines the working directory within the container.
COPY package.json package-lock.json ./
- Copies dependency-related files into the container.
RUN npm install
- Installs all dependencies required to develop and build the app.
COPY . .
- Copies the entire source code of the application into the container.
RUN npm run build
- Compiles the source code (e.g. transpiling TypeScript or bundling JavaScript).
Stage 2 – Production stage:
FROM node:18-alpine AS production
- Starts a new stage with the same lightweight Node.js base image.
WORKDIR /app
- Sets up the working directory for the production environment.
COPY --from=builder /app/dist ./dist
- Copies only the compiled `dist` folder from the `builder` stage.
COPY --from=builder /app/node_modules ./node_modules
- Copies only the required `node_modules` from the previous stage.
CMD ["node", "dist/server.js"]
- Defines the command to start the application when the container is running.
Executing the Multistage build
To build and run the Docker image:
- Build the image:
docker build -t my-node-app .
- Start the container:
docker run -p 3000:3000 my-node-app
- Check the application:
- Open `http://localhost:3000` in your browser to check that the application is running.
Observe the advantages
- Smaller image size:
The production image contains no development dependencies and no source files, making it significantly smaller.
- Faster deployment:
Lightweight images pull faster and can be deployed quickly in cloud environments.
- Simplified debugging:
Intermediate stages can be inspected during the build process:
docker build --target builder -t temp-build .
docker run -it temp-build sh
Common variations
- Use different base images:
Use a full-featured base image for the build stage and a minimal image (e.g. `scratch` or `alpine`) for production.
- Add tests in the build stage:
Integrate tests into the build stage and only continue if they pass:
RUN npm test
- Multi-platform builds:
Build images for different architectures with:
docker buildx build --platform linux/arm64,linux/amd64 -t my-image .
Optimization of production builds
Optimizing Docker production builds is critical to creating lightweight, secure, high-performance containers. Multi-stage builds provide a framework to achieve these goals by separating build dependencies from runtime requirements. This section introduces techniques to further refine production builds with multi-stage Dockerfiles.
Stripping development dependencies
Problem:
Development tools, compilers and libraries used during the build process often remain in the final image, increasing its size and attack surface.
Solution:
Use multi-stage builds to exclude these dependencies from the production phase.
Example for a Node.js application:
#Stage 1: Build stage
FROM node:18-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm install
COPY . .
RUN npm run build
#Stage 2: Production stage
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/server.js"]
Key optimizations:
- The `npm install` command runs in the builder stage, but only the `dist` folder and the required dependencies are copied to the production stage.
- Development dependencies are never included in the final image.
Minimize layers
Problem:
Docker images consist of multiple layers, and too many layers increase complexity and size.
Solution:
- Combine commands into fewer layers wherever possible.
- Use `&&` to chain commands and `\` line continuations for readability.
Example
RUN apt-get update && apt-get install -y \
curl \
git \
&& rm -rf /var/lib/apt/lists/*
Key optimizations:
- Reduces the number of layers by combining commands into one step.
- Cleans up temporary files (`rm -rf /var/lib/apt/lists/*`) to minimize leftover artifacts.
Use of .dockerignore
Problem:
Copying unnecessary files (e.g. logs, temporary files and local configurations) into the image leads to its enlargement.
Solution:
Use a `.dockerignore` file to exclude unwanted files and directories.
Example `.dockerignore`:
node_modules
npm-debug.log
Dockerfile
.dockerignore
.git
.idea
*.md
.env
Key optimizations:
- Ensures that sensitive files and unnecessary artifacts are excluded from the context of image creation.
- Reduces creation time and prevents accidental disclosure of credentials.
Selection of minimal base images
Problem:
Using large base images adds unnecessary overhead to the final container.
Solution:
Choose smaller, security-oriented base images such as `alpine` or `scratch`.
Example:
FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
RUN go build -o app
#Minimum runtime image
FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/app .
CMD ["./app"]
Key optimizations:
- Uses `golang` for the build and the lightweight `alpine` image for production.
- The result is a final image of less than 10 MB, compared to 800+ MB for a traditional single-stage image.
Multi-stage secret management
Problem:
Sensitive data such as API keys or SSH credentials can be exposed if they are included in the final image.
Solution:
Use multi-stage builds to handle secrets securely and prevent them from persisting in the final image.
Example:
#Stage 1: Build stage
FROM golang:1.20 AS builder
WORKDIR /app
ARG API_KEY
ENV API_KEY=$API_KEY
RUN echo $API_KEY > ./key.txt
RUN go build -o app
#Stage 2: Final stage
FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/app .
CMD ["./app"]
Key optimization:
- Passes secrets as build arguments (`ARG API_KEY`) only in the build stage.
- Secrets are excluded from the final image, since only the compiled binary is copied over.
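Note that build arguments are still visible in the metadata (`docker history`) of the builder stage, so for real credentials a BuildKit secret mount is generally safer. A hedged sketch; the secret id `api_key` and the file paths are assumptions:

```dockerfile
# syntax=docker/dockerfile:1
FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
# The secret is mounted only for this RUN step and is never
# written into any image layer or the build history.
RUN --mount=type=secret,id=api_key \
    API_KEY="$(cat /run/secrets/api_key)" go build -o app

FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/app .
CMD ["./app"]
```

The secret is supplied at build time, e.g. `docker build --secret id=api_key,src=./api_key.txt -t my-app .`.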
Multiplatform builds
Problem:
Applications must run on different architectures (e.g. x86, ARM).
Solution:
Use Docker BuildKit and multi-platform builds to support different architectures.
Example:
docker buildx build --platform linux/amd64,linux/arm64 -t my-app .
Key optimizations:
- Enables builds for multiple architectures without the need for separate Dockerfiles.
- Ensures compatibility between different environments.
Leveraging Build Cache
Problem:
Building Docker images from scratch can be slow, especially in CI/CD pipelines.
Solution:
Optimize caching by carefully structuring Dockerfile instructions.
Example:
#Copy the dependencies first to use the caching
COPY package.json package-lock.json ./
RUN npm install
#Copy the rest of the source code
COPY . .
RUN npm run build
Key optimizations:
- Separates dependencies and source code to maximize caching of layers.
- Accelerates incremental builds by reusing cached layers.
Add health checks
Problem:
If the application stops responding after deployment, the failure can go undetected without a health probe.
Solution:
Use Docker’s `HEALTHCHECK` instruction to define health probes.
Example:
HEALTHCHECK CMD curl --fail http://localhost:3000 || exit 1
Key optimization:
- Enables monitoring tools to check the state of the container and restart it if necessary.
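The `HEALTHCHECK` instruction also accepts tuning options. A sketch with explicit timings; the probed endpoint is an assumption:

```dockerfile
# Probe every 30s, fail a probe after 3s, allow 5s for startup,
# and mark the container unhealthy after 3 consecutive failures.
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD curl --fail http://localhost:3000/health || exit 1
```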
Testing in build stages
Problem:
Skipping tests during the build can lead to undetected problems in production.
Solution:
Run tests as part of the build phase to detect bugs early.
Example:
RUN npm test
Key optimizations:
- Ensures that only tested code gets into production and thus reduces runtime errors.
Final recommendations
- Use multi-stage builds to minimize image size and increase security.
- Optimize the layer structure and caching to speed up builds.
- Use `.dockerignore` to exclude unnecessary files.
- Choose minimal base images to reduce runtime overhead.
- Implement secrets management strategies to protect sensitive data.
- Integrate tests and health checks to ensure application reliability.
Multi-stage builds for language-specific projects
Multi-stage builds are very versatile and can be tailored to different programming languages and frameworks. Each language has its own requirements for compiling, building and packaging, which makes multi-stage builds an indispensable tool for creating optimized images. This section uses language-specific examples to show how multi-stage builds streamline the development and deployment process.
Node.js applications
Scenario:
A React or Angular frontend application that requires dependencies and a build process, but only needs static assets in production.
Dockerfile example:
#Stage 1: Build Stage
FROM node:18-alpine AS builder
WORKDIR /app
#Install the dependencies and build the application
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
#Stage 2: Production stage
FROM nginx:stable-alpine AS production
COPY --from=builder /app/build /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
Key Points:
- In the build phase, the JavaScript code is compiled into static files.
- In the production phase, an NGINX server is used to provide the static files.
- The final image contains only the necessary files for deployment and is therefore lightweight.
Python applications
Scenario:
A Flask or Django backend application that requires dependencies during development but should do without unnecessary tools in production.
Dockerfile example:
#Stage 1: Build Stage
FROM python:3.10-slim AS builder
WORKDIR /app
#Install the dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
#Stage 2: Production stage
FROM python:3.10-slim AS production
WORKDIR /app
#Copy only the necessary files
COPY --from=builder /usr/local/lib/python3.10/site-packages /usr/local/lib/python3.10/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin
COPY . .
EXPOSE 8000
CMD ["gunicorn", "-b", "0.0.0.0:8000", "app:app"]
Key Points:
- The dependencies are installed in the build phase and copied to the production phase.
- Uses `gunicorn` as the production server instead of Flask’s development server.
- The final image does not include development tooling, which reduces size and vulnerabilities.
Java applications
Scenario:
A Java application that uses Maven or Gradle for dependency management and packaging.
Dockerfile example:
#Stage 1: Build stage
FROM maven:3.8.6-openjdk-17 AS builder
WORKDIR /app
#Copy the source code and build
COPY pom.xml .
COPY src ./src
RUN mvn clean package
#Stage 2: Production stage
FROM openjdk:17-slim AS production
WORKDIR /app
#Copy the JAR file from the builder stage
COPY --from=builder /app/target/app.jar .
CMD ["java", "-jar", "app.jar"]
Key Points:
- The Maven build takes place in the first phase so that the dependencies remain isolated.
- The production phase contains only the JAR file and the JRE.
- Using `openjdk:17-slim` ensures a smaller image footprint.
Go applications
Scenario:
A Go application that can be compiled into a single binary file and executed without runtime dependencies.
Dockerfile example:
#Stage 1: Build Stage
FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
RUN go build -o app
#Stage 2: Production stage
FROM alpine:latest AS production
WORKDIR /app
#Copy the compiled binary file
COPY --from=builder /app/app .
CMD ["./app"]
Key Points:
- The build phase compiles the Go application into a binary file.
- For the final image, `alpine` is used, which is extremely lightweight.
- The resulting image size is often less than 10 MB.
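One caveat: with cgo enabled (the default on the `golang` image), the compiled binary may link against glibc and fail to start on Alpine. A common variation disables cgo and ships on `scratch`. This is a sketch, assuming the program needs no OS files beyond the binary itself (CA certificates, for instance, are omitted):

```dockerfile
FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
# CGO_ENABLED=0 produces a statically linked binary that runs
# without libc, so even an empty base image works.
RUN CGO_ENABLED=0 go build -o app

FROM scratch
COPY --from=builder /app/app /app
ENTRYPOINT ["/app"]
```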
PHP applications
Scenario:
A PHP application that uses Composer for dependency management.
Dockerfile example:
#Stage 1: Build Stage
FROM composer:2 AS builder
WORKDIR /app
#Install PHP dependencies
COPY composer.json composer.lock ./
RUN composer install --no-dev --optimize-autoloader
#Stage 2: Production stage
FROM php:8.2-fpm-alpine AS production
WORKDIR /app
#Copy the source code and dependencies
COPY --from=builder /app/vendor ./vendor
COPY . .
CMD ["php-fpm"]
Key points:
- Composer is used in the build phase to install dependencies without development tools.
- In the production phase, PHP-FPM is used for an optimized PHP runtime.
- The final image is minimalistic and focuses solely on deploying PHP code.
C++ applications
Scenario:
A C++ application that requires a compiler during the build but should not use it in production.
Dockerfile example:
#Stage 1: Build stage
FROM gcc:12 AS builder
WORKDIR /app
COPY . .
RUN g++ -o app main.cpp
#Stage 2: Production Stage
FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/app .
CMD ["./app"]
Key points:
- Uses GCC in the build stage for compilation.
- Production stage only contains the compiled binary, which reduces the size considerably.
Models for machine learning (Python)
Scenario:
A Python machine learning application that requires dependencies such as TensorFlow or PyTorch during training but not during inference.
Dockerfile example:
#Stage 1: Training Stage
FROM python:3.10 AS trainer
WORKDIR /app
COPY requirements-train.txt .
RUN pip install -r requirements-train.txt
COPY . .
RUN python train_model.py
#Stage 2: Inference Stage
FROM python:3.10-slim AS inference
WORKDIR /app
COPY requirements-infer.txt .
RUN pip install -r requirements-infer.txt
COPY --from=trainer /app/model.pkl ./model.pkl
CMD ["python", "predict.py"]
Key points:
- Keeps the training and inference environments separate.
- The last stage includes only the trained model and dependencies to minimize the size and dependencies.
Improving CI/CD pipelines with multistage builds
Continuous Integration and Continuous Deployment (CI/CD) pipelines have become an integral part of modern software development, enabling faster releases and higher software quality. Docker’s multi-stage builds complement CI/CD pipelines by simplifying the build, testing and deployment processes. In this section, you will learn how multi-stage builds can improve CI/CD workflows and make deployment more efficient.
Why multi-stage builds are ideal for CI/CD pipelines
Challenges in traditional CI/CD pipelines:
- Large image sizes slow down builds, tests and deployments.
- Dependencies for development and production are bundled, which increases security risks.
- Test phases often require separate scripts or tools, resulting in complex pipelines.
- Managing multiple Dockerfiles for different phases increases maintenance efforts.
Multi-stage build advantages in CI/CD pipelines:
- Single Dockerfile for all stages: Bundles builds, tests and deployments into a single file.
- Optimized images: Creates lightweight production images without build dependencies.
- Consistency across environments: Ensures that development, test and production environments are consistent.
- Simplified deployment workflow: Reduces reliance on external dependency management tools.
CI/CD workflow example with multistage builds
Let’s look at an example of a CI/CD pipeline for a Node.js application with GitHub Actions and Docker multi-stage builds.
Dockerfile:
#Stage 1: Building and testing
FROM node:18-alpine AS builder
WORKDIR /app
#Install dependencies
COPY package.json package-lock.json ./
RUN npm install
#Copy the source code and run the tests
COPY . .
RUN npm test
#Build the application
RUN npm run build
#Stage 2: Production
FROM node:18-alpine AS production
WORKDIR /app
#Copy only the created artifacts
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
#Set the default command
CMD ["node", "dist/server.js"]
GitHub Actions Workflow (`.github/workflows/docker-build.yml`):
name: CI/CD Pipeline

on:
  push:
    branches:
      - main

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Code
        uses: actions/checkout@v3

      - name: Set Up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Build Docker Image
        run: docker build -t my-app:latest .

      - name: Run Tests (builder stage)
        # Tests execute via RUN npm test in the builder stage;
        # the production image contains no test files.
        run: docker build --target builder -t my-app:test .

      - name: Push Image to Docker Hub
        run: |
          echo "${{ secrets.DOCKER_PASSWORD }}" | docker login -u "${{ secrets.DOCKER_USERNAME }}" --password-stdin
          docker tag my-app:latest my-dockerhub-username/my-app:latest
          docker push my-dockerhub-username/my-app:latest

      - name: Deploy to Server
        run: |
          ssh user@server "docker pull my-dockerhub-username/my-app:latest && docker-compose up -d"
Key features in the pipeline:
- Build and Test Stages: Ensures that only successful builds are deployed.
- Multi-stage Dockerfile integration: Creates lean, production-ready images.
- Automated deployment: Deploys the latest image directly to the server.
- Secret management: Uses GitHub secrets for secure authentication of the Docker registry.
CI/CD pipeline stages with multistage builds
Build Stage:
- Installs the dependencies.
- Compiles the code into a usable format (e.g. static files or binary files).
- Optimizes and prepares the code for production.
Test Stage:
- Runs unit tests, integration tests and linting tools.
- Ensures that only successful builds are deployed.
Packaging Stage:
- Creates optimized Docker images.
- Uses multi-stage builds to remove unnecessary files and dependencies.
Deployment Stage:
- Deploys the final production image to a server, Kubernetes cluster or cloud provider.
- Monitors the deployment process for errors.
Best practices for using multistage builds in CI/CD
- Use caching for faster builds:
Structure Dockerfiles to utilize Docker’s caching mechanism.
COPY package.json package-lock.json ./
RUN npm install
COPY . .
- Automate tests in Build Stages:
Include unit and integration tests directly in the build process.
RUN npm test
- Separate build and runtime dependencies:
Keep build dependencies in the first stage and exclude them from the final image to reduce size and security risks.
- Use environment variables for configuration:
Avoid hard-coding sensitive values; pass them in via build arguments or environment variables.
ARG NODE_ENV=production
ENV NODE_ENV=$NODE_ENV
- Optimize images for cloud deployments:
Use minimal base images like `alpine` or `scratch` to reduce costs for cloud deployments.
- Add health checks:
Make sure containers are working correctly after deployment.
HEALTHCHECK CMD curl --fail http://localhost:3000 || exit 1
- Test images locally before deployment:
Always test locally with the same Docker image to avoid inconsistencies between different environments.
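A quick local smoke test might look like this (the port and image tag are assumptions carried over from the earlier example):

```shell
# Build and run the exact image that will be deployed,
# then probe the endpoint it is expected to serve.
docker build -t my-app:latest .
docker run -d --rm -p 3000:3000 --name my-app-smoke my-app:latest
curl --fail http://localhost:3000
docker stop my-app-smoke
```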
Advantages of multistage builds in CI/CD pipelines
- Consistency across environments:
The same Dockerfile can be used for local development, testing and production deployment, reducing discrepancies.
- Faster builds and deployments:
Smaller image sizes reduce build times and speed up the transfer of images during deployment.
- Increased security:
The final images contain only the bare runtime essentials, minimizing vulnerabilities.
- Simplified maintenance:
Managing a single Dockerfile with multiple stages reduces complexity and simplifies updates.
- Cloud-ready deployments:
Multi-stage builds are well suited to modern cloud-native infrastructure such as Kubernetes and serverless platforms.
Debugging multi-stage builds
Debugging Docker multi-stage builds can be challenging, especially when dealing with multiple stages, dependencies and build artifacts. Errors can occur due to syntax issues, mismatched dependencies or incorrect file transfers between stages. This section presents a systematic approach to debugging multi-stage builds that identifies common errors and introduces tools to simplify the debugging process.
Common challenges with multistage builds
Dependency issues:
- Missing dependencies in the final image because they were not copied correctly from the build phase.
- Version mismatches between build and runtime dependencies.
File transfer problems:
- Incorrect paths when copying artifacts between stages with `COPY --from`.
- Missing files due to incorrect `.dockerignore` rules.
Build errors in intermediate stages:
- Errors when installing the build tools or when compiling the code.
- Missing environment variables required for builds.
Caching issues:
- Docker cache is not invalidated when source files change, resulting in outdated builds.
Runtime errors in the production stage:
- Configuration issues related to environment variables.
- The application cannot be started because important files or binaries are missing.
Troubleshooting in the intermediate stage
Inspecting intermediate stages
Use the `--target` flag to build up to a specific stage:
docker build --target builder -t debug-stage .
- This command stops the build after the specified stage (e.g. “builder”).
- You can then run the container interactively to check files:
docker run -it debug-stage sh
Key benefit:
Allows you to test the intermediate stages of the build before proceeding with the final image.
Saving intermediate images
To save an intermediate image for further analysis:
docker save debug-stage -o debug-stage.tar
Then load it later for verification:
docker load -i debug-stage.tar
Verify file transfers between stages
Check copies of artifacts
List the files in the intermediate container to check whether the artifacts were copied correctly:
docker run --rm debug-stage ls -l /app
Common problem:
Path mismatches between stages often lead to missing files.
Solution:
Make sure that the source and destination paths in the `COPY --from` instruction match exactly:
COPY --from=builder /app/dist ./dist
Debugging build caches
Problem:
Changes to the source files may not trigger a rebuild due to Docker’s caching mechanism.
Solution:
Use the `--no-cache` flag to force a rebuild of all layers:
docker build --no-cache -t my-app .
Alternatively, you can reorder the Dockerfile directives to maximize cache reuse for unmodified steps:
COPY package.json package-lock.json ./
RUN npm install
COPY . .
RUN npm run build
Tip:
Put commands that change frequently (e.g. `COPY . .`) near the end to avoid invalidating caches unnecessarily.
Debugging environment variables
Problem:
Environment variables used during the build may not be available at runtime.
Solution:
- Check the environment within a running container:
docker exec -it <container_id> env
- Check the variables passed during the build with the ARG and ENV statements:
ARG NODE_ENV=production
ENV NODE_ENV=$NODE_ENV
Pass the arguments when building:
docker build --build-arg NODE_ENV=production -t my-app .
Debugging application errors in the final stage
Problem:
The application cannot be started because required files or dependencies are missing in the production image.
Solution:
- Compare the content of the final container with the intermediate stage of the build:
docker run --rm production-image ls -l /app
docker run --rm builder-image ls -l /app
- Check the logs and errors:
docker logs <container_id>
Common corrections:
- Make sure that all necessary files are copied with COPY --from.
- Validate the runtime dependencies with:
ldd <binary_name>
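As a quick sketch of what ldd reports (shown here on an ordinary system binary; the exact libraries listed vary by distribution):

```shell
# Print the shared libraries this binary is linked against.
# After copying a binary between stages, "not found" entries here mean
# the runtime image is missing a required library (a common issue when
# a glibc-built binary lands in a musl-based Alpine image).
ldd /bin/ls
```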
Debugging multi-platform builds
Problem:
Multi-platform builds can fail due to incompatible binaries or architecture-specific issues.
Solution:
- Create a build for a specific platform:
docker buildx build --platform linux/amd64 -t my-app .
- Test the build on the desired platform with the Docker Desktop emulation or deploy it to a test server before production.
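As a sketch of that workflow (these commands require a Buildx-enabled Docker daemon with emulation set up; the image tag is illustrative), a single-platform build can be loaded locally and smoke-tested before deployment:

```shell
# Build for arm64 only and load the result into the local daemon
docker buildx build --platform linux/arm64 --load -t my-app:arm64 .

# Run it under emulation (Docker Desktop ships with binfmt/QEMU configured)
docker run --rm --platform linux/arm64 my-app:arm64
```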
Tools for debugging Docker builds
Dive
- Analyze the Docker images layer by layer.
- Check which files contribute the most to the size of the image.
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock wagoodman/dive:latest <image_name>
Hadolint
- Linter for Dockerfiles to detect syntax errors and violations of best practices.
docker run --rm -i hadolint/hadolint < Dockerfile
BuildKit
- Enhanced builder for Docker with improved debugging features.
DOCKER_BUILDKIT=1 docker build -t my-app .
Docker Squash
- Reduces the size of the image by merging layers, useful for debugging multi-stage builds.
docker-squash -t my-app:squashed <image_id>
Debugging logs and errors
- Check container logs:
docker logs <container_id>
- Run the container in interactive mode:
docker run -it --entrypoint sh my-app
- Monitor the build logs with verbose output:
docker build --progress=plain -t my-app .
Advanced techniques and use cases
Multi-stage builds in Docker are not just about reducing the size of the image — they also enable advanced workflows, optimizations and customizations for complex development and deployment scenarios. This section looks at advanced techniques and real-world use cases where multi-stage builds excel.
Combining multistage builds with caching strategies
Problem:
Docker builds can be slow, especially for applications with large dependencies or frequently changing codebases.
Solution:
Use Docker’s caching mechanisms to speed up builds:
Example for Node.js applications:
#Use cache-friendly steps for dependencies
FROM node:18-alpine AS builder
WORKDIR /app
#Cache dependencies
COPY package.json package-lock.json ./
RUN npm install
#Copy the source code and build
COPY . .
RUN npm run build
Key insights:
- The npm install step is cached as long as package.json and package-lock.json do not change.
- Only the source code is rebuilt when changes are made, which saves time.
Multistage test pipelines
Problem:
Tests are often skipped when building production images, which leads to runtime errors.
Solution:
Build tests into intermediate stages to ensure quality before deployment.
Example for Python applications:
#Stage 1: Install dependencies and run tests
FROM python:3.10 AS tester
WORKDIR /app
COPY requirements-dev.txt .
RUN pip install -r requirements-dev.txt
COPY . .
RUN pytest tests/
#Stage 2: Production image
FROM python:3.10-slim AS production
WORKDIR /app
#Install runtime dependencies (assumes a separate requirements.txt in the build context)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY --from=tester /app/src ./src
CMD ["python", "src/app.py"]
Key findings:
- In the first phase, the tests are executed and only successful builds go into production.
- Test dependencies are excluded from the final image to keep it lean.
Use Secrets Management
Problem:
Embedding sensitive credentials (e.g. API keys, SSH keys) in Docker images can lead to security vulnerabilities.
Solution:
Use the Docker BuildKit to securely manage secrets during build without storing them in the final image.
Example for Go applications:
#Enable BuildKit
#DOCKER_BUILDKIT=1 docker build --secret id=ssh_key,src=~/.ssh/id_rsa -t my-app .
#Stage 1: Build stage with secret
FROM golang:1.20 AS builder
WORKDIR /app
RUN --mount=type=secret,id=ssh_key mkdir -p ~/.ssh && cp /run/secrets/ssh_key ~/.ssh/id_rsa && chmod 600 ~/.ssh/id_rsa
COPY . .
RUN go build -o app
#Stage 2: Production
FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/app .
CMD ["./app"]
Key insights:
- Secrets are passed securely and not stored in the final image.
- Requires BuildKit-enabled builds for handling secrets.
Multiplatform builds
Problem:
Applications often need to support multiple architectures (e.g. amd64 for servers and arm64 for Raspberry Pi).
Solution:
Create multiplatform images with Docker Buildx.
Example for Java applications:
docker buildx create --use
docker buildx build --platform linux/amd64,linux/arm64 -t my-java-app .
Insights:
- Enables cross-platform builds with a single command.
- Ideal for IoT and hybrid infrastructures.
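Inside the Dockerfile itself, BuildKit exposes automatic platform build arguments (BUILDPLATFORM, TARGETOS, TARGETARCH) that a multi-stage build can use for cross-compilation. A minimal sketch for a hypothetical Go service:

```dockerfile
#Stage 1: Compile on the build host's native platform, targeting each requested platform
FROM --platform=$BUILDPLATFORM golang:1.20 AS builder
ARG TARGETOS TARGETARCH
WORKDIR /app
COPY . .
#CGO disabled so the binary is static and runs on musl-based Alpine
RUN CGO_ENABLED=0 GOOS=$TARGETOS GOARCH=$TARGETARCH go build -o app

#Stage 2: Minimal runtime image for each target platform
FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/app .
CMD ["./app"]
```

Built with docker buildx build --platform linux/amd64,linux/arm64, each platform variant reuses the same Dockerfile while compiling natively on the build host.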
On-the-fly compilation for performance optimization
Problem:
Applications with compiled languages (e.g. C++, Go) often contain compilers in the image, which increases its size.
Solution:
Separate compilation from runtime execution with multi-stage builds.
Example for C++ applications:
#Stage 1: Build
FROM gcc:12 AS builder
WORKDIR /app
COPY . .
#Link statically so the glibc-built binary also runs on musl-based Alpine
RUN g++ -static -o app main.cpp
#Stage 2: Production
FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/app .
CMD ["./app"]
Key findings:
- In the build phase, a large image is used for compilation.
- In the production phase, only the compiled binary file is used, which significantly reduces the size.
Use temporary debug stages
Problem:
Debugging production builds is difficult if intermediate files are lost during multi-stage builds.
Solution:
Add temporary debugging stages to the Dockerfile.
Example debug stage:
FROM node:18-alpine AS builder
WORKDIR /app
COPY . .
RUN npm install && npm run build
#Debug stage
FROM builder AS debug
CMD ["sh"]
#Final production stage
FROM node:18-alpine AS production
WORKDIR /app
COPY --from=builder /app/dist ./dist
CMD ["node", "dist/server.js"]
Key Insights:
- The debug stage allows developers to run shell commands for debugging.
- It can be skipped in production but kept for local testing.
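To use such a debug stage, build only up to that target and open its shell (the tag name is illustrative):

```shell
# Stop the build at the debug stage and drop into a shell there
docker build --target debug -t my-app:debug .
docker run --rm -it my-app:debug
```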
Run multiple services in one Dockerfile
Problem:
Applications with microservices often require multiple images for different services, which increases the maintenance effort.
Solution:
Use multi-stage builds to combine services in one Dockerfile.
Example for frontend and backend services:
#Backend
FROM python:3.10 AS backend
WORKDIR /app
COPY backend/ .
RUN pip install -r requirements.txt
#Frontend
FROM node:18-alpine AS frontend
WORKDIR /app
COPY frontend/ .
RUN npm install && npm run build
#Production
FROM nginx:stable-alpine AS production
COPY --from=frontend /app/build /usr/share/nginx/html
COPY --from=backend /app/api /usr/share/nginx/api
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
Key Insights:
- Combines multiple services (frontend and backend) in one build pipeline.
- Simplifies deployment for microservices architectures.
Dealing with static and dynamic assets
Problem:
Separating static assets (images, CSS, JavaScript) and server-side code can be a challenge.
Solution:
Use multi-stage builds to process and deploy static elements separately.
Example:
#Build static files
FROM node:18-alpine AS static
WORKDIR /app
COPY . .
RUN npm install && npm run build
#Deploy static files
FROM nginx:stable-alpine
COPY --from=static /app/build /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
Key Insights:
- Separates asset compilation from server-side processing.
- Ideal for static websites or single-page applications.
Conclusion and best practices
Multi-stage Docker builds have revolutionized the process of creating and deploying containerized applications. They enable developers to create lean, secure and production-ready images by separating the build and runtime environments. This section summarizes key learnings, provides best practices for writing efficient Dockerfiles, and offers guidance for further exploration of multi-stage builds.
Summary of the benefits of multi-stage builds
Optimized image size
Multi-stage builds significantly reduce the size of Docker images by including only the necessary runtime dependencies and omitting build tools and source code. Smaller images:
- Reduce storage requirements.
- Improve deployment speed.
- Reduce the attack surface and increase security.
Simplified build pipelines
With multiple stages in a single Dockerfile, developers can:
- Consolidate build, test and deployment processes.
- Eliminate the need for complex shell scripts or multiple Dockerfiles.
- Maintain consistency between development, test and production environments.
Increased security
Multi-stage builds minimize vulnerabilities by removing unnecessary binaries, tools and libraries. They also support the management of secrets during build time without exposing sensitive information in the final image.
Improved CI/CD integration
Multi-stage builds integrate seamlessly into CI/CD pipelines and make automated tests, builds and deployments more efficient. They promote reliability and scalability in cloud-based environments.
Best practices for multistage builds
Use minimal base images
- Choose lean base images like alpine or scratch for the production stages.
- Avoid large images with unnecessary utilities unless they are needed for the build process.
Example:
FROM node:18-alpine AS production
Optimize layer caching
- Utilize Docker’s caching mechanism by ordering instructions from the least frequently changing to the most frequently changing.
- Separate dependency installation from source code copying to maximize cache reuse.
Example:
COPY package.json package-lock.json ./
RUN npm install
COPY . .
RUN npm run build
Use multi-stage builds for testing
- Include unit and integration tests in the build stages to detect problems early.
- If the tests fail, the build is aborted so that only verified code is provided.
Example:
RUN npm test
Remove unnecessary files
- Use .dockerignore to exclude logs, temporary files and sensitive data from the build context.
Example .dockerignore:
node_modules
*.log
.env
.git
Secure handling of secrets
- Use Docker BuildKit to include secrets during the build without storing them in the final image.
- Never hard-code sensitive data in Dockerfiles.
Example (BuildKit):
DOCKER_BUILDKIT=1 docker build --secret id=ssh_key,src=~/.ssh/id_rsa -t app .
Add health checks
- Use the HEALTHCHECK instruction to monitor application health; an orchestrator can then restart containers that report as unhealthy.
Example:
HEALTHCHECK CMD curl --fail http://localhost:3000 || exit 1
Keep Dockerfiles readable and maintainable
- Use comments to document each step and its purpose.
- Name the build stages clearly with AS to improve readability.
Example:
#Build stage
FROM node:18-alpine AS builder
Build for multiple architectures
- Use Docker Buildx to target multiple platforms like amd64 and arm64 for broader compatibility.
Example:
docker buildx build --platform linux/amd64,linux/arm64 -t my-app .
Test locally before deployment
- Test each stage by building up to a specific target using the --target flag.
Example:
docker build --target builder -t debug-stage .
docker run -it debug-stage sh
Monitor image size
- Use tools like Dive to inspect Docker images layer by layer and find ways to reduce their size.
Command:
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock wagoodman/dive:latest <image_name>
Encouragement to experiment
Multi-stage builds provide endless opportunities for optimization and customization. Developers are encouraged to:
- Test new patterns for complex workflows, such as parallel builds for microservices.
- Experiment with advanced features like BuildKit and Buildx for caching, secret management and multi-platform support.
- Integrate tools like Jenkins, GitHub Actions and Kubernetes for scalable CI/CD pipelines.
Pitfalls to avoid
- Overcomplicated Dockerfiles: Avoid excessive stages and commands; simplify by grouping related tasks.
- Skipped cache optimization: Reinstalling unchanged dependencies wastes time; use caching effectively.
- Omitting tests in the build stages: Always run tests as part of the build to ensure stability in production.
- Forgetting to clean up temporary files: Remove build artifacts and unused dependencies before finalizing the image.
Whether you’re developing a small web application, a backend API or a machine learning pipeline, multi-stage builds simplify the process and improve scalability, performance and security. Start experimenting today and unlock the full potential of Docker multi-stage builds!

