Reverse Engineering Docker Images into Dockerfiles with Dive and Dedockify
This tutorial explains how to analyze Docker image internals, use the Dive tool and Docker Engine Python API to extract layer information, and employ the Dedockify script to reconstruct a functional Dockerfile from any pre‑built container image.
Introduction
As public Docker registries become more popular, administrators and developers often pull images from unknown sources, treating them as black boxes without verifying their safety or origin. Rebuilding a Dockerfile from an existing image is possible because most of the required information is stored in the image layers.
Using Dive
Dive is a visual tool that inspects each layer of a Docker image. By creating a simple Dockerfile, building an image (example1), and running Dive, you can see the files added in each layer and the commands that created them.
mkdir $HOME/test1
cd $HOME/test1
cat > Dockerfile <<EOF
FROM scratch
COPY testfile1 /
COPY testfile2 /
COPY testfile3 /
EOF
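touch testfile1 testfile2 testfile3
docker build . -t example1
Running Dive on example1 (dive example1) shows the three COPY commands and the layer each one produces; because the test files are empty, the layers carry almost no data.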
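Dive's per-layer view can be approximated by hand. The sketch below assumes the image was first exported with docker save (for example, docker save example1 -o example1.tar) and that the archive contains a manifest.json listing the layer tarballs in build order. The function name files_per_layer is hypothetical and is not part of Dive.

```python
import io
import json
import tarfile


def files_per_layer(archive):
    """Return [(layer_path, [file names])] for an image archive.

    `archive` is a seekable binary file object holding `docker save` output.
    """
    result = []
    with tarfile.open(fileobj=archive, mode="r:*") as image_tar:
        # manifest.json lists the layer tarballs in build order.
        manifest = json.load(image_tar.extractfile("manifest.json"))
        for layer_path in manifest[0]["Layers"]:
            # Each layer is itself a tar archive nested inside the image archive.
            layer_bytes = image_tar.extractfile(layer_path).read()
            with tarfile.open(fileobj=io.BytesIO(layer_bytes), mode="r:*") as layer_tar:
                result.append((layer_path, layer_tar.getnames()))
    return result
```

Opening example1.tar in binary mode and passing the file object to files_per_layer should list one test file per layer, which is the same information Dive presents interactively.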
Docker History
The built‑in docker history command lists the command that created each layer. Adding the --no-trunc flag reveals the full #(nop) COPY statements, although each source filename is replaced by a file:<hash> identifier.
docker history example1 --no-trunc
Using Python Docker Engine API
The Docker Engine API for Python can retrieve the same history data programmatically.
#!/usr/bin/python3
import docker
cli = docker.APIClient(base_url='unix://var/run/docker.sock')
print(cli.history('example1'))
The call returns the history as a list of entries, newest first. Parsing those entries and reversing their order yields a skeleton Dockerfile.
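The parsing step can be sketched as follows. This is a minimal illustration, not Dedockify's actual code: it assumes each history entry is a dict with a CreatedBy string in the classic builder format, where shell commands appear as "/bin/sh -c ..." and other instructions follow a "#(nop)" marker. The function name history_to_dockerfile and the file:aaa style identifiers are hypothetical.

```python
NOP_PREFIX = "/bin/sh -c #(nop) "
RUN_PREFIX = "/bin/sh -c "


def history_to_dockerfile(history):
    """Turn Docker history entries (newest first) into Dockerfile lines."""
    lines = []
    # The API returns the newest layer first, so walk the list backwards.
    for entry in reversed(history):
        created_by = entry.get("CreatedBy", "").strip()
        if not created_by:
            continue  # nothing recorded for this layer
        if created_by.startswith(NOP_PREFIX):
            # Non-shell instructions (COPY, CMD, ENV, ...) follow the "#(nop)" marker.
            lines.append(created_by[len(NOP_PREFIX):].strip())
        elif created_by.startswith(RUN_PREFIX):
            # A plain shell invocation corresponds to a RUN instruction.
            lines.append("RUN " + created_by[len(RUN_PREFIX):].strip())
        else:
            # Unknown format: keep the text as-is for manual review.
            lines.append(created_by)
    return lines


# Hand-written sample entries standing in for cli.history('example1').
sample = [
    {"CreatedBy": "/bin/sh -c #(nop) COPY file:ccc in / "},
    {"CreatedBy": "/bin/sh -c mkdir testdir1"},
    {"CreatedBy": "/bin/sh -c #(nop) COPY file:aaa in / "},
]
print("\n".join(history_to_dockerfile(sample)))
```

Passing the list returned by cli.history('example1') in place of sample should produce the COPY lines in build order. Images built with a different builder may record CreatedBy in another format, which this sketch passes through unchanged.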
Dedockify
Dedockify is a Python script that automates the parsing of docker history output and prints a reconstructed Dockerfile. Running it on example1 produces:
FROM example1:latest
COPY file:e3c862... in /
COPY file:2a949a... in /
COPY file:aa717f... in /
The script correctly identifies the COPY steps but may mis‑guess the base image (it shows example1:latest instead of scratch).
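One possible way to guess a base image, which is not something the source says Dedockify does, is to compare layer lists: an image built FROM another image starts with that image's layers. The sketch below assumes the layer identifiers were read from the RootFS.Layers field returned by the Docker API's inspect_image call for the target and for each locally available candidate. The function name find_base_image and the sample digests are hypothetical.

```python
def find_base_image(target_layers, candidates):
    """Return the candidate whose layers form the longest prefix of target_layers.

    `candidates` maps an image name to its ordered list of layer identifiers.
    Returns None when no candidate matches (for example, a FROM scratch image).
    """
    best_name, best_len = None, 0
    for name, layers in candidates.items():
        n = len(layers)
        # A base image contributes the first n layers of the derived image.
        if 0 < n <= len(target_layers) and target_layers[:n] == layers and n > best_len:
            best_name, best_len = name, n
    return best_name


# Made-up digests for illustration only.
known = {
    "ubuntu:latest": ["sha256:u1"],
    "alpine:latest": ["sha256:a1"],
}
print(find_base_image(["sha256:u1", "sha256:x1"], known))
print(find_base_image(["sha256:x1"], known))
```

A None result only means that none of the supplied candidates matched; it does not prove the image was built from scratch.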
Generating Initial Dockerfile
Building a more realistic image (example2) on top of ubuntu:latest and running Dedockify on it yields output that closely matches the original Dockerfile, including the RUN and COPY commands.
FROM ubuntu:latest
RUN mkdir testdir1
COPY testfile1 /testdir1
RUN mkdir testdir2
COPY testfile2 /testdir2
RUN mkdir testdir3
COPY testfile3 /testdir3
Arbitrary Dockerfile Reconstruction
For a multi‑stage image (example3) that includes a compiled binary, the workflow is:
1. Load the image from a base64 archive.
2. Run Dedockify to obtain a skeleton Dockerfile.
3. Use Dive to discover the exact file paths and names inside the layers.
4. Copy the needed files from a running container.
5. Assemble a complete Dockerfile with WORKDIR, COPY, and ENTRYPOINT statements.
The final reconstructed Dockerfile successfully builds a new image that behaves identically to the original.
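The file-copy step in the workflow above can also be scripted. The Docker Engine Python API exposes get_archive(container, path), which returns a tar stream as an iterable of byte chunks together with a stat dict. The sketch below covers only the unpacking half so that it runs without a Docker daemon; the function name unpack_archive_stream is hypothetical, and it deliberately extracts regular files only.

```python
import io
import os
import tarfile


def unpack_archive_stream(chunks, dest_dir):
    """Write regular files from a tar stream (iterable of bytes) into dest_dir.

    Returns the member names that were written.
    """
    data = b"".join(chunks)
    root = os.path.realpath(dest_dir)
    written = []
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories, links and devices in this sketch
            target = os.path.realpath(os.path.join(root, member.name))
            if not target.startswith(root + os.sep):
                continue  # refuse paths that would escape dest_dir
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "wb") as out:
                out.write(tar.extractfile(member).read())
            written.append(member.name)
    return written
```

With a client such as cli = docker.APIClient(...), a call like stream, stat = cli.get_archive(container_id, path) would supply the chunks, where container_id and path are whatever Dive revealed for the image under study. The path check matters here because the archive comes from an image of unknown origin.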
Conclusion
Combining Dive, Docker history, and the Dedockify script enables reliable reverse engineering of Docker images into usable Dockerfiles. While multi‑stage builds and certain ADD operations may still require manual inference, the approach works for most single‑stage images and can be further automated to infer base images and extract files directly from containers.
Raymond Ops
Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.