
Reverse‑Engineer Docker Images into Dockerfiles with Dive and Dedockify

This guide explains how to dissect Docker images, inspect their layers with Dive, extract build history via Docker commands and the Python Docker Engine API, and automatically reconstruct a functional Dockerfile using the open‑source Dedockify tool, complete with code snippets and practical examples.

Liangxu Linux

Introduction

Public Docker registries make it easy to pull images from unknown sources, turning containers into black boxes whose provenance and security are hard to verify. By examining a Docker image’s internal layers, we can recover most of the information needed to rebuild its Dockerfile.

Using Dive

Dive is a visual tool that inspects each layer of an image. The workflow starts with a minimal Dockerfile, builds an image, and then runs Dive to explore the resulting layers.

mkdir -p $HOME/test1
cd $HOME/test1
# Create the three files that the COPY instructions expect
touch testfile1 testfile2 testfile3
cat > Dockerfile << EOF
FROM scratch
COPY testfile1 /
COPY testfile2 /
COPY testfile3 /
EOF

docker build . -t example1

docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock wagoodman/dive:latest example1

Dive shows the three COPY commands, the hash of each added file, and the directory tree for each layer.

Docker History

The built-in docker history command lists the command that created each layer, newest first. Adding --no-trunc reveals the full #(nop) COPY … lines, which are essential for reconstructing the Dockerfile.

docker history example1 --no-trunc
# output shows full COPY commands with hashed file identifiers
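
As a rough illustration of how such history lines map back to Dockerfile instructions, the sketch below converts a single history string into an instruction. The helper name created_by_to_instruction and the sample strings are invented for this example. It assumes the "/bin/sh -c #(nop) " prefix shown by the classic builder; images built with BuildKit may record their commands in a different form. Unlike the Dedockify output shown later in this article, this sketch also strips the "/bin/sh -c" prefix from RUN steps.

```python
NOP_MARKER = "#(nop) "
SHELL_PREFIX = "/bin/sh -c "


def created_by_to_instruction(created_by: str) -> str:
    """Turn one history string into a best-guess Dockerfile line.

    Hypothetical helper for illustration only.
    """
    text = created_by.strip()
    if NOP_MARKER in text:
        # Metadata steps (COPY, ENV, CMD, ...) are stored after the marker.
        return text.split(NOP_MARKER, 1)[1].strip()
    # Anything else was executed in a shell, so treat it as a RUN step.
    if text.startswith(SHELL_PREFIX):
        text = text[len(SHELL_PREFIX):]
    return "RUN " + text


# Invented sample strings in the style of `docker history --no-trunc`.
print(created_by_to_instruction("/bin/sh -c #(nop) COPY file:abc123 in / "))
# COPY file:abc123 in /
print(created_by_to_instruction("/bin/sh -c mkdir testdir1"))
# RUN mkdir testdir1
```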

Python Docker Engine API

Docker provides a Python client for the Engine API. The following script queries an image’s history and prints the result, a list of dictionaries decoded from the API’s JSON response.

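#!/usr/bin/python3
import docker

# Low-level client that talks to the local daemon over its Unix socket
cli = docker.APIClient(base_url='unix://var/run/docker.sock')
# history() returns one dictionary per layer, newest first
print(cli.history('example1'))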

The output contains the CreatedBy strings that correspond to the original Dockerfile instructions.
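
For orientation, here is a sketch of what working with that result might look like. The sample_history list is invented data that mimics the shape of the entries (dictionaries with keys such as Id, CreatedBy, Size, and Tags); on a real system it would come from cli.history('example1'). The helper name build_order_commands is likewise hypothetical.

```python
def build_order_commands(history):
    """Return the CreatedBy strings oldest-first.

    The history lists the newest layer first, so the list is reversed
    to match the order of instructions in a Dockerfile.
    """
    return [entry.get("CreatedBy", "") for entry in reversed(history)]


# Invented sample shaped like the list returned by APIClient.history().
sample_history = [
    {"Id": "sha256:ccc", "CreatedBy": "/bin/sh -c #(nop) COPY file:3 in / ",
     "Size": 0, "Tags": ["example1:latest"]},
    {"Id": "<missing>", "CreatedBy": "/bin/sh -c #(nop) COPY file:2 in / ",
     "Size": 0, "Tags": None},
    {"Id": "<missing>", "CreatedBy": "/bin/sh -c #(nop) COPY file:1 in / ",
     "Size": 0, "Tags": None},
]

for command in build_order_commands(sample_history):
    print(command)
```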

Dedockify

Dedockify is a Python utility that parses the history data, reverses the command order, and prints a reconstructed Dockerfile. The core logic treats the text after each #(nop) marker as a Dockerfile instruction and formats every other history entry as a RUN statement.

from sys import argv
import docker

class ImageNotFound(Exception):
    pass

class MainObj:
    def __init__(self):
        self.commands = []
        self.cli = docker.APIClient(base_url='unix://var/run/docker.sock')
        # The image ID is taken from the last command-line argument
        self._get_image(argv[-1])
        # History is returned newest first
        self.hist = self.cli.history(self.img['RepoTags'][0])
        self._parse_history()
        # Reverse so the instructions appear in Dockerfile order
        self.commands.reverse()
        self._print_commands()
    # ... (methods omitted for brevity) ...

if __name__ == '__main__':
    MainObj()
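
The excerpt calls a _get_image method that is not shown, then reads self.img['RepoTags'][0]. One plausible reading, offered here as an assumption rather than the tool's actual code, is that the image is looked up by matching the command-line argument against the Id field of each locally stored image. The sketch below illustrates that idea on invented data shaped like the list returned by APIClient.images(). The names find_image and history_reference are hypothetical, and the fallback to the full Id for untagged images is an addition of this example.

```python
class ImageNotFound(Exception):
    """Raised when no local image matches the requested ID fragment."""


def find_image(images, id_fragment):
    """Return the first image whose Id contains id_fragment."""
    for image in images:
        if id_fragment in image.get("Id", ""):
            return image
    raise ImageNotFound(f"Image {id_fragment} not found")


def history_reference(image):
    """Pick a name to pass to history(): the first tag, else the full Id."""
    tags = image.get("RepoTags") or []
    return tags[0] if tags else image["Id"]


# Invented sample shaped like the list returned by APIClient.images().
sample_images = [
    {"Id": "sha256:aaaa1111", "RepoTags": ["demo:latest"]},
    {"Id": "sha256:bbbb2222", "RepoTags": None},
]

print(history_reference(find_image(sample_images, "aaaa1111")))
# demo:latest
print(history_reference(find_image(sample_images, "bbbb2222")))
# sha256:bbbb2222
```

Indexing RepoTags directly, as the excerpt does, would fail for an image that has no tag, which is why the sketch checks for an empty or missing list first.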

Rebuilding Dockerfiles

Running Dedockify on example1 yields a Dockerfile that correctly lists the three COPY commands but mistakenly assumes the base image is example1:latest instead of scratch. A more realistic example uses an ubuntu:latest base, which Dedockify reconstructs accurately:

$ python3 dedockify.py 05651f084d67
FROM ubuntu:latest
RUN /bin/sh -c mkdir testdir1
COPY file:cc4f6e89... in /testdir1
RUN /bin/sh -c mkdir testdir2
COPY file:a04cdcdf... in /testdir2
RUN /bin/sh -c mkdir testdir3
COPY file:2ed8ccde... in /testdir3
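
The difference between the two results plausibly comes down to whether another tagged image appears further down the history. The sketch below shows one heuristic for this, not Dedockify's own logic: the first entry belongs to the image itself, the next entry that carries a tag is taken as the base image, and scratch is assumed when no such entry exists. The function name split_at_base and both sample histories are invented, and the approach assumes the daemon reports the base image's tag in the history, as the ubuntu example above suggests.

```python
def split_at_base(history, default_base="scratch"):
    """Split a newest-first history into (base_image, own_entries)."""
    for index, entry in enumerate(history):
        # Entry 0 belongs to the image itself; a tag further down marks
        # the top layer of another local image, taken here as the base.
        if index > 0 and entry.get("Tags"):
            return entry["Tags"][0], history[:index]
    # No other tagged image in the chain: assume the build began from scratch.
    return default_base, list(history)


# Invented histories, newest entry first.
on_ubuntu = [
    {"CreatedBy": "/bin/sh -c mkdir testdir1", "Tags": ["demo:latest"]},
    {"CreatedBy": '/bin/sh -c #(nop)  CMD ["bash"]', "Tags": ["ubuntu:latest"]},
    {"CreatedBy": "/bin/sh -c #(nop) ADD file:0 in / ", "Tags": None},
]
from_scratch = [
    {"CreatedBy": "/bin/sh -c #(nop) COPY file:2 in / ", "Tags": ["demo2:latest"]},
    {"CreatedBy": "/bin/sh -c #(nop) COPY file:1 in / ", "Tags": None},
]

base, own = split_at_base(on_ubuntu)
print(base, len(own))   # ubuntu:latest 1
base, own = split_at_base(from_scratch)
print(base, len(own))   # scratch 2
```

A tag on an intermediate image would also be picked up as the base, so the result is a starting point for manual review, not a guarantee.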

For a multi-stage image (example3), the tool recovers the final stage’s commands, including WORKDIR and COPY of a compiled binary, allowing a full rebuild.

Limitations and Further Work

Dedockify cannot recover original file names (they are hashed) or multi‑stage build information that is not present in the final image. Future improvements could automate layer‑by‑layer analysis, extract files directly from containers, and infer the correct base image (e.g., scratch) to produce a fully functional Dockerfile without manual tweaks.
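
One possible starting point for the layer-by-layer analysis mentioned above is an archive produced by docker save. The sketch below assumes that the archive contains a manifest.json whose first entry has a Layers list of paths to layer tarballs inside the archive, and that each of those paths is a regular file; older or newer Docker releases may lay things out differently. The function name list_layer_files is hypothetical, and whiteout entries that mark deleted files are not handled.

```python
import io
import json
import tarfile


def list_layer_files(image_tar_path):
    """Map each layer path in a `docker save` archive to the files it adds."""
    result = {}
    with tarfile.open(image_tar_path) as image_tar:
        # manifest.json is assumed to be a JSON array with one entry per image.
        manifest = json.load(image_tar.extractfile("manifest.json"))
        for layer_path in manifest[0]["Layers"]:
            layer_bytes = image_tar.extractfile(layer_path).read()
            # Each layer is itself a tarball; compression is auto-detected.
            with tarfile.open(fileobj=io.BytesIO(layer_bytes)) as layer_tar:
                result[layer_path] = [
                    member.name
                    for member in layer_tar.getmembers()
                    if member.isfile()
                ]
    return result
```

After running docker save example1 -o example1.tar, calling list_layer_files("example1.tar") should list the files that each layer adds. Comparing those names with the hashed file identifiers in the history output may help map each COPY step back to its destination file.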
