How can we determine that two Docker images have exactly the same file system structure, and that the content of corresponding files is the same, irrespective of file timestamps?
I tried comparing image IDs, but they differ even when building from the same Dockerfile and a clean local repository. To test this, I built one image, cleaned the local repository, touched one of the files to change its modification date, and then built the second image. The two image IDs do not match. I used Docker 17.06 (the latest version, I believe).
To analyze a Docker image, run the dive command with the image ID, e.g. dive ea4c82dcd15a, where ea4c82dcd15a is the image ID. You can list the IDs of your Docker images with the "sudo docker images" command. Dive analyzes the given image and displays its contents in the terminal.
Note: Container names must be unique. That means only one container can be called web. If you want to reuse a container name, you must delete the old container (with docker container rm) before you can create a new container with the same name. As an alternative, you can pass the --rm flag to docker run so that the container is removed automatically when it exits.
Use the docker history command to show the layers of an image.
docker rmi: To remove an image, first list all images to get their IDs, names, and other details by running docker images -a or docker images. Once you know which image you want to remove, run docker rmi <your-image-id>.
After some research I came up with a solution which is fast and clean per my tests.
The overall solution is this:
docker create ...
docker export ...
And that's it.
Technically, this can be done as follows:
1) Create file md5docker, and give it execution rights, e.g., chmod +x md5docker:
#!/bin/sh
# Print a hash of an image's file system: entry names, file contents, link targets.
dir=$(dirname "$0")
docker create "$1" | { read -r cid; docker export "$cid" | "$dir/tarcat" | md5; docker rm "$cid" > /dev/null; }
2) Create file tarcat, and give it execution rights, e.g., chmod +x tarcat:
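#!/usr/bin/env python3
# coding=utf-8
# Read a tar stream on stdin and write entry names, regular-file contents and
# link targets to stdout, leaving out timestamps and other metadata.
if __name__ == '__main__':
    import sys
    import tarfile
    # "r|*" reads the archive as a forward-only stream (stdin cannot seek).
    with tarfile.open(fileobj=sys.stdin.buffer, mode="r|*") as tar:
        for tarinfo in tar:
            if tarinfo.isfile():
                # flush=True keeps the name ahead of the raw bytes written below.
                print(tarinfo.name, flush=True)
                with tar.extractfile(tarinfo) as file:
                    sys.stdout.buffer.write(file.read())
            elif tarinfo.isdir():
                print(tarinfo.name, flush=True)
            elif tarinfo.issym() or tarinfo.islnk():
                print(tarinfo.name, flush=True)
                print(tarinfo.linkname, flush=True)
            else:
                # Other entry types (devices, FIFOs, ...) are reported and skipped.
                print("\33[0;31mIGNORING:\33[0m ", tarinfo.name, file=sys.stderr)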
3) Now invoke ./md5docker <image>, where <image> is your image name or id, to compute an MD5 hash of the entire file system of your image.
To verify if two images have the same contents just check that their hashes are equal as computed in step 3).
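To see why this comparison ignores timestamps, here is a minimal Python sketch of the same idea done in a single process. It is not the script above: the function names tar_content_digest and make_tar are made up for illustration, and the in-memory archives merely stand in for the output of docker export. It assumes that hashing names, regular file contents, and link targets is what you want.

```python
import hashlib
import io
import tarfile


def tar_content_digest(stream, algorithm="sha256"):
    """Hash entry names, regular-file contents and link targets of a tar stream.

    Timestamps, owners and permissions are deliberately left out.
    """
    digest = hashlib.new(algorithm)

    def add_text(text):
        # surrogateescape keeps non-UTF-8 file names hashable.
        digest.update(text.encode("utf-8", "surrogateescape") + b"\n")

    with tarfile.open(fileobj=stream, mode="r|*") as tar:
        for entry in tar:
            if entry.isfile():
                add_text(entry.name)
                digest.update(tar.extractfile(entry).read())
            elif entry.isdir():
                add_text(entry.name)
            elif entry.issym() or entry.islnk():
                add_text(entry.name)
                add_text(entry.linkname)
    return digest.hexdigest()


def make_tar(files, mtime):
    """Build an in-memory tar archive (hypothetical stand-in for docker export)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    old = make_tar({"etc/motd": b"hello\n"}, mtime=0)
    new = make_tar({"etc/motd": b"hello\n"}, mtime=1_600_000_000)
    # Only the modification time differs, so the digests match.
    print(tar_content_digest(old) == tar_content_digest(new))
```

With a real image you would pass sys.stdin.buffer as the stream and pipe docker export into the script, as md5docker does with tarcat.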
Note that this solution treats only the directory structure, regular file contents, and links (symbolic and hard) as content. Permissions and ownership are not hashed, and other entry types such as device nodes are skipped. If you need more, extend the tarcat script with additional elif clauses that test for the content you wish to include (see Python's tarfile module and look for the TarInfo.isXXX() methods corresponding to the needed content).
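As an illustration of such an extension, the sketch below decides what to emit for each tar entry and also covers permissions, device nodes, and FIFOs. The function name describe and the choice of extra fields are assumptions made for this example; they are not part of the original tarcat script, and adding them changes the resulting hash.

```python
import tarfile


def describe(entry):
    """Return the text lines to hash for one tar entry, or None to skip it."""
    # Permission bits as four octal digits, e.g. "0755".
    perms = format(entry.mode & 0o7777, "04o")
    if entry.isfile() or entry.isdir():
        return [entry.name, perms]
    if entry.issym() or entry.islnk():
        return [entry.name, entry.linkname]
    if entry.ischr() or entry.isblk():
        # Device nodes are identified by their major:minor numbers.
        return [entry.name, perms, f"{entry.devmajor}:{entry.devminor}"]
    if entry.isfifo():
        return [entry.name, perms]
    return None


if __name__ == "__main__":
    node = tarfile.TarInfo("dev/null")
    node.type = tarfile.CHRTYPE
    node.mode = 0o666
    node.devmajor, node.devminor = 1, 3
    print(describe(node))
```

Inside tarcat, each returned line would be printed in place of the bare entry name, and the contents of regular files would still be written afterwards.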
The only limitation I see in this solution is its dependency on Python (I am using Python 3, but it should be very easy to adapt to Python 2). A better solution without any dependency, and probably faster (hey, this is already very fast), would be to write the tarcat script in a language that supports static linking, so that a single standalone executable is enough (i.e., one that requires nothing beyond the OS itself). I leave this as a future exercise in C, Rust, OCaml, Haskell, you choose.
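Note: if MD5 does not suit your needs, just replace md5 in the first script with your hash utility of choice. The md5 command is the BSD/macOS name; on most Linux systems the equivalent is md5sum, and sha256sum is a common alternative.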
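If no suitable hash utility is installed, a few lines of Python can take its place at the end of the pipeline. This is only a sketch: the function name hash_stream and the default of SHA-256 are assumptions for the example, and any algorithm name accepted by hashlib.new should work.

```python
import hashlib
import io


def hash_stream(stream, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a binary stream, read in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    # iter() with a sentinel stops once read() returns an empty bytes object.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    sample = io.BytesIO(b"etc/motd\nhello\n")
    print(hash_stream(sample))
    sample.seek(0)
    print(hash_stream(sample, "md5"))
```

To use it in md5docker you would call hash_stream(sys.stdin.buffer) and print the result, then put that script where md5 appears in the pipeline.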
Hope this helps anyone reading.
Amazes me that docker doesn't do this sort of thing out of the box. Here's a variant on @mljrg's technique:
#!/bin/sh
docker create "$1" | {
  read -r cid
  docker export "$cid" | tar Oxv 2>&1 | shasum -a 256
  docker rm "$cid" > /dev/null
}
It's shorter and doesn't need a Python dependency or a second script at all. I'm sure there are downsides, but it seems to work for me in the few tests I've done.