Install with Scoop (Windows)
scoop bucket add scoop-miketvo https://github.com/miketvo/scoop-miketvo
scoop install scoop-miketvo/imdupes
Install with Homebrew (Linux and macOS)
brew tap miketvo/apps
brew install imdupes
See Releases for the latest versions.
Download and extract the archive containing the imdupes executable for your operating system. Pre-built binaries are currently available for these operating systems:
- Windows 7 or above (x64)
- Any Linux distribution with kernel version 5.15 or above (x64)
- macOS Monterey 12 or above (x64)
usage: imdupes {info,scan,clean} ...
Quickly detects and removes identical images. Has 3 modes:
- 'info' collects and displays statistics and information of images in a directory
- 'scan' scans and console prints detected identical image paths/filenames
- 'clean' scans and removes detected identical images (keeping only the first copy by default)
See "imdupes {info,scan,clean} --help" for more information
options:
-h, --help show this help message and exit
-v, --version show version information and exit
run modes:
{info,scan,clean}
Note: This program ignores any non-image file in the target directory
Algorithm: Average Hash (https://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html)
See below for a detailed description of each mode and its arguments:
Info Mode:
usage: imdupes info [options] directory
collect and display statistics and information of images in a directory
positional arguments:
directory target image directory
options:
-h, --help show this help message and exit
-e REGEX, --exclude REGEX
exclude matched filenames based on REGEX pattern
-r, --recursive recursively search for images in subdirectories in addition to the specified parent directory
-V {1,2}, --verbose {1,2}
explain what is being done
-p {0,1,2}, --progress-bar {0,1,2}
specify verbose mode (-V/--verbose) progress bar detail level, 0 disables the progress bar
entirely (default: 2)
-f {absolute,cwd-relative,target-dir-relative,filename}, --format {absolute,cwd-relative,target-dir-relative,filename}
console output file path format, (default: target-dir-relative)
Note: This program ignores any non-image file in the target directory
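The four -f/--format choices listed above can be pictured with a small Python sketch. This is not taken from the imdupes source: `format_path` is a hypothetical helper, the example paths are made up, and the rendering rules are only inferred from the option names.

```python
import posixpath


def format_path(path, fmt, target_dir, cwd):
    """Render an absolute POSIX path in one of the four documented styles.

    Hypothetical helper: the rules below are inferred from the option
    names, not copied from imdupes itself.
    """
    if fmt == "absolute":
        return path
    if fmt == "cwd-relative":
        return posixpath.relpath(path, start=cwd)
    if fmt == "target-dir-relative":
        return posixpath.relpath(path, start=target_dir)
    if fmt == "filename":
        return posixpath.basename(path)
    raise ValueError(f"unknown format: {fmt}")


# Made-up example: the target directory is /home/user/pics and the
# command is run from /home/user.
image = "/home/user/pics/holiday/beach.png"
for fmt in ("absolute", "cwd-relative", "target-dir-relative", "filename"):
    print(fmt, "->", format_path(image, fmt, "/home/user/pics", "/home/user"))
```

Under these assumptions, target-dir-relative (the default) gives the shortest output that still identifies a file uniquely within the scanned directory.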
Scan Mode:
usage: imdupes scan [options] directory [-o OUTPUT]
scan and console print detected identical image paths/filenames
positional arguments:
directory target image directory
options:
-h, --help show this help message and exit
-m {color-hist-hashing,grayscale-hashing,rgb-hashing,rgba-hashing}, --hashing-method {color-hist-hashing,grayscale-hashing,rgb-hashing,rgba-hashing}
specify a hashing method (default: color-hist-hashing)
-a {max-dim,max-dims-mean,avg-dim,avg-dims-mean}, --auto-hash-size {max-dim,max-dims-mean,avg-dim,avg-dims-mean}
automatic hash size calculation (default: max-dims-mean)
-s HASH_SIZE, --hash-size HASH_SIZE
specify a preferred hash size (integer)*
-e REGEX, --exclude REGEX
exclude matched filenames based on REGEX pattern
-r, --recursive recursively search for images in subdirectories in addition to the specified parent directory
-V {1,2}, --verbose {1,2}
explain what is being done
-p {0,1,2}, --progress-bar {0,1,2}
specify verbose mode (-V/--verbose) progress bar detail level, 0 disables the progress bar
entirely (default: 2)
-H, --show-hash show hash value of each duplication in output
-f {absolute,cwd-relative,target-dir-relative,filename}, --format {absolute,cwd-relative,target-dir-relative,filename}
console output file path format, (default: target-dir-relative)
-S, --silent no console output, -o/--output must be specified
-o DUPFILE, --output DUPFILE
save the output to the specified DUPFILE (JSON formatted .imdup) file (overwriting if file
already exist)
Note: This program ignores any non-image file in the target directory
*: Smaller hash sizes are better for detecting visually similar images, while larger hash sizes are better for
identifying identical images; The smaller the hash size, the better the performance; Smallest accepted hash size
is 8
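The hash-size footnote above can be illustrated with a pure-Python sketch of the average hash idea. It is a simplified illustration, not the imdupes implementation: the helpers `downscale`, `average_hash` and `hamming_distance` are hypothetical, images are plain grids of grayscale values, and the hash sizes (2 and 4) are kept tiny for readability even though imdupes accepts 8 as the smallest size.

```python
def downscale(pixels, size):
    """Shrink a square grayscale grid to size x size by averaging blocks.

    Assumes the side length of the grid is a multiple of `size`.
    """
    block = len(pixels) // size
    return [
        [
            sum(
                pixels[r * block + dr][c * block + dc]
                for dr in range(block)
                for dc in range(block)
            ) / (block * block)
            for c in range(size)
        ]
        for r in range(size)
    ]


def average_hash(pixels):
    """Return a bit string: 1 where a pixel is above the mean, else 0."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions where two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


original = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
# Same image with a single darkened pixel in the top-left corner.
edited = [row[:] for row in original]
edited[0][0] = 50

for size in (2, 4):
    distance = hamming_distance(
        average_hash(downscale(original, size)),
        average_hash(downscale(edited, size)),
    )
    print(f"hash size {size}: distance {distance}")
```

In this toy example the two grids hash identically at size 2 but differ by one bit at size 4, which matches the note: a small hash absorbs minor differences, while a larger one separates images that are not pixel-for-pixel identical.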
Clean Mode:
usage: imdupes clean [options] input
scan and remove detected identical images (keeping only the first copy by default); deleted files are not
recoverable, proceed with caution
positional arguments:
input a directory containing the target images to be processed and clean; or a valid JSON formatted
.imdup file containing duplicated image paths (can be generated using scan mode with
-o/--output
flag), in which case only the following flags are available:
-h/--help
-e/--exclude
-V/--verbose
-i/--interactive
see options below for more information
options:
-h, --help show this help message and exit
-m {color-hist-hashing,grayscale-hashing,rgb-hashing,rgba-hashing}, --hashing-method {color-hist-hashing,grayscale-hashing,rgb-hashing,rgba-hashing}
specify a hashing method (default: color-hist-hashing)
-a {max-dim,max-dims-mean,avg-dim,avg-dims-mean}, --auto-hash-size {max-dim,max-dims-mean,avg-dim,avg-dims-mean}
automatic hash size calculation (default: max-dims-mean)
-s HASH_SIZE, --hash-size HASH_SIZE
specify a preferred hash size (integer)*
-e REGEX, --exclude REGEX
exclude matched filenames based on REGEX pattern
-r, --recursive recursively search for images in subdirectories in addition to the specified parent directory
-V {1,2}, --verbose {1,2}
explain what is being done
-p {0,1,2}, --progress-bar {0,1,2}
specify verbose mode (-V/--verbose) progress bar detail level, 0 disables the progress bar
entirely (default: 2)
-i, --interactive prompt for every duplication and let the user choose which file to delete
-f {absolute,cwd-relative,target-dir-relative,filename}, --format {absolute,cwd-relative,target-dir-relative,filename}
console output file path format, ignored if -V/--verbose and -i/--interactive are both not
enabled (default: target-dir-relative)
Note: This program ignores any non-image file in the target directory
*: Smaller hash sizes are better for detecting visually similar images, while larger hash sizes are better for
identifying identical images; The smaller the hash size, the better the performance; Smallest accepted hash size
is 8
- Extension: .imdup
- Format: JSON
- Indent: 2 spaces
These files are JSON-formatted text files generated by running:
imdupes scan ... --output DUPFILE
They have the following format:
[
[
"abspath/to/image/file",
"abspath/to/image/file",
"abspath/to/image/file",
...
],
[
"abspath/to/image/file",
"abspath/to/image/file",
"abspath/to/image/file",
...
],
[
"abspath/to/image/file",
"abspath/to/image/file",
"abspath/to/image/file",
...
],
...
]
They can then be edited by the user for finer control over which files are deleted, then loaded back into clean mode for automated or interactive cleaning:
imdupes clean path/to/dupfile.imdup # Automated cleaning
imdupes clean -i path/to/dupfile.imdup # Interactive cleaning
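Editing a .imdup file by hand works for a few entries; for larger files a short script may be easier. The sketch below relies only on the documented layout (a JSON list of lists of paths, indented with 2 spaces). The helpers `drop_paths` and `edit_dupfile` and the example paths are hypothetical, and it assumes that a path removed from the file is left alone by clean mode.

```python
import json


def drop_paths(groups, unwanted):
    """Remove the given paths from every duplicate group.

    Groups left with fewer than two paths no longer describe a
    duplication, so they are dropped as well.
    """
    cleaned = []
    for group in groups:
        kept = [path for path in group if path not in unwanted]
        if len(kept) >= 2:
            cleaned.append(kept)
    return cleaned


def edit_dupfile(dupfile, unwanted):
    """Rewrite a .imdup file in place, keeping the 2-space JSON indent."""
    with open(dupfile, encoding="utf-8") as f:
        groups = json.load(f)
    with open(dupfile, "w", encoding="utf-8") as f:
        json.dump(drop_paths(groups, unwanted), f, indent=2)


# In-memory example with made-up paths.
groups = [
    ["/pics/a.png", "/pics/a_copy.png", "/pics/a_old.png"],
    ["/pics/b.jpg", "/pics/b_copy.jpg"],
]
print(json.dumps(drop_paths(groups, {"/pics/b_copy.jpg"}), indent=2))
```

After a call such as `edit_dupfile("path/to/dupfile.imdup", {...})`, the file can be passed to `imdupes clean` as shown above.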
When a .imdup file is loaded into clean mode, each group of duplicates is sorted from the largest image dimensions to the smallest, so that automatic cleaning keeps only the largest image (the first copy) and preserves as much information as possible.
The -e/--exclude REGEX flag can also be specified when cleaning this way to filter the files further.
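The ordering and exclusion rules described above can be sketched in Python for a single group of duplicates. This is an illustration under stated assumptions rather than the actual imdupes logic: `plan_cleaning` is a hypothetical helper, "largest dimension" is taken to mean pixel area, and the exclude pattern is applied to the filename with `re.search`.

```python
import posixpath
import re


def plan_cleaning(group, exclude=None):
    """Split one duplicate group into (kept_path, paths_to_delete).

    `group` is a list of (path, width, height) tuples. Assumptions:
    size is compared by pixel area, and excluded files are neither
    kept as the first copy nor deleted.
    """
    if exclude is not None:
        pattern = re.compile(exclude)
        group = [
            item for item in group
            if not pattern.search(posixpath.basename(item[0]))
        ]
    # Largest image first, so the first copy carries the most information.
    ordered = sorted(group, key=lambda item: item[1] * item[2], reverse=True)
    if not ordered:
        return None, []
    return ordered[0][0], [path for path, _, _ in ordered[1:]]


# Made-up example group.
group = [
    ("/pics/cat_small.png", 640, 480),
    ("/pics/cat.png", 1920, 1080),
    ("/pics/cat_keep.png", 1280, 720),
]
print(plan_cleaning(group))
print(plan_cleaning(group, exclude=r"_keep"))
```

Without a pattern, the two smaller copies are marked for deletion; with `exclude=r"_keep"`, the file cat_keep.png is left out of the plan entirely.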
File type | Extension | Note |
---|---|---|
Blizzard Mipmap Format | .blp | |
Bitmap | .bmp, .dib | |
DirectDraw Surface | .dds | |
Encapsulated PostScript | .eps | Requires Ghostscript to be installed |
Graphics Interchange Format | .gif | |
Icon | .ico, .icns | |
Cursor | .cur | |
LabEye Image Bitmap | .im | |
Joint Photographic Experts Group | .jpg, .jpeg, .jpe, .jfif, .jif | |
JPEG 2000 | .jp2, .j2k, .jpf, .jpm, .jpg2, .j2c, .jpc, .jpx, .mj2 | |
Picture Exchange | .pcx | |
Portable Network Graphics | .png | |
Portable Bitmap | .pbm, .pgm, .ppm, .pnm | |
Silicon Graphics Image | .sgi | |
SPIDER image | .spi | |
Truevision TGA | .tga | |
Tag Image File Format | .tif, .tiff | |
WebP | .webp | |
Flexible Image Transport System | .fits, .fit, .fts | |
Pixar Image File Format | .pxr | |
Adobe Photoshop Document | .psd | |
Sun Raster | .ras, .sun | |
X Bitmap | .xbm | |
X Pixmap | .xpm | |