armagansalman/filedups

filedups


Given a list of full directory paths, filedups scans them recursively and groups duplicate files.
Detection is not 100% accurate: files reported in the same group might not be exactly identical.
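
The README does not describe how duplicates are detected, so the following is only a sketch of one common approach that would show the same behavior: files are grouped by size and by a hash of their first few kilobytes. The function name `group_candidate_duplicates` and the `sample_bytes` parameter are hypothetical and are not taken from the filedups source. Because only the start of each file is hashed, two files that differ later on can land in the same group, which is the kind of false positive described above.

```python
import hashlib
import os
from collections import defaultdict


def group_candidate_duplicates(dirs, min_size=1024000, max_size=None, sample_bytes=4096):
    """Group files that share a size and a hash of their first sample_bytes bytes.

    Hypothetical helper for illustration; not the actual filedups algorithm.
    """
    groups = defaultdict(list)
    for root_dir in dirs:
        # os.walk descends into every subdirectory of root_dir.
        for folder, _subdirs, names in os.walk(root_dir):
            for name in names:
                path = os.path.join(folder, name)
                try:
                    size = os.path.getsize(path)
                    # Skip files outside the requested size range.
                    if size < min_size or (max_size is not None and size > max_size):
                        continue
                    with open(path, "rb") as handle:
                        digest = hashlib.sha256(handle.read(sample_bytes)).hexdigest()
                except OSError:
                    continue  # unreadable file: skip it
                groups[(size, digest)].append(path)
    # Only keys with more than one path are candidate duplicate groups.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Reading only a prefix keeps the scan fast on large files. A full-content comparison of each group would remove the false positives at the cost of more I/O.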

Example Scan on Windows

How to use filedups on Windows

HOW TO USE

Go to src/filedups in a terminal.
Put the full paths of the directories you want to search into the in-dirs.txt file, one path per line.
Options (M and X are sizes in bytes; the default value for M is 1024000 (1000 KB), the default value for X is None, meaning no upper limit):
--min-file-size M
--max-file-size X
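
As a rough illustration of the options above, here is a minimal sketch of how such a command line could be parsed with `argparse`. Only the flag names and defaults come from this README. The functions `build_parser` and `read_dirs` and the positional name `dirs_file` are assumptions, and the real main.py may be organized differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="main.py")
    # Text file with one full directory path per line, e.g. in-dirs.txt.
    parser.add_argument("dirs_file")
    # Sizes are in bytes; the defaults match the values stated in the README.
    parser.add_argument("--min-file-size", type=int, default=1024000, metavar="M")
    parser.add_argument("--max-file-size", type=int, default=None, metavar="X")
    return parser


def read_dirs(dirs_file):
    """Return the non-empty lines of dirs_file as a list of directory paths."""
    with open(dirs_file, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(read_dirs(args.dirs_file), args.min_file_size, args.max_file_size)
```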


Then run main.py:
For Linux:
python3 main.py in-dirs.txt


1000 KB minimum file size:
python3 main.py in-dirs.txt --min-file-size 1024000


200 KB minimum, 2000 KB maximum file size:
python3 main.py in-dirs.txt --min-file-size 204800 --max-file-size 2048000


For Windows:
py main.py in-dirs.txt


Results are written to a text file in the current working directory of the command line.
The file name starts with filedups and contains the timestamp of the scan.
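
A small sketch of how a result file name like this could be built. The README only says that the name starts with filedups and contains a timestamp, so the exact pattern (`filedups_YYYY-MM-DD_HH-MM-SS.txt`) and the function name `result_file_path` are assumptions made for illustration.

```python
from datetime import datetime
from pathlib import Path


def result_file_path(now=None):
    """Return a path in the current working directory for the scan results.

    The timestamp format is an assumption; the real tool may use another one.
    """
    now = now or datetime.now()
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    return Path.cwd() / f"filedups_{stamp}.txt"
```

To find the newest result after a scan, listing `filedups*` files in the directory where the command was run should be enough.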

Notes

Filtering 284000 files down to 40300 files and then finding duplicates takes at least 3 minutes.
Filtering 286000 files down to 140000 files and then finding duplicates takes at least 19 minutes.


Executables: https://mega.nz/folder/9MtnBS6Y#mX-uxPin8hcAnt5ENvXBOg

About

Finds local duplicate files. Duplicate file groups might contain non-duplicate files (less than 100% accuracy).
