I need some help with this. I have thousands of photo files on my computer, and in the cloud, and hundreds of them are duplicated in my folders. I have one folder for photos and videos that contains subfolders for every year, special events, work and family presentation/video projects, and other categories, etc. Over the years they’ve become completely disorganized.
Does anyone know of an easy app to help me efficiently find all the duplicate files?
Thanks!
If you are on Linux or macOS 10+, you may be able to do something like this (with the top-level directory of your photos in place of $directory; on macOS the hashing command may be shasum rather than sha1sum):
find "$directory" -type f -exec sha1sum {} \; | sort > output1
awk '{ print $1 }' output1 | uniq -d > output2
Now look in output2. You will see a list of hashes, one for each hash that appears more than once. Look in output1 to find the matching hashes and their corresponding files. It is very likely that two files with the same hash are the same, but you want to verify that by looking at them (in rare cases, two different files could give the same hash).
To be clear about the code in reply #1: the second line has to read output1, the file the first line writes, not output-file-1.
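If matching up output2 against output1 by hand sounds tedious, here is a rough sketch that prints the duplicates already grouped, plus a byte-for-byte check for the rare same-hash-different-file case. The function names list_dupes and same_bytes are made up for this example, and it assumes sha1sum and cmp are installed (on macOS you may need shasum in place of sha1sum).

```shell
#!/bin/sh
# list_dupes DIR: print every file under DIR whose SHA-1 hash is shared
# with at least one other file. Groups are separated by a blank line.
# (Hypothetical helper name; assumes sha1sum is available.)
list_dupes() {
    find "$1" -type f -exec sha1sum {} + |
        sort |
        awk '
            { hash = $1 }
            # Same hash as the previous line: print both lines of the pair once.
            hash == prev { if (!shown) print prev_line; print; shown = 1 }
            # Hash changed: close the current group, if one was printed.
            hash != prev { if (shown) print ""; shown = 0 }
            { prev = hash; prev_line = $0 }
        '
}

# same_bytes A B: succeeds only if the two files are byte-for-byte identical.
# Use it to confirm a pair before deleting one of them.
same_bytes() {
    cmp -s "$1" "$2"
}
```

You would run it as something like list_dupes "$directory" > dupes.txt and then read dupes.txt. It only lists files; nothing is deleted.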
One way of consolidation is to save all files from year X event Y folders located on different drives into one folder and say “yes to all” at the prompt to overwrite if a file with the same name already exists. Then delete those other folders. It only works if the files have the same name. Otherwise, you would need some sort of image recognition software.
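On Linux or macOS, a more cautious version of that consolidation could look like the sketch below: it copies a file only when the name is not already taken, skips it when the existing file is byte-for-byte identical, and reports a conflict instead of overwriting when the contents differ. The name merge_dir is invented for this example, and it only handles the files directly inside the source folder, not subfolders.

```shell
#!/bin/sh
# merge_dir SRC DEST: copy files from SRC into DEST without ever
# overwriting a different file that happens to share the same name.
# (Hypothetical helper; handles only the top level of SRC.)
merge_dir() {
    src=$1
    dest=$2
    mkdir -p "$dest"
    for f in "$src"/*; do
        [ -f "$f" ] || continue          # skip subfolders and the empty-folder case
        name=$(basename "$f")
        if [ ! -e "$dest/$name" ]; then
            cp -p "$f" "$dest/$name"     # name is free: copy, keeping the timestamp
        elif cmp -s "$f" "$dest/$name"; then
            echo "duplicate, skipped: $f"
        else
            echo "CONFLICT, not copied: $f"
        fi
    done
}
```

Anything reported as a conflict is a same-name, different-content file that you would rename by hand before copying. The source folder is left untouched, so you can check the result before deleting anything.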
OK, thanks for the help, @ucbalumnus and @BunsenBurner.
I don’t know if I could figure out how to install Linux and do the stuff in #1. I don’t have a Mac. I’ll do research into that and see if I think I’m capable of it.
And I kind of don’t want to put all my thousands of photos/videos/images in one big folder - I’d never be able to find anything if I did that. Also I know that some files that aren’t identical have the same name/number - I guess our camera isn’t giving unique names to every file because I found a few instances of that last night when I was moving some photos around. So I don’t want to do a “yes to all” overwrite as I’d be erasing some files.
I think I’m going to have to find some kind of image recognition software/app if I can’t figure out the Linux thing.
I think most of the problem is from my H not deleting files off his camera card after he downloads them on my computer, so the same photos are getting downloaded two or three times into separate folders. Then I’ve moved files around (copied not cut) over the years while doing projects.
I just checked my folder and I have over 55,000 files in it (17 years’ worth)! That’s after I’ve already spent several hours deleting junk. Holy cow, no wonder I’m overwhelmed. Maybe I’ll spend some more time purging junk and things will get easier.
LeastComplicated, every memory stick I get starts re-numbering from 1, so I have some duplicate file names across my directories, though not in the same directories. I’m in a similar boat, only I’m at 93,000, and I know there are others in other non-photo directories. I also have directories by subject matter, as well as chrono and vacation-centric.
Am following to hear if there is a good way to ID the dups and cull where needed!
I’ve been loading files up to Google Drive, but it annoys the daylights out of me that the album size is limited. What’s one directory on my computer turns into 7 or 8 on Drive. Not helpful for me!!!
In that case, it is probably not a good option if you have a purely Microsoft Windows computer, unless you have access to someone with a good knowledge of installing Linux or of using it to access files on a Microsoft Windows computer / filesystem. Or you could copy all of it to a macOS 10+ or Linux computer first.
It is probably not a good idea for a novice user to try to install Linux on an existing Microsoft Windows computer that has valuable data on it. (Is that data backed up somewhere, in case your disk fails?)
Oh my, @CountingDown, you’ve got almost twice the number of files I do! Hope we find someone to help us out!
@ucbalumnus, yes, I’ve got them all backed up on the cloud and most of them on an external hard drive also (need to update the backup while I’m thinking about it). My computer crashed about a year ago and the hard drive couldn’t be repaired, so I finally got a subscription to a cloud backup service. Luckily I’d done a fairly recent backup on the external hard drive, so I only lost a few files.
Actually that’s one reason I want to get things organized and cleaned up, because even though my computer was under warranty, the company wouldn’t replace it - they just replaced the hard drive, but it’s still acting wonky and now the warranty has expired, so I’m going to purchase a new one (different brand) and I don’t want to carry over my mess of a directory onto my new computer.
But yes, I agree, I don’t think I trust myself doing the Linux stuff even though I’ve backed up my files.