Avoiding Duplicate Images

Duplicate images in an image collection are a problem. Never mind the extra disk space they use; more important is the confusion they cause and the time they waste. For instance, someone searches for a photo and finds several copies of the same image. They don’t know which to use. The copies may differ in size or quality. They may have to ask colleagues which one should be used, which wastes both people’s time. Another example: someone wants to add data such as tags, author, or copyright, but ends up updating only one copy or has to bother updating all of them. To compound matters, often there’s not just a single duplicate. There might be 3 or 5 or 10 copies of an image. A clean, duplicate-free image collection is so much more valuable than one littered with confusing and time-wasting copies.

No Reason for Duplicates with DBGallery

Avoiding duplicates within DBGallery falls into three categories:

1. Tools to negate the need for them

2. Detecting when they exist

3. Finding related images

Tools to negate the need for duplicates

There are times when someone on a team will be tempted to create duplicates.  Two common reasons:

  • They're creating a promotional campaign and look around for the images they wish to include, making a copy of each file every time they build out a campaign.  This is the most common reason for creating duplicate images.

Avoiding this within DBGallery: Use "Collections".  This is one of the product's most loved and most convenient features.  A Collection holds pointers back to the original images: drag an image from any folder and drop it onto a Collection to create a pointer to the original.  In the Collection, the thumbnail shows as usual, and opening it shows data, allows downloads, etc., just as with an image in a folder.  Collections look like folders, and can have a similar structure of sub-Collections, but store only pointers (or shortcuts) rather than copies of the files.  When no longer needed, a Collection can simply be deleted without affecting the original images.  "Collections" can be renamed in the UI to anything, the most popular names being "Projects", "Campaigns", and "Light Boxes".  See Collections in our Knowledge Base for a full description of this great feature.
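To make the pointer idea concrete, here is a minimal sketch in Python.  It is not DBGallery's implementation or API; the Image and Collection classes and their fields are hypothetical, and only illustrate why a Collection of references avoids copies and why deleting it leaves the originals alone.

```python
from dataclasses import dataclass, field


@dataclass
class Image:
    """Hypothetical record for one original image stored in a folder."""
    image_id: int
    path: str
    tags: set[str] = field(default_factory=set)


@dataclass
class Collection:
    """Hypothetical collection: holds image IDs (pointers), never file copies."""
    name: str
    image_ids: list[int] = field(default_factory=list)

    def add(self, image: Image) -> None:
        # Adding the same image twice is a no-op, so no duplicate pointers.
        if image.image_id not in self.image_ids:
            self.image_ids.append(image.image_id)

    def resolve(self, library: dict[int, Image]) -> list[Image]:
        # Look up the originals; edits made to them are visible here too.
        return [library[i] for i in self.image_ids if i in library]


library = {1: Image(1, "/photos/2024/beach.jpg")}
campaign = Collection("Summer Campaign")
campaign.add(library[1])

# Tag the original once; the collection sees the same record.
library[1].tags.add("approved")
print(campaign.resolve(library)[0].tags)

# Dropping the collection removes only the pointers; the library is untouched.
del campaign
print(len(library))
```

Because the collection stores only IDs, there is exactly one record to tag or update, which is the point made above about duplicates doubling the metadata work.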

  • Storing images at various sizes so they can be easily downloaded.

Avoiding this within DBGallery: There is a dropdown in each image preview to choose from various image sizes for download.  No need to store them separately.

Tools to detect and clean up duplicates

DBGallery is able to detect duplicates as they are uploaded and can check for them across the entire collection.

Detection during Upload

The most appropriate place to check is when images are being added to the system.  Why let duplicates be added at all, right?  DBGallery has a checkbox in its upload dialog which everyone should use: Detect Duplicates.  When duplicates are detected, the upload page lights up a "Resolve Duplicates" button (Figure 1), which leads to a "Resolve Duplicates" page (Figure 2).

Figure 1: The upload dialog after detecting duplicates.


Figure 2: The resolve duplicates page shown when there are duplicates detected during upload.
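
The article does not say how DBGallery decides that two files are duplicates.  As one common approach, the sketch below compares SHA-256 hashes of file contents at upload time.  The function names and the in-memory dictionaries are assumptions for illustration only, and this technique catches byte-identical files only, not resized or re-encoded copies.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return a hex digest that is equal for byte-identical files."""
    return hashlib.sha256(data).hexdigest()


def find_upload_duplicates(
    existing_hashes: dict[str, str],
    uploads: dict[str, bytes],
) -> dict[str, str]:
    """Map each duplicate upload to the file it duplicates.

    existing_hashes maps digest -> stored file name (hypothetical index).
    uploads maps upload file name -> file contents.
    """
    seen = dict(existing_hashes)  # copy so the caller's index is not mutated
    duplicates: dict[str, str] = {}
    for name, data in uploads.items():
        digest = content_hash(data)
        if digest in seen:
            # Already stored, or already seen earlier in this same batch.
            duplicates[name] = seen[digest]
        else:
            seen[digest] = name
    return duplicates


stored = {content_hash(b"beach-bytes"): "photos/beach.jpg"}
batch = {"IMG_001.jpg": b"beach-bytes", "IMG_002.jpg": b"dunes-bytes"}
print(find_upload_duplicates(stored, batch))
# {'IMG_001.jpg': 'photos/beach.jpg'}
```

A non-empty result is the kind of condition under which an upload page could offer a resolve step before anything is stored.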


Global Duplicates Detection

Unfortunately, the upload check by itself isn't sufficient.  Duplicates can sneak through, or may already exist in the initial set of images added to DBGallery during system setup.

For these scenarios there is a Global Duplicates Check.  It is found in the Tools menu of DBGallery's main page.  It looks and operates very much the same as the upload check, with some additional options, because cleaning up a large image collection after initially populating it can take considerable effort.  One difference: there is an option to ignore a group of duplicates when, in rare cases, they're valid or need to be kept around while it's decided what the original creator wants to do with them.

Figure 3: The global duplicates check page.

Finding Related Images

When detecting exact duplicates isn't sufficient, finding near-duplicates through related images can be effective.  These are images that aren't exactly the same but are so close that not all of them are needed, such as those from a camera's burst mode.  Introduced in the winter of 2024, Find Related Images searches for photos that are visually similar, photos taken within 5 minutes before or after the selected image, as well as those with similar data.  The screenshot below shows how this looks.  See Related Images in our Knowledge Base for a more detailed look at this capability.

Figure 4: The Related Images sidebar.
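
Of the three criteria above, the capture-time one is the simplest to illustrate.  The sketch below assumes "5 minutes before or after" means a window of up to 5 minutes either side of the selected image; the taken_near function and its inputs are hypothetical, and visual similarity and data similarity are not covered here.

```python
from datetime import datetime, timedelta


def taken_near(
    selected: datetime,
    candidates: dict[str, datetime],
    window: timedelta = timedelta(minutes=5),
) -> list[str]:
    """Return names of images captured within `window` of the selected image."""
    return sorted(
        name
        for name, taken in candidates.items()
        if abs(taken - selected) <= window
    )


selected_time = datetime(2024, 1, 15, 10, 0, 0)
others = {
    "burst_2.jpg": datetime(2024, 1, 15, 10, 0, 2),   # 2 seconds later
    "earlier.jpg": datetime(2024, 1, 15, 9, 56, 30),  # 3.5 minutes before
    "evening.jpg": datetime(2024, 1, 15, 18, 30, 0),  # hours later
}
print(taken_near(selected_time, others))
# ['burst_2.jpg', 'earlier.jpg']
```

Burst-mode shots land well inside such a window, which is why a time-based criterion pairs naturally with visual similarity for finding near-duplicates.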


In summary, I'll repeat what was said above: a clean, duplicate-free image collection is so much more valuable than one littered with confusing and time-wasting copies.  Duplicate images are evil!  The good news is that they need not be a problem when you have ways to detect them and tools that negate the need for them in the first place.  DBGallery provides both.