June 15, 2020, by Faraz Khan

Trouble Finding Wally? Train a Custom Object Detection Framework to Find Him For You

As part of the Discovery Programme vision, which is to support and empower researchers by providing an advanced digital environment, this blog post aims to introduce researchers to an open source object detection framework that can be re-trained using their respective datasets.

Open source libraries and APIs have made it straightforward for researchers to train their own object detection frameworks on their own image datasets. Object detection algorithms form an integral part of many advanced computer vision solutions. The benefit that object detection provides over conventional image classification is that it not only classifies an object but also locates it and draws a bounding box around it, thereby producing a more localized result. Several object detection algorithms are popular in this area (such as Faster R-CNN, YOLO, RetinaNet and SSD), each with its own advantages and limitations. This blog post will not dive into the technical details of these algorithms; instead, it focuses on the practical aspects and possibilities of object detection networks using readily available APIs such as the TensorFlow Object Detection API.
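To make the difference between classification and detection concrete, the short sketch below models a detector's output as a list of labelled, scored boxes and keeps only the confident ones. The `Detection` class, the `filter_detections` helper, the 0.5 threshold and the sample values are all illustrative assumptions and are not taken from any particular API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One detected object: what it is, how confident the model is, and where it is."""
    label: str
    score: float  # confidence in [0, 1]
    box: Tuple[float, float, float, float]  # (ymin, xmin, ymax, xmax) as fractions of image size


def filter_detections(detections: List[Detection], min_score: float = 0.5) -> List[Detection]:
    """Keep detections at or above min_score, most confident first."""
    kept = [d for d in detections if d.score >= min_score]
    return sorted(kept, key=lambda d: d.score, reverse=True)


# A classifier would summarise this whole image with a single label.
# A detector instead reports every object it finds, each with its own box.
# These values are made up for illustration.
detections = [
    Detection("bicycle", 0.71, (0.40, 0.50, 0.95, 0.90)),
    Detection("dog", 0.92, (0.10, 0.15, 0.80, 0.55)),
    Detection("person", 0.35, (0.05, 0.60, 0.70, 0.85)),
]

confident = filter_detections(detections)
for d in confident:
    print(f"{d.label}: score={d.score:.2f}, box={d.box}")
```

The low-scoring "person" entry is dropped, which is typically how noisy detections are suppressed before boxes are drawn on an image.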

The TensorFlow Object Detection API is an open source framework that allows pretrained object detection models to be used to detect common everyday items. The models within the API have been trained on the COCO (Common Objects in Context) dataset, a collection of roughly 300,000 images covering 90 of the most common objects from everyday life.

The API makes multiple pretrained models available, each offering a different trade-off between detection speed and bounding box accuracy. These pretrained models can be re-trained with transfer learning to locate and detect a specific object within your own dataset.

Using your own custom dataset and the pretrained models made available in the API, the following steps are needed to create a custom object detection solution:

  • Collect images (at least a few hundred) that represent your object of interest. The collected dataset must show the object in a wide variety of scales, poses and lighting conditions.
  • Prepare and label the dataset in a format that is compatible with the Object Detection API. Various free and open source tools are available to aid the labeling process. One such tool is LabelImg.
  • Re-train a model using your custom dataset.
  • Test your trained object detector on new and unseen images or videos.
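As a rough illustration of the labeling step, the sketch below reads one annotation file, assuming it was saved in Pascal VOC XML (one of the formats LabelImg can write). The sample XML, the file name, the coordinates and the `parse_voc_annotation` helper are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation for illustration; the file name and coordinates are made up.
SAMPLE_XML = """
<annotation>
  <filename>example_001.jpg</filename>
  <size><width>800</width><height>600</height><depth>3</depth></size>
  <object>
    <name>wally</name>
    <bndbox><xmin>400</xmin><ymin>150</ymin><xmax>600</xmax><ymax>450</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_annotation(xml_text):
    """Return the image name, image size and labelled boxes from a Pascal VOC annotation."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    objects = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        objects.append({
            "label": obj.findtext("name"),
            "xmin": int(bndbox.findtext("xmin")),
            "ymin": int(bndbox.findtext("ymin")),
            "xmax": int(bndbox.findtext("xmax")),
            "ymax": int(bndbox.findtext("ymax")),
        })
    return {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": objects,
    }


annotation = parse_voc_annotation(SAMPLE_XML)
print(annotation["filename"], annotation["objects"])
```

In practice each image in the dataset would have its own annotation file, and the parsed records would then be converted into whatever input format the chosen training pipeline expects.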

Examples of Custom Object Detection Frameworks

Training an Object Detection Framework to Find Wally for You

Using the TensorFlow Object Detection API and the method explained above, one individual was able to train a model to find Wally within “Where’s Wally?” images. A dataset of images containing Wally was collected and labelled using the labeling tool mentioned above. The input to the network took the form of images accompanied by bounding box information that identified the location of Wally within each training image.
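Bounding box information of this kind is often stored as fractions of the image width and height rather than as raw pixels, so that the labels stay valid if the image is resized; to the best of our knowledge, the conversion scripts commonly used with the TensorFlow Object Detection API follow this convention. The sketch below shows the arithmetic only. The `normalize_box` helper and the sample numbers are hypothetical and are not taken from the Wally project.

```python
def normalize_box(xmin, ymin, xmax, ymax, image_width, image_height):
    """Convert a pixel-space box to (xmin, ymin, xmax, ymax) fractions in [0, 1]."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    if not (0 <= xmin < xmax <= image_width and 0 <= ymin < ymax <= image_height):
        raise ValueError("box must be non-empty and lie inside the image")
    return (
        xmin / image_width,
        ymin / image_height,
        xmax / image_width,
        ymax / image_height,
    )


# Made-up example: a 200 x 300 pixel box inside an 800 x 600 pixel image.
box = normalize_box(400, 150, 600, 450, image_width=800, image_height=600)
print(box)  # (0.5, 0.25, 0.75, 0.75)
```

Rejecting boxes that fall outside the image is a deliberate choice here: labeling mistakes are easier to fix before training than to diagnose afterwards.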

These labelled examples allowed the network to learn the look and shape of Wally within such images. Once the dataset was ready, one of the pretrained models was re-trained to find Wally. This simple process produced impressive results: the newly trained model was able to find the target object even in some of the noisier examples. Some examples of inputs to the network and their respective outputs can be seen in the images below:

[Images: “Finding Wally Example 1”; “Wally Found Tensorflow”]

The full project files for “Here-is-Wally”, along with the trained model, can be downloaded from the project's GitHub repository.

Using the Object Detection API to Train a Custom Raccoon Detector


Another individual used the pretrained object detection models made available in the API and retrained them to produce a dedicated raccoon detector. The trained raccoon detector is capable of detecting raccoons within images, videos and live camera feeds.

As with other object detection models, the training process relied on having a sufficiently large dataset of raccoon images along with the accompanying label information. The author of this project scraped Google Images, collected over 200 images of raccoons and manually labelled them using LabelImg. Transfer learning was then used to retrain the models on the raccoon data. The result of this process was a model whose sole task was to identify raccoons in images and videos. A demo of the output can be seen in the GIF below, and the project files themselves can be downloaded from the project's GitHub repository.

The above examples demonstrate that, by making effective use of open source tools, it is quite possible to produce a custom object detection framework that can identify and locate your object of interest within images and videos. Researchers can build upon this by labeling their own datasets and training their own custom models to automate the task of object detection, segmentation and classification.

If you are interested in exploring the possibilities of a custom object detection framework or incorporating computer vision within your research, please get in touch with a Digital Research Specialist.

Posted in Discovery Programme, Process Automation