Mastering AI Vision: Essential Best Practices for Custom Object Recognition Training

In the rapidly evolving field of artificial intelligence, custom object recognition models have become invaluable tools across industries. They are used in a wide range of applications, including retail, healthcare, and manufacturing. However, the success of these models hinges on the quality and preparation of their training data.

This post outlines key best practices for dataset curation and model training to help you achieve optimal performance in object identification projects, where detecting, tracking, and classifying objects are critical. By following these guidelines you can get started on a precisely trained AI object detection model with real-world applicability. If you already have a model whose accuracy isn't as high as needed, these practices can significantly improve its accuracy and efficiency.

Dataset Preparation

Image Selection
  • Choose clear, easily recognizable examples. Remove ambiguous images where even humans might struggle to identify the object.
  • Select images where the primary object:
    • Is the main focus of the photo.
    • Occupies at least 40% of the image area, ideally 70% or more. Cropping can help achieve this; use bounding boxes when the surrounding context is needed (see the sketch after this list).
  • Avoid images containing multiple objects unless context is crucial, in which case bounding boxes should be used.
  • Include various lighting conditions and angles to improve model robustness.
  • Avoid overlapping samples: If "Kitchen Sinks" and "Kitchen Faucets" are objects to be recognized, don't have photos in the "Kitchen Faucets" samples that also include the whole sink.
  • When selecting images, involve the people who will use the model's output in the selection, or at a minimum have them review the chosen images.
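
As a rough illustration of the 40%-70% guideline above, here is a minimal Python sketch for flagging images where the primary object covers too little of the frame. It assumes you have the object's bounding-box size from your annotation tool; the function names and thresholds are our own, not part of any particular training service.

    # Minimal sketch: flag images whose primary object covers too little of the frame.
    # Assumes pixel dimensions from your annotation tool; the threshold follows the guideline above.

    def object_coverage(img_w, img_h, box_w, box_h):
        """Fraction of the image area occupied by the object's bounding box."""
        return (box_w * box_h) / float(img_w * img_h)

    def needs_attention(img_w, img_h, box_w, box_h, minimum=0.40):
        """True if the image should be cropped or annotated rather than used as-is."""
        return object_coverage(img_w, img_h, box_w, box_h) < minimum

    # Example: a 4000x3000 photo where the faucet occupies only 600x500 pixels.
    print(object_coverage(4000, 3000, 600, 500))   # 0.025 -- far below the 40% guideline
    print(needs_attention(4000, 3000, 600, 500))   # True: crop it, or draw a bounding box
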
Category Guidelines
  • In general, aim for 250-500 high-quality images per label or category. This number may need to increase for complex objects where subtle differences require more samples.
  • Maintain a balanced number of images across categories to prevent bias in the model. Try not to have 100 images for one category and 2,000 for another. This may be acceptable in some circumstances, such as 100 for a well-defined object but 2,000 for a more complex category (a quick balance check is sketched after this list).
  • In cases where multiple objects often appear in the same images, group related categories together when cleaning or optimizing datasets. For example, it may be difficult to separate washers and dryers in a laundry room. Rather than having a separate Washer category and a Dryer category, consider a name such as Laundry Room Equipment instead.
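
One quick way to sanity-check category balance before training is to count the images per category. The Python sketch below assumes a folder-per-category layout on disk (our assumption, not a requirement of any particular tool) and flags labels that fall outside the 250-500 guideline.

    # Minimal sketch: count images per category folder and flag imbalanced labels.
    # Assumes a layout like dataset/<category>/<image files>; adjust to your own setup.
    from pathlib import Path

    IMAGE_TYPES = {".jpg", ".jpeg", ".png"}

    def category_counts(dataset_dir):
        counts = {}
        for folder in Path(dataset_dir).iterdir():
            if folder.is_dir():
                counts[folder.name] = sum(
                    1 for f in folder.iterdir() if f.suffix.lower() in IMAGE_TYPES)
        return counts

    for name, n in sorted(category_counts("dataset").items()):
        note = "OK" if 250 <= n <= 500 else "review: outside the 250-500 guideline"
        print(f"{name}: {n} images ({note})")
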
Contextual Considerations
  • When context is critical (e.g., distinguishing between living room and bathroom ceiling lighting), include some contextual elements in the images. For example, for Bathroom Lighting, have other aspects of a bathroom, such as a mirror or shower, on the periphery of the lights.
  • Choose the appropriate level of granularity for object categories. Combine similar categories that are difficult to distinguish (e.g., different types of flooring).
Bounding Boxes

Drawing bounding boxes around objects in an image is time-consuming, so it's best to avoid them when possible. Choosing clean and concise examples, as described in the Image Selection section above, can go a long way toward avoiding them. But when context is required, bounding boxes may be necessary.

Use bounding boxes when:

  • Context is required.
  • Multiple objects are present in an image.
  • The primary object doesn't fill a significant portion of the image.
  • Precise object localization is required.

After deciding to use them:

  • Ensure bounding boxes are applied consistently in both training and testing datasets.
  • For objects that don't fit well in rectangles (e.g., fences), consider using more advanced annotation techniques like polygonal segmentation.
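
To help with the consistency point above, the following Python sketch (our own helper names, not tied to any specific annotation tool) checks that each bounding box lies entirely within its image and that the training and test sets cover the same set of labels.

    # Minimal sketch: sanity-check bounding-box annotations before training.
    # Each annotation is assumed to be a dict with pixel coordinates from your annotation tool.

    def box_inside_image(a):
        """True if the box fits entirely within the image dimensions."""
        return (a["left"] >= 0 and a["top"] >= 0
                and a["left"] + a["width"] <= a["img_w"]
                and a["top"] + a["height"] <= a["img_h"])

    def labels_consistent(train, test):
        """The same labels should be annotated in both the training and test sets."""
        return {a["label"] for a in train} == {a["label"] for a in test}

    train = [{"image": "img001.jpg", "label": "Washer", "img_w": 4000, "img_h": 3000,
              "left": 120, "top": 80, "width": 900, "height": 1200}]
    test = [{"image": "img101.jpg", "label": "Washer", "img_w": 4000, "img_h": 3000,
             "left": 300, "top": 200, "width": 850, "height": 1100}]

    bad = [a["image"] for a in train + test if not box_inside_image(a)]
    print("Boxes outside image bounds:", bad or "none")
    print("Train/test labels consistent:", labels_consistent(train, test))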

Model Training and Optimization

  • Use 15 to 20 percent of the sample images as test images. During training, you can either choose the test set yourself or let the AI automatically hold out a percentage of the samples to test its model. This test set is also used to report on accuracy, such as providing an F1 score (a minimal split-and-scoring sketch follows this list).
  • Continuously add excellent examples to the training dataset as they are found.
  • Regularly evaluate the model's performance on a diverse test set to ensure generalization.
  • The AI's accuracy report is a good start in measuring the model's performance, but running a set of real-world images through the model is the only way to truly determine accuracy.
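
As a rough illustration of the 15-20% hold-out and the F1 score mentioned above, the sketch below splits a list of image files at random and computes F1 from precision and recall. In practice the training service can do the split and report the score for you; the file names here are placeholders.

    # Minimal sketch: hold out ~20% of images for testing and compute an F1 score by hand.
    import random

    images = [f"img{i:03d}.jpg" for i in range(500)]   # stand-in for your image list
    random.shuffle(images)
    split = int(len(images) * 0.8)
    train_set, test_set = images[:split], images[split:]
    print(len(train_set), "training images,", len(test_set), "test images")

    def f1_score(true_pos, false_pos, false_neg):
        """F1 is the harmonic mean of precision and recall."""
        precision = true_pos / (true_pos + false_pos)
        recall = true_pos / (true_pos + false_neg)
        return 2 * precision * recall / (precision + recall)

    # Example: 90 correct detections, 10 false alarms, 20 missed objects.
    print(round(f1_score(90, 10, 20), 3))   # 0.857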

Additional Best Practices

  • Data Augmentation: Apply techniques like rotation, flipping, and color jittering to artificially expand your dataset and improve model robustness (a small Pillow sketch follows this list).
  • Transfer Learning: Utilize pre-trained models on large datasets as a starting point to improve performance, especially with limited data.
  • Regular Model Updates: Periodically retrain your model with new, diverse data to maintain and improve its accuracy over time.
  • Quality Control: Implement a systematic review process for your dataset, ensuring all images and annotations meet your quality standards.
  • Performance Metrics: Use appropriate metrics (e.g., F1 score) to evaluate your model's performance. These are provided after training by the training tool being used, such as AWS's Rekognition Custom Labels or Microsoft's Azure Custom Vision.
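
For the data augmentation point above, here is a minimal sketch using the Pillow library; the rotation angle, color factor, and file paths are arbitrary examples, and many training services can also apply augmentation for you.

    # Minimal sketch: create rotated, flipped, and color-jittered variants with Pillow.
    from pathlib import Path
    from PIL import Image, ImageEnhance, ImageOps

    def augment(path, out_dir="augmented"):
        Path(out_dir).mkdir(exist_ok=True)
        img = Image.open(path)
        stem = Path(path).stem
        img.rotate(15, expand=True).save(f"{out_dir}/{stem}_rot15.jpg")           # small rotation
        ImageOps.mirror(img).save(f"{out_dir}/{stem}_flip.jpg")                   # horizontal flip
        ImageEnhance.Color(img).enhance(1.3).save(f"{out_dir}/{stem}_color.jpg")  # mild color jitter

    augment("dataset/KitchenFaucets/example1.jpg")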

 

Getting Started

Your model will only be as accurate as the images provided to the AI. If you're just getting started, here is what's most important:

  1. Select the right images!
  2. Have the right number of images; 250-500 per category is sufficient in most cases.
  3. Getting slightly more advanced, consider bounding boxes where context is important.
  4. Be prepared to iteratively train until the desired accuracy rate is achieved.

An easy path to trying out custom-trained object detection is to go to Amazon's AWS Custom Labels or Microsoft's Azure Custom Vision and start a project.

Here are three easy-to-follow Amazon AWS custom-label recognition videos, which in our experience make it very clear how to get started training a model:
- Create the project: https://youtu.be/Mse5Jgh9n3M
- Training a model: https://youtu.be/662YN3jeJWU
- Evaluating a model: https://youtu.be/b6_-h84qxU8

Once training is completed, use the model by copying its ARN to DBGallery's Advanced tab in Preferences. That's all that's needed to use it in DBGallery. Then, as images are uploaded, they will be auto-tagged with results from your very own model! If you're not already using DBGallery, request a trial with custom-model capabilities by using our Contact Form or emailing DBGallery Support.
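
If you'd rather call the trained model directly from your own code instead of (or in addition to) DBGallery, a minimal Python sketch using boto3 against Rekognition Custom Labels looks roughly like this. The ARN, bucket, and image key are placeholders, and the model version must already be started before it can serve requests.

    # Minimal sketch: query a running Rekognition Custom Labels model with boto3.
    # The ARN, bucket, and image key below are placeholders for your own values.
    import boto3

    rekognition = boto3.client("rekognition", region_name="us-east-1")

    response = rekognition.detect_custom_labels(
        ProjectVersionArn="arn:aws:rekognition:us-east-1:111122223333:project/my-project/version/my-model/1234567890123",
        Image={"S3Object": {"Bucket": "my-photo-bucket", "Name": "uploads/kitchen.jpg"}},
        MinConfidence=70,
    )

    for label in response["CustomLabels"]:
        print(f'{label["Name"]}: {label["Confidence"]:.1f}%')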

Conclusion

By following these best practices, you can significantly improve the quality and effectiveness of your custom AI object recognition model. Remember that the process is iterative, and continuous refinement based on performance feedback is key to achieving optimal results.

Navigating the complexities of data preparation, model selection, and training optimization often requires specialized expertise. If you're looking to maximize the potential of your AI projects, consider partnering with experienced consultants who can guide you through every step of the process. Our team offers comprehensive model training consultation services, helping you turn your object recognition challenges into powerful, production-ready solutions. Contact us today to learn how we can elevate your AI initiatives and drive tangible results for your business. For a real-world example where a customer has successfully utilized a custom-trained model, see our Lane Consulting Services Case Study or the LCS Revolutionizes Multifamily Housing Sector press release, which covers the effectiveness of their use of the technology.
