The first step in object detection

In computer vision and image processing, object detection is a process of detecting instances of semantic objects of a certain class/group such as humans, cats, dogs, buildings, cars and so on in digital images and videos.

To build an Object detection model, you need a dataset of tagged/labelled images. Your model learns by examples, It looks into each image in your dataset and then looks up the areas (bounding boxes or polygons) you tagged.

For example here, I wanted to build a model to detect my dog @cookietheStaffy in the photos. I tagged the areas that my dog is in the photo and labelled it.



There are many tools that you can use for tagging images like

https://github.com/Microsoft/VoTT
http://www.robots.ox.ac.uk/~vgg
https://www.labelbox.com/


I used VGG to tag my images. You can export the JSON result of your tagged images easily.





In the real-world scenarios, you might have thousands of images that you want to label. Obviously,  it's not a one-person job.
My advice is before tagging you need to make sure all the images have a unique name. Then, split your images into small batches and tag them and then export the JSON object for each batch.

With below script, you can merge all of the tagged images and the exported JSON results into one file.

https://github.com/Azadehkhojandi/VGG-Image-Annotator-Json-Merger/blob/master/jsonmerge.ipynb



0