Image annotation plays a significant role in computer vision, the technology that allows computers
to gain high-level understanding from digital images or videos and to see and interpret visual information
much as humans do.
To create a novel labelled dataset for use in computer vision projects,
data scientists and ML engineers can choose from a variety of annotation
types to apply to images. Researchers will use an image markup tool to help
with the actual labelling.
Let’s compare and summarise the three common annotation types within computer vision:
In image classification, the goal is simply to identify which objects and other properties exist
in an image, without localising them within the image.
In object detection, we go one step further to find the position (typically given by bounding boxes)
of individual objects within the image.
In image segmentation, the goal is to recognise and understand what is in the image at the pixel level.
Every pixel in the image is assigned exactly one class, as opposed to object detection, where the bounding
boxes of objects can overlap. This is also known as semantic segmentation.
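The three levels above differ mainly in the structure of the labels they produce. A minimal sketch, with all field names and values invented for illustration:

```python
# Hypothetical label records for one image at each annotation level
# (field names, classes, and coordinates are made up for illustration).

# Classification: only which classes are present, no locations.
classification_label = {"image": "scene.jpg", "labels": ["dog", "grass"]}

# Detection: each object gets a class plus a bounding box (x, y, width, height).
detection_label = {
    "image": "scene.jpg",
    "objects": [{"class": "dog", "bbox": (40, 60, 120, 90)}],
}

# Segmentation: a class index for every pixel; here a tiny 2x3 mask
# where 0 = "grass" and 1 = "dog".
segmentation_label = {
    "image": "scene.jpg",
    "classes": {0: "grass", 1: "dog"},
    "mask": [
        [0, 1, 1],
        [0, 0, 1],
    ],
}
```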
To choose between them, we need to be acquainted with the different image annotation methods themselves.
Let’s analyse the most common image annotation techniques.
1. Bounding Boxes
Bounding boxes are one of the most commonly used types of image annotation in all of computer vision,
thanks in part to their versatility and simplicity. Bounding boxes enclose objects and assist the computer
vision network in locating objects of interest.
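In practice a bounding box is just four numbers, commonly stored as (x, y, width, height) or as two corners. A small sketch of the two conventions and the intersection-over-union (IoU) overlap measure often used to compare boxes:

```python
def xywh_to_xyxy(box):
    """Convert a (x, y, width, height) box to corner form (x1, y1, x2, y2)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)

def iou(a, b):
    """Intersection-over-union of two boxes given in corner form."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```

Identical boxes have an IoU of 1.0, disjoint boxes 0.0; annotation tools and evaluation scripts commonly use a threshold on this value to decide whether two boxes refer to the same object.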
2. Polygonal Segmentation
Another type of image annotation is polygonal segmentation, and the theory behind it
is just an extension of the theory behind bounding boxes. Polygonal segmentation tells a
computer vision system where to look for an object, but thanks to using complex polygons
and not simply a box, the object’s location and boundaries can be determined with much greater accuracy.
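A polygon annotation is stored as an ordered list of vertices, and its tighter fit can be quantified: the polygon’s area (computable with the shoelace formula) is typically much smaller than the area of the enclosing bounding box. A minimal sketch:

```python
def polygon_area(points):
    """Area enclosed by a polygon, given as ordered (x, y) vertices,
    via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0
```

For a triangular object with vertices (0, 0), (4, 0), (0, 3), the polygon covers an area of 6, while its bounding box covers 12, so half the box would be background.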
3. Line Annotation
Line annotation involves the creation of lines and splines, which are used primarily to delineate
boundaries between one part of an image and another. Line annotation is used when a region that needs
to be annotated can be conceived of as a boundary, but it is too small or thin for a bounding box or
other type of annotation to make sense.
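A line annotation (for example a lane marking) reduces to an ordered list of points forming a polyline; a sketch of one simple derived quantity, the annotated boundary’s length:

```python
import math

def polyline_length(points):
    """Total length of a polyline given as ordered (x, y) vertices."""
    return sum(
        math.dist(points[i], points[i + 1])
        for i in range(len(points) - 1)
    )
```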
4. Landmark Annotation
A fourth type of image annotation for computer vision systems is landmark annotation, sometimes
referred to as dot annotation because it involves the creation of dots/points across an image.
Just a few dots can be used to label objects in images containing many small objects, but it is common for
many dots to be joined together to represent the outline or skeleton of an object.
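A landmark annotation is usually stored as a set of named points, optionally joined by "skeleton" edges. A minimal sketch, with the landmark names, coordinates, and edges all invented for illustration:

```python
import math

# Hypothetical facial-landmark record: named points plus skeleton edges
# joining them (names and coordinates are made up for illustration).
landmarks = {
    "left_eye": (30, 40),
    "right_eye": (70, 40),
    "nose": (50, 60),
}
skeleton = [("left_eye", "nose"), ("right_eye", "nose")]

def edge_lengths(landmarks, skeleton):
    """Length of each skeleton edge, keyed by its pair of landmark names."""
    return {
        (a, b): math.dist(landmarks[a], landmarks[b])
        for a, b in skeleton
    }
```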
5. 3D Cuboids
3D cuboids are a powerful type of image annotation, similar to bounding boxes in that they distinguish
where a classifier should look for objects. However, 3D cuboids have depth in addition to height and width.
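Where a 2D box is four numbers, an axis-aligned 3D cuboid needs a centre and three dimensions, from which its eight corners follow. A minimal sketch:

```python
from itertools import product

def cuboid_corners(center, dims):
    """Eight corners of an axis-aligned 3D cuboid from its centre (x, y, z)
    and its (width, height, depth) dimensions."""
    cx, cy, cz = center
    w, h, d = dims
    return [
        (cx + sx * w / 2, cy + sy * h / 2, cz + sz * d / 2)
        for sx, sy, sz in product((-1, 1), repeat=3)  # all sign combinations
    ]
```

Real annotation tools also record the cuboid’s orientation (a rotation), which this axis-aligned sketch omits.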
6. Semantic Segmentation
Semantic segmentation is a form of image annotation that involves separating an image into different
regions, assigning a label to every pixel in an image. Regions of an image that carry different semantic
meanings/definitions are considered separate from other regions. For example, one portion of an image
could be “sky”, while another could be “grass”. The key idea is that regions are defined based on semantic
information, and that the image classifier gives a label to every pixel that comprises that region.
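Because every pixel carries a label, a semantic segmentation annotation is just a 2D grid of class indices, and per-class statistics fall out directly. A minimal sketch, with the mask and class names invented for illustration:

```python
from collections import Counter

def class_pixel_counts(mask):
    """Count how many pixels belong to each class index in a 2D label mask."""
    return Counter(label for row in mask for label in row)

# Hypothetical 2x3 mask: 0 = "sky", 1 = "grass", 2 = "person".
mask = [
    [0, 0, 1],
    [0, 2, 1],
]
```

Such counts are commonly used to check class balance in a segmentation dataset before training.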