Annotating images
Creating high-quality structured training data from an unstructured dataset of images
The way you label your image data depends on your use case and ontology. For example, if you’re building an app that recognizes common types of houseplants, a simple image classification algorithm might do the job, and you’ll only need to assign each image a label for the plant it shows. If your algorithm needs to identify multiple objects in satellite imagery of the ocean, however, you’ll need to draw a bounding box around every instance of each class and label each box correctly. And if your machine learning model needs to complete a more complex task, such as finding potential hazards in images captured by a car’s dash camera, drawing segmentation masks will likely be the best choice.
Image classification
Image classification is usually a simple labeling task: the labeler only has to identify the class that the image belongs to. This type of labeling suits use cases where each image the model receives contains a single object or class, and the model doesn’t need to derive any further information from it, such as location, pose, or shape.
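As a rough illustration of how little a classification label needs to carry, here is a minimal Python sketch. The ontology, file names, and helper names (`ClassificationLabel`, `find_invalid`, `class_counts`) are hypothetical and chosen to match the houseplant example; they do not come from any particular labeling tool.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical ontology for the houseplant example; class names are illustrative.
ONTOLOGY = {"monstera", "pothos", "snake_plant"}


@dataclass(frozen=True)
class ClassificationLabel:
    image: str       # file name of the labeled image
    class_name: str  # exactly one class per image


def find_invalid(labels, ontology=ONTOLOGY):
    """Return the labels whose class is not part of the ontology."""
    return [label for label in labels if label.class_name not in ontology]


def class_counts(labels):
    """Count how many images were assigned to each class."""
    return Counter(label.class_name for label in labels)


labels = [
    ClassificationLabel("img_001.jpg", "monstera"),
    ClassificationLabel("img_002.jpg", "pothos"),
    ClassificationLabel("img_003.jpg", "monstera"),
    ClassificationLabel("img_004.jpg", "fern"),  # not in the ontology
]

invalid = find_invalid(labels)
counts = class_counts(labels)
print("Labels outside the ontology:", invalid)
print("Images per class:", dict(counts))
```

Checking labels against the ontology and looking at the class distribution are two simple sanity checks that can be run before any training starts.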
Object detection
Object detection requires labelers to draw a bounding box around every instance of each class within an image and then assign each box the correct class. This type of label can be used for a wide variety of use cases, from identifying products in a warehouse to identifying cells, growths, or specific abnormalities in medical imagery.
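A bounding box label can be sketched as a class name plus two corner coordinates. The example below also computes intersection over union (IoU), a common way to measure how closely two boxes agree, for instance when two labelers annotate the same object. The `Box` type, the pixel coordinates, and the "ship" class are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical layout)."""
    class_name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, between 0.0 and 1.0."""
    inter_w = max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    inter_h = max(0.0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    intersection = inter_w * inter_h
    union = a.area() + b.area() - intersection
    return intersection / union if union > 0 else 0.0


# Two labelers draw a box around the same object in a satellite image.
labeler_a = Box("ship", 10, 10, 50, 50)
labeler_b = Box("ship", 20, 20, 60, 60)

agreement = iou(labeler_a, labeler_b)
print(f"IoU between the two boxes: {agreement:.3f}")
```

An IoU of 1.0 means the boxes are identical and 0.0 means they do not overlap. The threshold that counts as "accurate enough" depends on the use case.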
Image segmentation
Image segmentation is often a complex and time-consuming labeling task, as labelers need to carefully draw the outline of each object in the image. Segmentation masks allow algorithms not only to identify each object and its location within the image, but also to understand occluded or overlapping objects, the pose and shape of each object, and more.
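To make the difference from a bounding box concrete, the sketch below represents a segmentation mask as a small grid of per-pixel class IDs and derives both the pixel area and the enclosing box of one class. The grid, the class IDs, and the helper names (`mask_area`, `mask_to_box`) are hypothetical; real masks are usually stored in a tool-specific or compressed format.

```python
def mask_area(mask, class_id):
    """Number of pixels assigned to class_id."""
    return sum(row.count(class_id) for row in mask)


def mask_to_box(mask, class_id):
    """Smallest box (x_min, y_min, x_max, y_max) covering class_id, or None."""
    coords = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value == class_id
    ]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


# 0 = background, 1 and 2 = two hypothetical object classes.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 2, 2],
    [0, 0, 0, 2, 2],
]

area_1 = mask_area(mask, 1)
box_1 = mask_to_box(mask, 1)
print("Pixels in class 1:", area_1)
print("Enclosing box of class 1:", box_1)
```

The enclosing box of class 1 spans a 3 x 3 region, yet only 6 of those 9 pixels belong to the object. The mask keeps that shape information, while a bounding box alone would discard it.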
Choosing tools for image annotation
The tools you choose for your image annotation project should depend on your current resources, use case, and future goals.
Resources
- How many ML engineers, data scientists, and stakeholders are on your team?
- What does your labeling budget look like?
- Are you going to use an internal or external labeling team?
- Which ML tools do your engineers prefer?
Use case
- Is your labeling task simple (image classification) or more complex (image segmentation)?
- How much data do you need to label to get your model up and running?
- How quickly do you need your model to be ready?
- Do the labeling tools you need for your ontology already exist?
Future goals
- How will your ML projects scale in the next two to five years?
- Will you need to label different types of data in the future?
- Will you need to label data differently for future use cases?
- Will your labeling budget grow enough to support your requirements?
While your team can rely on in-house or open-source labeling tools, these often cause delays or come up short as you scale your machine learning operations. A training data platform can help your team with even small, simple, short-term projects, often for free. You can then increase usage and complexity gradually as your projects scale and your needs evolve. Your labelers will also find it easier to use one editor for all your image annotation projects, whether they involve image classification, object detection, or image segmentation. This way, your team will also save the time it would otherwise spend creating or finding a new labeling solution for every new computer vision project.