# Data augmentation

# What is data augmentation?

Any training task, whether classification or object detection, requires an appropriate dataset, because the accuracy of the model depends on it. Data augmentation is a process that increases the amount of data and, in turn, the quality of the dataset.

This is usually done by adding slightly modified copies of already existing data, or by creating synthetic data based on existing data.

# OSAI Dataset Manipulator

Many frameworks provide data augmentation functions, but they apply them as a blind process: the user cannot inspect or verify the generated data.

The OSAI Dataset Manipulator gives you more control over this process. The operations run in the cloud on our servers with powerful GPUs, and you set the data augmentation parameters manually, so you can increase the value of the dataset for training in a controlled way.

To access the Dataset Manipulator, press the Dataset manipulator button in the detailed Datasets view.

Dataset manipulator view

# Types of data augmentation

The Dataset Manipulator creates a new, larger dataset by adding slightly modified copies of already existing images. Copies are created by applying a selection of image operations to the source image.

By default, the operations are performed in the order specified below. You can change the order by using the arrows next to each operation.

There are currently 3 operations supported in the manipulator:

  • Rotation
  • Brightness and Contrast
  • Resize

# Rotation

The Rotation operation takes a set of input images (or, if it is not the first operation, the previously generated data) and iteratively creates a series of new images. In each iteration, the input image is rotated by the next multiple of the value set in the Angle parameter. The loop ends when the rotation reaches 360 degrees, or when the per-image limit set in the Generate up to parameter is reached.

For example, with an Angle of 30 degrees, 12 new images are generated, rotated by 30°, 60°, 90°, 120°, 150°, 180°, 210°, 240°, 270°, 300°, 330° and 360°.
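The loop described above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not the manipulator's actual code: the function name `rotation_angles` and its parameter names are hypothetical, and the default limit of 35 assumes that the 35-image maximum stated in this section is the ceiling for Generate up to.

```python
def rotation_angles(angle_step, generate_up_to=35):
    """Return the rotation angles (in degrees) produced for one input image.

    angle_step     -- stands in for the Angle parameter (hypothetical name)
    generate_up_to -- stands in for the Generate up to parameter (hypothetical name)
    """
    if angle_step <= 0:
        raise ValueError("angle_step must be positive")
    angles = []
    k = 1
    # Stop once the next multiple would pass 360 degrees
    # or the per-image limit has been reached.
    while k * angle_step <= 360 and len(angles) < generate_up_to:
        angles.append(k * angle_step)
        k += 1
    return angles


print(rotation_angles(30))     # multiples of 30 up to and including 360
print(rotation_angles(30, 5))  # the limit stops the loop after 5 images
```

With a step of 30 the loop yields the twelve angles listed above; with a lower limit it stops early, whichever condition is met first.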

A maximum of 35 images can be created from a single image. Areas left empty by the rotation are filled with black in JPG images and with transparency in PNG images.

The rotation operation may cause part of an annotated object to fall outside the image, i.e. to be cropped. The Crop threshold parameter determines how much of the object must remain visible in the resulting image for the annotation to make sense. When a crop threshold is set, annotations of objects whose visibility is lower than the indicated percentage are removed.
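As a rough illustration of how such a threshold can work, the sketch below computes the visible fraction of a bounding box and drops boxes that fall below the threshold. It rests on several assumptions: annotations are treated as axis-aligned boxes `(x_min, y_min, x_max, y_max)` in pixel coordinates, visibility is measured by area, and the names `visible_fraction` and `filter_annotations` are hypothetical. How the manipulator actually measures visibility is not specified here.

```python
def visible_fraction(box, image_width, image_height):
    """Fraction of the box area that lies inside the image (0.0 to 1.0)."""
    x_min, y_min, x_max, y_max = box
    area = (x_max - x_min) * (y_max - y_min)
    if area <= 0:
        return 0.0
    # Clip the box to the image rectangle and measure what is left.
    visible_w = max(0, min(x_max, image_width) - max(x_min, 0))
    visible_h = max(0, min(y_max, image_height) - max(y_min, 0))
    return visible_w * visible_h / area


def filter_annotations(boxes, image_width, image_height, crop_threshold_percent):
    """Keep only boxes whose visibility is at least the threshold (in percent)."""
    return [
        box
        for box in boxes
        if visible_fraction(box, image_width, image_height) * 100
        >= crop_threshold_percent
    ]


boxes = [(10, 10, 20, 20), (-50, 0, 50, 100), (300, 300, 400, 400)]
# In a 200x200 image, the second box is half visible and the third not at all.
print(filter_annotations(boxes, 200, 200, crop_threshold_percent=60))
```

With a threshold of 60, only the fully visible box survives; lowering the threshold to 50 would also keep the half-visible one.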

Rotation

# Resize

The Resize operation scales images to the target dimensions set in the Width and Height parameters. The aspect ratio does not have to be preserved, so it can change as well. The modification is applied to all input images (or, if this is not the first operation, to the already generated data).

The size of the output image can range from 8x8 to 4096x3112 px.
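The size limits can be expressed as a simple check. This is a minimal sketch that assumes 4096 is the maximum width and 3112 the maximum height (reading the range as width x height); the function name `is_valid_resize` is hypothetical.

```python
MIN_WIDTH, MIN_HEIGHT = 8, 8
MAX_WIDTH, MAX_HEIGHT = 4096, 3112  # assumed to be width x height


def is_valid_resize(width, height):
    """Return True if the target size lies within the documented limits."""
    return MIN_WIDTH <= width <= MAX_WIDTH and MIN_HEIGHT <= height <= MAX_HEIGHT


print(is_valid_resize(640, 480))    # a typical target size
print(is_valid_resize(4096, 3113))  # one pixel too tall
```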

Resize

# Brightness and Contrast

The number of new images generated from one input is determined by the Generate up to parameter.

The formula used here is:

α ⋅ i(x,y) + β

Where:

α - the contrast parameter, in the range [0.1, 3.0] (set by Contrast min and Contrast max)

i(x,y) - the value of the input image pixel at coordinates (x, y)

β - the brightness parameter, in the range [0, 255] (set by Brightness var)

For each created image, a random value from the [-β, β] range is added to the original pixel values to change the brightness. The contrast parameter α is also generated randomly, between the lower and upper values set in the parameters (these cannot exceed the range given above).
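The formula and the random sampling can be sketched as follows, using a flat list of 8-bit pixel values in place of a real image. Several details are assumptions rather than documented behaviour: the uniform sampling of α and of the brightness offset, the rounding, the clamping of results to [0, 255], and all function and parameter names.

```python
import random


def adjust_pixel(value, alpha, beta):
    """Apply alpha * value + beta, then round and clamp to the 8-bit range."""
    return max(0, min(255, round(alpha * value + beta)))


def augment(pixels, contrast_min, contrast_max, brightness_var,
            generate_up_to, seed=None):
    """Create generate_up_to brightness/contrast variants of one image."""
    if not 0.1 <= contrast_min <= contrast_max <= 3.0:
        raise ValueError("contrast bounds must lie within [0.1, 3.0]")
    if not 0 <= brightness_var <= 255:
        raise ValueError("brightness_var must lie within [0, 255]")
    rng = random.Random(seed)
    variants = []
    for _ in range(generate_up_to):
        # One alpha and one brightness offset per generated image (assumed uniform).
        alpha = rng.uniform(contrast_min, contrast_max)
        beta = rng.uniform(-brightness_var, brightness_var)
        variants.append([adjust_pixel(p, alpha, beta) for p in pixels])
    return variants


for variant in augment([0, 64, 128, 255], 0.8, 1.2, 30, generate_up_to=3, seed=1):
    print(variant)
```

Each variant uses a single α and a single offset for the whole image, so the relative structure of the image is preserved while its overall brightness and contrast shift.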

Brightness and contrast