
Image augmentation: missing rotation #800

Open
ivanquirino opened this issue Jun 7, 2019 · 6 comments
Labels
enhancement New feature or request

Comments

@ivanquirino

The image augmentations in both the GluonCV and Gluon APIs are great, but rotation utilities are missing: something to randomly rotate images along with their bounding boxes. These would be a nice feature to have.

Another gap: Gluon has the nice Compose API for chaining transforms together, and it would be cool if the GluonCV transforms could be mixed into it.

@zhreshold zhreshold added the enhancement New feature or request label Jun 7, 2019
@zhreshold
Member

  1. The existing networks and losses don't respect rotated bounding boxes, so the best bet is to adjust the bounding boxes to the rotated objects (a sketch of this adjustment follows below). That is doable, but expensive. Still, as you said, it's a nice feature to have.

  2. For image classification we already use Compose in GluonCV, but object detection is more complicated; I will rethink it.
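
A minimal, untested sketch of the bounding-box adjustment idea, assuming plain NumPy and OpenCV (the function name `rotate_with_bboxes` is just for illustration, not an existing API):

```python
import cv2
import numpy as np

def rotate_with_bboxes(img, bboxes, angle):
    """Rotate an image and return the axis-aligned boxes enclosing
    the rotated ground-truth boxes.

    img    : HxWxC array
    bboxes : Nx4 array of [xmin, ymin, xmax, ymax]
    angle  : rotation in degrees, counter-clockwise
    """
    h, w = img.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    rotated = cv2.warpAffine(img, M, (w, h))

    # Rotate all four corners of each box, then take the enclosing
    # axis-aligned box. This is the "expensive" part: the adjusted
    # boxes are looser than the true rotated extents.
    x0, y0, x1, y1 = [bboxes[:, i] for i in range(4)]
    corners = np.stack([
        np.stack([x0, y0], axis=1),
        np.stack([x1, y0], axis=1),
        np.stack([x1, y1], axis=1),
        np.stack([x0, y1], axis=1),
    ], axis=1)  # N x 4 x 2
    ones = np.ones((*corners.shape[:2], 1))
    rotated_corners = np.concatenate([corners, ones], axis=2) @ M.T  # N x 4 x 2
    new_boxes = np.concatenate([
        rotated_corners.min(axis=1),
        rotated_corners.max(axis=1),
    ], axis=1)
    # clip to the image bounds
    new_boxes = np.clip(new_boxes, 0, [w - 1, h - 1, w - 1, h - 1])
    return rotated, new_boxes
```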

@ivanquirino
Author

@zhreshold thanks for answering.

  1. Adjusting the bounding boxes is exactly what I was thinking.

  2. I've built a custom transform class for my own training script, which is based on the VOC dataset training script example. It uses Compose internally, because I needed a YOLOv3 transform without random flipping or cropping (a usage sketch follows this list):

import copy

import numpy as np
from mxnet import autograd, nd
from mxnet.gluon.data.vision import transforms
from gluoncv.data.transforms import bbox
from gluoncv.model_zoo.yolo.yolo_target import YOLOV3PrefetchTargetGenerator


class CustomTransform(object):
    # SIZE is a constant defined elsewhere in the training script
    def __init__(self, net=None, size=(SIZE, SIZE), mean=(0.485, 0.456, 0.406),
                 std=(0.229, 0.224, 0.225), **kwargs):
        self._size = size
        self._mean = mean
        self._std = std

        # image-only transforms chained with Compose; no random flipping
        # or cropping, so the labels only ever need a resize
        self._img_transform = transforms.Compose([
            transforms.Resize(size, keep_ratio=True),
            transforms.RandomColorJitter(0.1, 0.1, 0.1, 0.1),
            transforms.RandomLighting(0.1),
            transforms.ToTensor(),
            transforms.Normalize(mean=mean, std=std)
        ])

        self._target_generator = None

        if net is None:
            return

        # run one fake forward pass to record anchors/offsets/feature maps;
        # work on a CPU copy in case the network has reset_ctx to GPU
        self._fake_x = nd.zeros((1, 3, size[0], size[1]))
        net = copy.deepcopy(net)
        net.collect_params().reset_ctx(None)

        with autograd.train_mode():
            _, self._anchors, self._offsets, self._feat_maps, _, _, _, _ = net(self._fake_x)

        self._target_generator = YOLOV3PrefetchTargetGenerator(
            num_class=len(net.classes), **kwargs)

    def __call__(self, src, label):
        h, w, _ = src.shape

        img = self._img_transform(src)
        label = bbox.resize(label, (w, h), (self._size[0], self._size[1]))

        # without a network, behave like a plain validation transform
        if self._target_generator is None:
            return img, label.astype(img.dtype)

        gt_bboxes = nd.array(label[np.newaxis, :, :4])
        gt_ids = nd.array(label[np.newaxis, :, 4:5])
        gt_mixratio = None

        objectness, center_targets, scale_targets, weights, class_targets = self._target_generator(
            self._fake_x, self._feat_maps, self._anchors, self._offsets,
            gt_bboxes, gt_ids, gt_mixratio)

        return (img, objectness[0], center_targets[0], scale_targets[0], weights[0],
                class_targets[0], gt_bboxes[0])
  3. It would be nice if GluonCV provided official training scripts for custom datasets on the computer vision models: classification, detection, and segmentation. There's a tutorial for preparing custom data for object detection, but we end up having to find a script on the web or adapt our own from the scripts suited to the standard datasets.
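
For reference, a rough sketch of how I wire the transform into a dataset; `VOCDetection` and `yolo3_darknet53_voc` are stand-ins here for my custom dataset and network:

```python
from gluoncv import model_zoo
from gluoncv.data import VOCDetection  # stand-in for the custom dataset

net = model_zoo.get_model('yolo3_darknet53_voc', pretrained_base=True)
net.initialize()  # initializes the non-pretrained head parameters

train_dataset = VOCDetection(splits=[(2007, 'trainval')])
# transform() applies CustomTransform lazily to every (image, label) pair
train_data = train_dataset.transform(CustomTransform(net=net))
img, objectness, center_t, scale_t, weights, class_t, gt_bboxes = train_data[0]
```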

@ivanquirino
Author

I have a problem with my training script at the validation stage; I get something like the following output for both VOCMApMetric and VOC07MApMetric:

CLASS_1=NaN
CLASS_2=NaN
CLASS_3=NaN
CLASS_4=99%
CLASS_5=98%
CLASS_6=96%
mAP=97%

I think this problem may be because:

  1. my dataset is really small,
  2. CLASS_3 and CLASS_6 are too similar and should be merged into a single class, or
  3. the VOCMApMetric classes are not suited to custom data and I should write my own metrics.

I would appreciate some input on this since I'm kinda new to ML in general, but I'm looking into this for an object detection application.

@zhreshold
Member

The NaNs look suspicious to me; maybe you don't have those classes in your validation set?
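
One quick way to check, assuming pads.val.txt follows the GluonCV LST detection format (the root path here is just a placeholder):

```python
from collections import Counter

from gluoncv.data import LstDetection

# LstDetection yields labels in [xmin, ymin, xmax, ymax, cls_id, ...] layout
val_dataset = LstDetection('pads.val.txt', root='.')
counts = Counter()
for _, label in val_dataset:
    counts.update(label[:, 4].astype(int).tolist())
print(counts)  # any class id missing here will score NaN in the mAP metric
```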

@ivanquirino
Author

That's my validation LST file. It contains all classes.
pads.val.txt

What's strange is that after training, the network correctly detects every image in the validation set.

@ivanquirino
Author

Correction: there is one image in the validation set that it classifies wrongly. I think the main problem is in my labeling: every label in classes 3 and 6 relates to an object that has no left/right distinction, yet those objects are being labeled as left and right. The other four classes have left and right labeled correctly. The correct way to label is to merge classes 3 and 6 into a single class, and have 5 classes instead of 6 (a remapping sketch follows below).

Thanks for your input, @zhreshold, you helped me see the error more clearly.
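
For anyone fixing a similar labeling mistake, a hedged sketch of the merge, assuming labels in the parsed [xmin, ymin, xmax, ymax, cls_id] layout (the ids 2 and 5 are hypothetical zero-based stand-ins for my classes 3 and 6):

```python
import numpy as np

def merge_classes(label, src_id, dst_id):
    """Relabel every box of class src_id as dst_id, then shift the ids
    above src_id down so they stay contiguous. Assumes dst_id < src_id."""
    label = label.copy()
    ids = label[:, 4]
    ids[ids == src_id] = dst_id
    ids[ids > src_id] -= 1
    return label

# e.g. fold class 5 into class 2 before regenerating the LST files:
# new_label = merge_classes(label, src_id=5, dst_id=2)
```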
