
k-means anchors ratios #308

Open · mnslarcher opened this issue May 17, 2020 · 26 comments

@mnslarcher commented May 17, 2020

Hi,

If someone wants to give it a try: like https://github.com/Cli98/anchor_computation_tool, I also adapted the code of https://github.com/zhouyuangan/K-Means-Anchors for obtaining anchors ratios using K-Means.

The main differences are that my implementation is faster (fewer for loops), it takes a .json file as input (not an XML) and it has some additional settings (but the Changlin Li repo has more visualization).

https://github.com/mnslarcher/kmeans-anchors-ratios

If you try my code on COCO you will obtain anchors ratios in line with this repo (but for your dataset they can be different, and this is why it's useful).

Any feedback is appreciated.

UPDATE: I added the commands to get the optimal anchors ratios for COCO; the returned ratios are [(0.6, 1.5), (1.0, 1.0), (1.4, 0.7)], very close to [(0.7, 1.4), (1.0, 1.0), (1.4, 0.7)] (the default values of this repo).
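
For context, these ratio pairs are (width, height) values normalized so that width * height is approximately 1, so only the aspect ratio matters. A minimal sketch of that normalization step (my own illustration, not the repo's actual code; centroids_to_ratios is a hypothetical helper):

import numpy as np

def centroids_to_ratios(centroids):
    # scale each (w, h) centroid so that w * h == 1; only the aspect ratio survives
    ratios = centroids / np.sqrt(np.prod(centroids, axis=1, keepdims=True))
    return [tuple(float(x) for x in np.round(r, 1)) for r in ratios]

print(centroids_to_ratios(np.array([[10.0, 23.7], [50.0, 50.0], [23.7, 10.0]])))
# -> [(0.6, 1.5), (1.0, 1.0), (1.5, 0.6)]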

@Cli98 commented May 21, 2020

@mnslarcher Good to see this.

@mnslarcher

Thanks @Cli98 !

@mnslarcher mnslarcher changed the title k-means anchor ratios k-means anchors ratios May 23, 2020
@Cli98 commented May 27, 2020

@mnslarcher lol, you claimed that your repo is faster than mine. I don't agree with that, since I see no reason that justifies it.

@mnslarcher

Hi @Cli98 :)! I'm saying this because you use the implementation of https://github.com/zhouyuangan/K-Means-Anchors. From that implementation I removed this

        for row in range(rows):
            distances[row] = 1 - iou(boxes[row], clusters)

by vectorizing the iou calculation.

From your notebook:

    rows = boxes.shape[0]

    distances = np.empty((rows, k))
    last_clusters = np.zeros((rows,))

    np.random.seed()

    # the Forgy method will fail if the whole array contains the same rows
    clusters = boxes[np.random.choice(rows, k, replace=False)]

    while True:
        for row in range(rows):
            distances[row] = 1 - iou(boxes[row], clusters)

        nearest_clusters = np.argmin(distances, axis=1)

        if (last_clusters == nearest_clusters).all():
            break

        for cluster in range(k):
            clusters[cluster] = dist(boxes[nearest_clusters == cluster], axis=0)

        last_clusters = nearest_clusters

    return clusters

This is mine:

    np.random.seed(seed)
    last_nearest_centroids = np.ones(boxes.shape[0]) * -1
    # the Forgy method will fail if the whole array contains the same rows
    centroids = boxes[np.random.choice(boxes.shape[0], num_clusters, replace=False)]
    i = 0
    while True:
        if i >= max_iter:
            logger.warning(
                f"[{datetime.now().strftime('%m/%d %H:%M:%S')}] "
                "Maximum number of iterations reached, increase max_iter to do more "
                f"iterations (max_iter = {max_iter})"
            )
            break

        nearest_centroids = np.argmax(iou(boxes, centroids), axis=1)

        for centroid in range(num_clusters):
            centroids[centroid] = centroid_calc_fn(
                boxes[nearest_centroids == centroid], axis=0
            )

        if (nearest_centroids == last_nearest_centroids).all():
            break

        last_nearest_centroids = nearest_centroids
        i += 1

    return centroids

This made the implementation faster, as one would expect. I have an option to run K-Means more than once, but if you set --num-runs=1 you should see some benefit. Obviously feel free to correct me, but this is the reason.

@Cli98 commented May 27, 2020

@mnslarcher

Where is your vectorized iou code? Are you referring to this?

nearest_centroids = np.argmax(iou(boxes, centroids), axis=1)

@mnslarcher commented May 27, 2020

> Where is your vectorized iou code? Are you referring to this?
>
> nearest_centroids = np.argmax(iou(boxes, centroids), axis=1)

In that line I'm using the vectorized iou; this is my implementation:

def iou(boxes, anchors):
    """Calculates the Intersection over Union (IoU) between a numpy array of
    n boxes and an array of k anchors.
    Arguments:
        boxes (array_like): array of shape (n, 2) of boxes' widths and heights.
        anchors (array_like): array of shape (k, 2) of anchors' widths and heights.
    Returns:
        A numpy array of shape (n, k) containing the IoU values for each combination
        of boxes and anchors.
    """
    intersection_width = np.minimum(anchors[:, [0]], boxes[:, 0]).T
    intersection_height = np.minimum(anchors[:, [1]], boxes[:, 1]).T

    if np.any(intersection_width == 0) or np.any(intersection_height == 0):
        raise ValueError("Some boxes have zero size.")

    intersection_area = intersection_width * intersection_height
    boxes_area = np.prod(boxes, axis=1, keepdims=True)
    anchors_area = np.prod(anchors, axis=1, keepdims=True).T
    union_area = boxes_area + anchors_area - intersection_area
    return intersection_area / union_area

This is the original one:

def iou(box, clusters):
    """
    Calculates the Intersection over Union (IoU) between a box and k clusters.
    :param box: tuple or array, shifted to the origin (i. e. width and height)
    :param clusters: numpy array of shape (k, 2) where k is the number of clusters
    :return: numpy array of shape (k,) where k is the number of clusters
    """
    x = np.minimum(clusters[:, 0], box[0])
    y = np.minimum(clusters[:, 1], box[1])
    if np.count_nonzero(x == 0) > 0 or np.count_nonzero(y == 0) > 0:
        raise ValueError("Box has no area")

    intersection = x * y
    box_area = box[0] * box[1]
    cluster_area = clusters[:, 0] * clusters[:, 1]

    iou_ = intersection / (box_area + cluster_area - intersection)

    return iou_

As you can see, one returns an array of shape (n, k) (all boxes against all anchors at once), while the other returns shape (k,) for a single box.
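
A quick shape check (my own snippet; vectorized_iou / original_iou here stand for the two functions above, which are both called iou in their respective repos):

import numpy as np

boxes = np.array([[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]])  # n = 3 boxes as (w, h)
anchors = np.array([[1.0, 1.0], [1.4, 0.7]])                  # k = 2 anchors as (w, h)

print(vectorized_iou(boxes, anchors).shape)   # (3, 2): every box against every anchor
print(original_iou(boxes[0], anchors).shape)  # (2,): one box at a time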

@Cli98 commented May 27, 2020

@mnslarcher I see.
But that will not change much if you only vectorize the iou function. Normally we have ~50 bboxes. That's not a large number. Did you time your kmeans function? Any stats?

@mnslarcher

@Cli98 , I did, but honestly I don't remember the exact result; maybe I will re-do it this weekend and post it here :)

Why do you say that we have ~50 bboxes? In my dataset I have 100k images with more or less 10 boxes each, so ~1M bboxes, far more than 50, if I get your point (probably I misunderstood you). Clearly, if your dataset is small, the benefit is basically zero.

In general the speed increase is not a crucial feature of my repo, it's only a plus. Other than that, I added a couple of functionalities that were useful for me and I will continue to add more. I have not done anything extraordinary, but if anyone wants to try it, I will be happy to receive any feedback.

@mnslarcher commented Jun 6, 2020

Hi @Cli98 , I did the speed test; the result, if I haven't made any mistake, is that my implementation is much faster (70x faster on COCO train 2017).

I saw in a comment (that you then edited) that you didn't find any evidence of this, so I'll post the results here along with how you can replicate them. I did a very fast experiment, so feel free to tell me if you spot any mistake. I didn't use timeit just because I didn't want to wait, but feel free to replace %%time with %%timeit.

The results (mine vs. yours):
with 100 bboxes: 3.47 ms vs 10.7 ms
with 1000 bboxes: 4.16 ms vs 104 ms
with 10000 bboxes: 20.8 ms vs 3.65 s
with 100000 bboxes: 531 ms vs 42.5 s
with 859999 bboxes (COCO train 2017): 6.5 s vs 7min 24s

So, as you can see, for COCO mine takes just 6.5 s and yours 7 min 24 s; both return the same result (mine is 70x faster in this experiment).

Get the bboxes:

import json
import numpy as np
import os

INSTANCES_PATH = "./annotations/instances_train2017.json"

if not os.path.exists(INSTANCES_PATH):
    !wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
    !unzip annotations_trainval2017.zip
    
with open(INSTANCES_PATH) as f:
    instances = json.load(f)
    
# COCO "bbox" entries are [x, y, width, height]: keep the (width, height) pairs with positive area
bboxes = np.array([i["bbox"][2:] for i in instances["annotations"] if np.prod(i["bbox"][2:]) > 0])
del instances

Define the functions (the not_vectorized ones are the original functions that you are also using in your notebook; the others are the ones I defined for my repo). Note that kmeans below uses logger and datetime, so you also need:

import logging
from datetime import datetime

logger = logging.getLogger(__name__)

def not_vectorized_iou(box, clusters):
    """
    Calculates the Intersection over Union (IoU) between a box and k clusters.
    :param box: tuple or array, shifted to the origin (i. e. width and height)
    :param clusters: numpy array of shape (k, 2) where k is the number of clusters
    :return: numpy array of shape (k,) where k is the number of clusters
    """
    x = np.minimum(clusters[:, 0], box[0])
    y = np.minimum(clusters[:, 1], box[1])
    if np.count_nonzero(x == 0) > 0 or np.count_nonzero(y == 0) > 0:
        raise ValueError("Box has no area")

    intersection = x * y
    box_area = box[0] * box[1]
    cluster_area = clusters[:, 0] * clusters[:, 1]

    iou_ = intersection / (box_area + cluster_area - intersection)

    return iou_


def not_vectorized_kmeans(boxes, k, dist=np.median):
    """
    Calculates k-means clustering with the Intersection over Union (IoU) metric.
    :param boxes: numpy array of shape (r, 2), where r is the number of rows
    :param k: number of clusters
    :param dist: distance function
    :return: numpy array of shape (k, 2)
    """
    rows = boxes.shape[0]

    distances = np.empty((rows, k))
    last_clusters = np.zeros((rows,))

    np.random.seed()

    # the Forgy method will fail if the whole array contains the same rows
    clusters = boxes[np.random.choice(rows, k, replace=False)]

    while True:
        for row in range(rows):
            distances[row] = 1 - not_vectorized_iou(boxes[row], clusters)

        nearest_clusters = np.argmin(distances, axis=1)

        if (last_clusters == nearest_clusters).all():
            break

        for cluster in range(k):
            clusters[cluster] = dist(boxes[nearest_clusters == cluster], axis=0)

        last_clusters = nearest_clusters

    return clusters


def iou(boxes, anchors):
    """Calculates the Intersection over Union (IoU) between a numpy array of
    n boxes and an array of k anchors.

    Arguments:
        boxes (array_like): array of shape (n, 2) of boxes' widths and heights.
        anchors (array_like): array of shape (k, 2) of anchors' widths and heights.

    Returns:
        A numpy array of shape (n, k) containing the IoU values for each combination
        of boxes and anchors.
    """
    intersection_width = np.minimum(anchors[:, [0]], boxes[:, 0]).T
    intersection_height = np.minimum(anchors[:, [1]], boxes[:, 1]).T

    if np.any(intersection_width == 0) or np.any(intersection_height == 0):
        raise ValueError("Some boxes have zero size.")

    intersection_area = intersection_width * intersection_height
    boxes_area = np.prod(boxes, axis=1, keepdims=True)
    anchors_area = np.prod(anchors, axis=1, keepdims=True).T
    union_area = boxes_area + anchors_area - intersection_area
    return intersection_area / union_area


def kmeans(boxes, num_clusters=3, max_iter=300, seed=None, centroid_calc_fn=np.median):
    """Calculates K-Means clustering using the Intersection over Union (IoU) metric.

    Arguments:
        boxes (array_like): array of the bounding boxes' heights and widths, with
            shape (n, 2).
        num_clusters (int): the number of clusters to form as well as the number of
            centroids to generate.
        max_iter (int, optional): maximum number of iterations of the K-Means algorithm
        for a single run. Default: 300.
        seed (int, optional): random seed. Default: None.
        centroid_calc_fn (function, optional): function used for calculating centroids
            Default: numpy.median.

    Returns:
        A numpy array of shape (num_clusters, 2).
    """
    np.random.seed(seed)
    last_nearest_centroids = np.ones(boxes.shape[0]) * -1
    # the Forgy method will fail if the whole array contains the same rows
    centroids = boxes[np.random.choice(boxes.shape[0], num_clusters, replace=False)]
    i = 0
    while True:
        if i >= max_iter:
            logger.warning(
                f"[{datetime.now().strftime('%m/%d %H:%M:%S')}] "
                "Maximum number of iterations reached, increase max_inter to do more "
                f"iterations (max_inter = {max_iter})"
            )
            break

        nearest_centroids = np.argmax(iou(boxes, centroids), axis=1)

        for centroid in range(num_clusters):
            centroids[centroid] = centroid_calc_fn(
                boxes[nearest_centroids == centroid], axis=0
            )

        if (nearest_centroids == last_nearest_centroids).all():
            break

        last_nearest_centroids = nearest_centroids
        i += 1

    return centroids

Run whichever experiment you wish, e.g.:

%%time

kmeans(bboxes, 3)

Result:

CPU times: user 6.49 s, sys: 174 µs, total: 6.49 s
Wall time: 6.5 s

array([[201.49, 218.97],
       [ 55.75,  65.25],
       [ 15.97,  19.85]])
%%time

not_vectorized_kmeans(bboxes, 3)

Result:

CPU times: user 7min 28s, sys: 8.91 s, total: 7min 36s
Wall time: 7min 24s

array([[ 15.97,  19.85],
       [ 55.75,  65.25],
       [201.49, 218.97]])
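
Outside a notebook, a plain-Python version of the same timing (my own sketch, reusing bboxes and the two functions defined above; absolute numbers will of course vary by machine):

import time

start = time.perf_counter()
print(kmeans(bboxes, 3))
print(f"vectorized kmeans: {time.perf_counter() - start:.2f} s")

start = time.perf_counter()
print(not_vectorized_kmeans(bboxes, 3))
print(f"loop-based kmeans: {time.perf_counter() - start:.2f} s")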

@Cli98 commented Jun 6, 2020

Apologies for the late reply. I'm referring to a single image, not all of them. Normally we do not conduct k-means experiments on all bounding boxes, given their large variation, not to mention the class imbalance issue.

@Cli98 commented Jun 6, 2020

@mnslarcher I actually modified my comment and deleted the second sentence, but I have no idea why you could still see my previous statement.

I actually ran my experiment on all the code snippets and found that, averaged over 1000 trials per 100 bboxes, they give similar performance, 38 vs 57 ms. But you timed it a different way, and I think that also makes sense. So I think that's the end of this discussion, and I will try to do some optimization on my code XD

@mnslarcher

Mmm, I'm not sure I get your point. I understand that maybe you don't want to run K-Means on all your bounding boxes if your dataset is not balanced (mine is balanced), but I don't understand the utility of running K-Means on just one image to decide the anchors ratios. But it's late where I am and I'm tired, so I'll go to sleep now :)...

@Cli98 commented Jun 6, 2020

"running K-Means on just one image to..."
Nope, that's not what I mean sir... But just let it pass.

@zylo117 (Owner) commented Jun 11, 2020

Actually, I think sklearn.cluster.KMeans is much faster, considering it's a C implementation.

@mnslarcher

Probably, but what we call K-Means here is K-Means with IoU as the distance metric (that's the main contribution of this kind of repo), and to my current understanding (in the beginning I tried to see if it was possible to use sklearn) it is not straightforward to implement K-Means with IoU as the distance metric using sklearn. If someone knows how to do the same with that implementation of K-Means, I would be curious to see it and eventually update my code.

The first problem is that that implementation doesn't let you choose a distance metric at all. Then there are other things you need to consider for this implementation; if you look at the K-Means implementation of these repos you should be able to see it 😊.

@zylo117 zylo117 pinned this issue Jun 11, 2020
@zylo117 (Owner) commented Jun 11, 2020

@mnslarcher
I did replace it with sklearn's kmeans.

from sklearn.cluster import KMeans


def kmeans(boxes, num_clusters=3, max_iter=300, seed=None, centroid_calc_fn=np.median):
    """Calculates K-Means clustering using sklearn's Euclidean implementation.

    Arguments:
        boxes (array_like): array of the bounding boxes' heights and widths, with
            shape (n, 2).
        num_clusters (int): the number of clusters to form as well as the number of
            centroids to generate.
        max_iter (int, optional): maximum number of iterations of the K-Means algorithm
            for a single run. Default: 300.
        seed (int, optional): random seed. Default: None.
        centroid_calc_fn (function, optional): kept for signature compatibility;
            ignored here, since sklearn always uses the cluster mean.

    Returns:
        A numpy array of shape (num_clusters, 2).
    """
    kmeans = KMeans(n_clusters=num_clusters, max_iter=max_iter, tol=1e-3, random_state=seed).fit(boxes)
    centroids = kmeans.cluster_centers_
    return centroids

But yours is obviously much faster and more efficient, given that sklearn uses all available CPUs while yours only takes one.

And if the statistics are right, I think both of them give a similarly correct result.

I think using IoU and using Euclidean distance are quite similar for this kind of situation. But yes, IoU is more straightforward. I just realized this is why IoU loss beats L1 loss when training a bbox regression model.

your algorithm:

[06/11 15:18:07] Starting the calculation of the optimal anchors ratios
[06/11 15:18:07] Extracting and preprocessing bounding boxes
[06/11 15:18:07] Discarding 0 bounding boxes with size lower or equal to 0
[06/11 15:18:07] K-Means (10 runs): 100%|███████████████| 10/10 [00:04<00:00,  2.07it/s]
[06/11 15:18:12] Runs avg. IoU: 87.90% ± 0.00% (mean ± std. dev. of 10 runs, 0 skipped)
[06/11 15:18:12] Avg. IoU between bboxes and their most similar anchors after normalizing them so that they have the same area (only the anchor ratios matter): 87.90%
[06/11 15:18:12] Avg. IoU between bboxes and their most similar anchors (no normalization, both anchor ratios and sizes matter): 55.65%
[06/11 15:18:13] Num. bboxes with similar anchors (IoU >= 0.5):  79499/124806 (63.70%)
[06/11 15:18:13] Optimal anchors ratios: [(0.78, 1.29), (1.04, 0.96), (1.25, 0.8)]

sklearn's algorithm:

[06/11 15:18:25] Starting the calculation of the optimal anchors ratios
[06/11 15:18:25] Extracting and preprocessing bounding boxes
[06/11 15:18:25] Discarding 0 bounding boxes with size lower or equal to 0
[06/11 15:18:25] K-Means (10 runs): 100%|███████████████| 10/10 [00:18<00:00,  1.89s/it]
[06/11 15:18:44] Runs avg. IoU: 87.37% ± 0.00% (mean ± std. dev. of 10 runs, 0 skipped)
[06/11 15:18:44] Avg. IoU between bboxes and their most similar anchors after normalizing them so that they have the same area (only the anchor ratios matter): 87.37%
[06/11 15:18:44] Avg. IoU between bboxes and their most similar anchors (no normalization, both anchor ratios and sizes matter): 55.73%
[06/11 15:18:44] Num. bboxes with similar anchors (IoU >= 0.5):  79599/124806 (63.78%)
[06/11 15:18:44] Optimal anchors ratios: [(0.74, 1.36), (1.02, 0.99), (1.28, 0.78)]

@mnslarcher

Nice! Yes, I think it's true that one can get a decent approximation of K-Means with IoU using regular K-Means with Euclidean distance; in both cases you end up with reasonable anchors ratios.
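
A hedged sketch of that approximation (my own illustration, not either repo's code: normalizing the boxes to unit area first, so that Euclidean clustering only sees the aspect ratio; bboxes as extracted in the benchmark above):

import numpy as np
from sklearn.cluster import KMeans

# rescale every (w, h) so that w * h == 1, leaving only the aspect ratio
normalized = bboxes / np.sqrt(np.prod(bboxes, axis=1, keepdims=True))
ratios = KMeans(n_clusters=3, random_state=0).fit(normalized).cluster_centers_
print(np.round(ratios, 2))  # ratio-like (w, h) pairs, each with w * h close to 1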

@zylo117 zylo117 unpinned this issue Jun 11, 2020
@zylo117 zylo117 pinned this issue Jun 11, 2020
@zylo117 (Owner) commented Jun 11, 2020

@mnslarcher
Hi, I tried to compare the optimal anchors with the default anchors.
Most of the time it helps, but in some datasets it doesn't.
Like this:

[06/11 17:08:58] When you use default anchors...
[06/11 17:08:58] Avg. IoU between bboxes and their most similar anchors (no normalization, both anchor ratios and sizes matter): 69.61%
[06/11 17:08:59] Num. bboxes with similar anchors (IoU >= 0.5):  37659/38510 (97.79%)
[06/11 17:08:59] Default anchors ratios: [(0.7, 1.4), (1.0, 1.0), (1.4, 0.7)]
[06/11 17:08:59] Runs avg. IoU: 88.01% ± 0.00% (mean ± std. dev. of 10 runs, 0 skipped)
[06/11 17:08:59] Avg. IoU between bboxes and their most similar anchors after normalizing them so that they have the same area (only the anchor ratios matter): 88.01%
[06/11 17:08:59] Avg. IoU between bboxes and their most similar anchors (no normalization, both anchor ratios and sizes matter): 70.88%
[06/11 17:08:59] Num. bboxes with similar anchors (IoU >= 0.5):  37275/38510 (96.79%)
[06/11 17:08:59] Optimal anchors ratios: [(0.8, 1.3), (1.0, 1.0), (1.2, 0.9)]
[06/11 17:08:59] Improvement: -1.00%

The ratio of bboxes with similar anchors decreases by 1% after using the optimal anchors.
It'd be nice to have a comparison with the default anchors, so that we can know whether we should apply the new ones or not.

Also, the reason why it sometimes doesn't help after the fitting should be intriguing as well.

@mnslarcher

It makes sense: in the end, K-Means is not trying to maximize the coverage; it's doing something correlated but not exactly the same. In particular, it's possible that the default anchors have a lower avg. IoU with the bounding boxes but have an IoU > 0.5 more frequently, while the avg. IoU of the K-Means anchors is probably higher but they sacrifice some bounding boxes for it. At least this is my current intuition.

I think that default > K-Means happens almost always when both have very good coverage after removing the effect of the anchors sizes (like in the case you reported).

To give you an example of the limitations of K-Means: I have one detector that needs to detect electric poles. In almost all the images, for every pole, height >> width, but I also have some cases where the helicopter took the photo from a strange perspective and the pole height << pole width.

In my script you can increase num_anchors_ratios but, given that the cases above are rare, this will just produce another anchor with height > width; it's a limitation of K-Means.

My solution for these cases is to run the algorithm once with num_anchors_ratios=2 or 3 and then re-run it with num_anchors_ratios=1 or 2 (depending on whether I want to end up with 3, 4 or 5 anchors ratios) on what the tutorial calls instances_without_similar_anchors. This produces good results in my case and I think it's a reasonable way to increase coverage.

I'll add the comparison with the default anchors to the output of the script, thanks! :)

@mnslarcher

Quick update:

I did what @zylo117 suggested; now the output is something like this:

[06/13 12:57:38] Reading ./annotations/instances_train2017.json
[06/13 12:57:54] Starting the calculation of the optimal anchors ratios
[06/13 12:57:54] Extracting and preprocessing bounding boxes
[06/13 12:57:57] Discarding 2 bounding boxes with size lower or equal to 0
[06/13 12:57:57] K-Means (3 runs): 100%|██████████████████| 3/3 [00:33<00:00, 11.06s/it]
        Runs avg. IoU: 80.48% ± 0.00% (mean ± std. dev. of 3 runs, 0 skipped)
        Avg. IoU between bboxes and their most similar anchors after norm. them to make their area equal (only ratios matter): 80.48%
[06/13 12:58:33] Default anchors ratios: [(0.7, 1.4), (1.0, 1.0), (1.4, 0.7)]
        Avg. IoU between bboxes and their most similar default anchors, no norm. (both ratios and sizes matter): 55.16%
        Num. bboxes without similar default anchors (IoU < 0.5):  253049/860001 (29.42%)
[06/13 12:58:37] K-Means anchors ratios: [(0.6, 1.5), (1.0, 1.0), (1.4, 0.7)]
        Avg. IoU between bboxes and their most similar K-Means anchors, no norm. (both ratios and sizes matter): 55.72%
        Num. bboxes without similar K-Means anchors (IoU < 0.5):  240788/860001 (28.00%)
[06/13 12:58:37] K-Means anchors have an IoU < 50% with bboxes in 1.43% less cases than the default anchors, you should consider to use them

As you can see, I decided to report the num. of bboxes without similar anchors instead of the num. of bboxes with similar anchors, because it is easier to interpret when the anchors have high coverage.

@Cli98 commented Jun 13, 2020

@mnslarcher and @zylo117

I don't think this comparison should be suggested, as the result of k-means is not stable and highly depends on the distribution of the input data.

To prove this, run the experiment multiple times; the avg. IoU may not be the same, and thus it is highly possible that you see the difference in AP flip from -1 to 1. A totally invalid conclusion.

So I do not think there is any way to solve this, even with the proposed solution discussed above. Let me know your thoughts. If you ask me, I will take the one that differs from the default, even if the performance drops, given the difference is not too much, as the bbox distribution in a customized dataset must have some difference compared with the default.

@Cli98 commented Jun 13, 2020

It's still a trade-off here (number of anchors vs. AP) and thus not optimal. The better solution is to jump out of box.

@mnslarcher

> To prove this, run the experiment multiple times; the avg. IoU may not be the same, and thus it is highly possible that you see the difference in AP flip from -1 to 1. A totally invalid conclusion.

I don't agree. In my implementation I have a num_runs parameter for this; in most cases, running K-Means many times produces exactly the same result (at least on fairly large datasets). This is because we have just two dimensions and, for reasonable datasets, a lot of samples. In any case, it's not something one has to theorize about: it's enough to use a high value for num_runs and look at the avg. IoU std. dev.
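
A quick way to check this empirically (my own sketch, reusing the iou and kmeans functions posted above and bboxes from the benchmark):

avg_ious = []
for run in range(10):
    # a different seed per run gives a different Forgy initialization
    centroids = kmeans(bboxes, num_clusters=3, seed=run)
    # average IoU between each box and its most similar centroid
    avg_ious.append(iou(bboxes, centroids).max(axis=1).mean())

print(f"avg. IoU: {np.mean(avg_ious):.4f} ± {np.std(avg_ious):.4f}")

If the std. dev. is (essentially) zero, the runs converged to the same solution.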

> If you ask me, I will take the one that differs from the default, even if the performance drops, given the difference is not too much, as the bbox distribution in a customized dataset must have some difference compared with the default.

This doesn't make sense to me. It's true that in most cases the default anchors are not the best ones, but there is no logical reason to think that some other ratios are better just because they are different (especially if you have a metric that is telling you the opposite). Following this logic, using default anchors + random noise would be better than using default anchors.

There is a proper way to measure how good the anchors are: you can use a validation set (and in this case I don't think we ever over-fit the training set, so an evaluation on the training set is OK too).

> It's still a trade-off here (number of anchors vs. AP) and thus not optimal. The better solution is to jump out of box.

I don't understand what you mean by "jump out of box". In any case I agree, it's not optimal; it's just a heuristic that I use to balance the limitations of K-Means. This is an example where this heuristic works:

[3, 10] * 1000
[4, 10] * 1000
[5, 10] * 1000
[10, 3] * 100

In this case, with num_anchors_ratios=3, K-Means would find 3 anchors ratios with height > width, because the centroids will be driven by the shapes with height > width. This is not optimal if the objective is to maximize coverage. To avoid this, one can run K-Means with num_anchors_ratios=1 or 2, and the result will be 1 or 2 anchors that alone cover all the boxes except the 100 [10, 3]. At this point one can run K-Means again on the residuals (which are only the [10, 3] boxes) and the result will be [10, 3]; the set of anchors at this point has an IoU > 50% with all the boxes. This is obviously a toy example, just to clarify the idea (see the sketch below).
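
A minimal sketch of this two-pass heuristic on the toy data (my own illustration, reusing the iou and kmeans functions posted earlier in the thread; with Forgy initialization a degenerate draw can leave a cluster empty, so a real run should use more than one attempt):

import numpy as np

# toy data: tall shapes dominate, plus a small group of wide shapes
boxes = np.array(
    [[3, 10]] * 1000 + [[4, 10]] * 1000 + [[5, 10]] * 1000 + [[10, 3]] * 100,
    dtype=float,
)

# first pass: fewer clusters than the number of ratios we want in the end
centroids = kmeans(boxes, num_clusters=2, seed=0)

# residuals: boxes whose most similar centroid still has IoU < 0.5
residuals = boxes[iou(boxes, centroids).max(axis=1) < 0.5]

# second pass on the residuals alone (here, only the [10, 3] boxes survive)
if len(residuals) > 0:
    extra = kmeans(residuals, num_clusters=1, seed=0)
    centroids = np.vstack([centroids, extra])
print(centroids)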

@Cli98 commented Jun 14, 2020

@mnslarcher Are we on the same page? I'm just pointing out my thought that the difference added to decide which is more useful may not lead to a correct conclusion.

> I don't agree. In my implementation I have a num_runs parameter for this; in most cases, running K-Means many times produces exactly the same result (at least on fairly large datasets). [...]

Interesting to see that you get exactly the same result. If this holds, I admit that the decision holds. But k-means is not a stable algorithm, is it? Is the initialization stable? How could you produce the same result, and by doing what? I did not see any part where you actually use a seed; have you done it some other way?

> This doesn't make sense to me. It's true that in most cases the default anchors are not the best ones, but there is no logical reason to think that some other ratios are better just because they are different. [...]

Let me ask you a question: for a given customized dataset, suppose I run your algorithm and get the same result as the COCO default [(1, 1), (1.4, 0.7), (0.7, 1.4)], while I conduct EDA and find the average ratio < 1. If I then get the k-means result [(1, 1), (0.7, 1.4)] and find it scores lower than the default, which should I choose?

Noise can be present anywhere in this world. Running k-means cannot kick all of it out.

> There is a proper way to measure how good the anchors are: you can use a validation set. [...]

> I don't understand what you mean by "jump out of box". In any case I agree, it's not optimal; it's just a heuristic that I use to balance the limitations of K-Means. [...]

Validation is a way, though not effective to me.
What you state is a solution; however, memory may not be able to hold that many anchors.

@mnslarcher

Sorry @Cli98 , I'm not sure I'm following you 100%; in any case, just to clarify:

  1. There is the possibility to set a random seed in my K-Means implementation (you can look at the code a few messages above), but for obvious reasons I don't use it when I run K-Means multiple times (I run K-Means multiple times exactly to try different initializations and keep the best result among them).
  2. What I'm saying is that, even with different initializations, in the special case of using K-Means for this problem, where we have just 2 dimensions, usually many data points and well defined clusters, K-Means tends to always find the same solution. I'm not saying this is always true for K-Means; it depends on the type of problem (dataset) where you apply it. This obviously is not a rigorous demonstration; I'm reporting the empirical evidence that resulted from running K-Means many times with different initializations on several different datasets.
  3. I don't understand 100% the rest of your message, but strictly speaking I'm just saying that once you choose a metric that you want to maximize (like the number of boxes with IoU > 50%) and you find a solution that maximizes it, you should stick with it, independently of how you found it. So, if the default values have a higher score for your metric, and you chose your metric well, you should stick with the default values.

I'm not sure whether I answered you, given that I'm not sure I understood your point.

@Cli98 commented Jun 14, 2020

@mnslarcher So here is my point. I am not saying your k-means algorithm is incorrect; I'm just wondering whether using the difference to measure the result of k-means really works. Since the result is not stable, such a comparison may not really make sense. Just my thought.

So I guess your point is that such randomness may not change the conclusion, as it is to some degree stable. From my perspective this may not always hold; it actually depends on the difficulty of the dataset. And we agree with each other on this. That's enough, I guess.

> I'm reporting the empirical evidence that resulted from running K-Means many times with different initializations on several different datasets.

I'm glad to see any reports in the near future. And this follows your point 3. My point is: since you judge the result of k-means by AP, you may see the difference work. But how effective it is to judge by AP remains controversial to me. Let's see if your experiments can justify this.
