
What's the purpose of pre-trained weights in YOLO? #232

Open
anjanaouseph opened this issue Jun 12, 2021 · 17 comments
Labels
question A general question

Comments

@anjanaouseph

anjanaouseph commented Jun 12, 2021

The instructions say "Before getting started download the pre-trained YOLOv3 weights and convert them to the keras format". I want to understand why we need to use pre-trained weights in YOLO.

@AntonMu
Owner

AntonMu commented Jun 12, 2021

Hi @Voldy1998

At a high level, the YOLO network is a deep neural net with millions of parameters, and it takes millions of labeled images to tune all of those parameters.

Because most people don't have the resources to label that many images, we use a technique called transfer learning: we reuse the knowledge the network has already acquired by training on millions of similar images. That is why we need the pre-trained weights, and it is how we can get good results with only a few hundred images.

Hope that helps. A search will also turn up a lot of resources about YOLO and transfer learning in general.
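To make the idea concrete, here is a toy sketch of transfer learning using NumPy (not the actual YOLO code): the "pre-trained" backbone weights stay frozen, and only a small new head is fitted to the new labels. All names, shapes, and data below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pre-trained" backbone weights: frozen, never updated during fine-tuning.
backbone_w = rng.normal(size=(8, 4))
# New task-specific head: trained from scratch.
head_w = np.zeros(4)

# A small labeled dataset, standing in for "a few hundred images".
x = rng.normal(size=(100, 8))
y = (x.sum(axis=1) > 0).astype(float)

# Features come from the frozen backbone.
features = np.tanh(x @ backbone_w)

# One gradient-descent step on the head only (logistic regression).
pred = 1.0 / (1.0 + np.exp(-(features @ head_w)))
grad = features.T @ (pred - y) / len(y)
head_w -= 0.5 * grad
```

Only `head_w` changes during training; `backbone_w` carries over the knowledge from pre-training, which is why far fewer labeled examples are needed.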

@anjanaouseph
Author

Hi @AntonMu, thanks for the quick reply!

@anjanaouseph
Author

I plotted the precision vs. recall curve for the class 'car' and computed the area under the curve (Average Precision), as shown below.
[precision-recall curve for class 'car']

I got an Average Precision of 88.25%. Do you see anything wrong with the graph?

@AntonMu
Owner

AntonMu commented Jun 13, 2021

You probably need to be a bit more specific. What is your IoU constraint here? A common metric in object detection is mAP.

@anjanaouseph
Author

@AntonMu iou is 0.5 here.

@AntonMu
Owner

AntonMu commented Jun 13, 2021

I see. I think it looks fine, though it is a little unusual to use this type of curve. How do you deal with multiple cars in one picture? In your graph, do you vary the confidence threshold for deciding car/no car?

@anjanaouseph
Author

The trained model was tested on a dataset of 120 images of cars containing 272 instances of the class 'car'. Of these, 220 detections were counted as True Positives, 26 as False Positives, and 26 as False Negatives.

For each class (here there is just one class, 'car'), the detections were sorted by decreasing confidence score and assigned to ground-truth objects. Each detection was judged a true or false positive by measuring bounding-box overlap: to count as a correct detection, the IoU between the predicted and the ground-truth bounding box must exceed 50%. Detections were assigned to ground-truth annotations satisfying the overlap criterion in order of decreasing confidence. A detection matched to a ground-truth object is a True Positive (TP); a detection with no match is a False Positive (FP); a ground-truth object with no matching detection is a False Negative (FN). When the same object is detected multiple times, only one detection counts as correct and the repeated ones are counted as false positives.
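The matching rule described above can be sketched in plain Python. Boxes are assumed to be (xmin, ymin, xmax, ymax) tuples; this is an illustration of the procedure, not the actual code from Cartucho/mAP:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (xmin, ymin, xmax, ymax) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_detections(detections, ground_truths, iou_thresh=0.5):
    """detections: list of (confidence, box); ground_truths: list of boxes.
    Returns (confidence, is_tp) pairs, matching greedily in order of
    decreasing confidence; repeated detections of a matched object are FPs."""
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    matched = [False] * len(ground_truths)
    results = []
    for conf, box in detections:
        best, best_i = 0.0, -1
        for i, gt in enumerate(ground_truths):
            overlap = iou(box, gt)
            if overlap > best:
                best, best_i = overlap, i
        if best >= iou_thresh and not matched[best_i]:
            matched[best_i] = True
            results.append((conf, True))   # true positive
        else:
            results.append((conf, False))  # false positive (miss or duplicate)
    return results
```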

Using the above criteria, the precision-recall curve was plotted. A version with monotonically decreasing precision was then derived by setting the precision at recall r to the maximum precision obtained at any recall r' >= r. Finally, the Average Precision (AP) was computed as the area under this precision-recall curve (shown in light blue in the figure above).
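Given the per-detection TP/FP labels from the matching step, the interpolated AP can be computed as sketched below (the function name is my own; this mirrors the description above, not any particular library):

```python
def average_precision(results, num_ground_truths):
    """results: (confidence, is_tp) pairs over the whole test set.
    Computes AP as the area under the interpolated precision-recall curve."""
    results = sorted(results, key=lambda r: r[0], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for _, is_tp in results:
        tp += is_tp
        fp += not is_tp
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truths)
    # Make precision monotonically decreasing:
    # p(r) = max precision obtained at any recall r' >= r.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangular areas under the stepped curve.
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_r)
        prev_r = r
    return ap
```

For example, with detections labeled [(0.9, TP), (0.8, FP), (0.7, TP)] and 2 ground-truth objects, the interpolated curve gives AP = 1.0 * 0.5 + (2/3) * 0.5 = 5/6.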

@AntonMu
Owner

AntonMu commented Jun 13, 2021

Ok. That sounds reasonable for your use case. You are basically treating an object detection problem as a classification problem: your metric captures one aspect of the model (the classification) but ignores how accurately the objects are localized. Hope that makes sense.

@anjanaouseph
Author

Hi @AntonMu, I actually followed your steps and trained the detector to detect cars and draw bounding boxes around them in an image, which is classification + localization. The detection results for each car in an image (the bounding-box coordinates and confidence scores) were saved to a text file, and the ground-truth coordinates of each car were saved to another text file.
To evaluate the performance of YOLO, I then computed the Average Precision / Mean Average Precision by following https://github.com/Cartucho/mAP, which you suggested in #63.

@AntonMu
Owner

AntonMu commented Jun 13, 2021

Ah Ok - cool. Yes, you are right. I was a bit thrown off by the single class, but it makes sense: if you only have a single class, there is no mean to calculate. If you feel comfortable, feel free to open a pull request to add that feature to the repo. Thank you so much!

@anjanaouseph
Author

Yes, @AntonMu, I hope the graph is not incorrect or anything. How can I add this as a feature? I just went to that repo, followed the guidelines, and ran the code.

@AntonMu
Owner

AntonMu commented Jun 14, 2021

Cool - yes. I think there are several options. One would be to add a description of the steps you took, maybe under the 3_Inference section.

Another option would be to add some code that handles the computation: describe what one needs to do to calculate mAP and provide a script that does it.

@anjanaouseph
Author

Yes @AntonMu, sure, I will add it. The issue is that I am not 100% sure whether what I did is correct.

@AntonMu
Owner

AntonMu commented Jun 14, 2021

I see - the best thing is to start a PR and then I can check. If you followed the tutorial, it should be fine. To add it here, I would like it to also work for multiple classes.

@anjanaouseph
Author

Okay @AntonMu, thanks! Will do.

@Pei648783116

Hi @anjanaouseph, do you mind uploading your code for converting the csv to the file format required for the mAP calculation?

@anjanaouseph
Author

Hi @Pei648783116

# convert .csv as per https://github.com/Cartucho/mAP

from csv import DictReader

INPUT_FILE = 'Detection_Results.csv'

with open(INPUT_FILE, 'rt') as csvfile:
    reader = DictReader(csvfile)
    for row in reader:
        # one detection-results .txt file per image, named after the image
        file_name = "{}.txt".format(row["image"])
        # only write rows where every field is filled in
        if row["label"] and row["confidence"] and row["xmin"] and row["ymin"] and row["xmax"] and row["ymax"]:
            line = row["label"]+" "+row["confidence"]+" "+row["xmin"]+" "+row["ymin"]+" "+row["xmax"]+" "+row["ymax"]
        else:
            print("Incomplete row for image {}. Skipping.".format(row["image"]))
            continue
        with open(file_name, 'a') as output:
            output.write(line+"\n")

The above is the code that I used.

@AntonMu AntonMu added the question A general question label Apr 25, 2022