
Why I can't get good result in my own datasets? #864

Closed
BBuf opened this issue Feb 22, 2020 · 13 comments
Labels
bug Something isn't working

Comments

@BBuf

BBuf commented Feb 22, 2020

@glenn-jocher Hello, I recently trained two of my private detection datasets with both AlexeyAB DarkNet and your project; one dataset has two categories and the other has one, and my target sizes are all regular. With AlexeyAB DarkNet I achieve mAP values of 99.5%+ and 96%+ respectively, but with your project I can only reach 80%+ and 70%+. I also use GIoU loss in the AlexeyAB code, and I use the default parameters in your project. In addition, I use the YOLOv3-tiny network. I wonder if there might be some problem with this code?

@BBuf BBuf added the bug Something isn't working label Feb 22, 2020
@BBuf BBuf changed the title from "Why I can't get good result In my own two datasets?" to "Why I can't get good result in my own datasets?" Feb 22, 2020
@glenn-jocher
Member

@BBuf this repo trains yolov3-spp.cfg on COCO to the highest mAP of any reported results we know of. See https://github.com/ultralytics/yolov3#map

You may want to tune your hyperparameters (see #392) or switch from tiny to yolov3-spp.cfg. Other than that general guidance, we don't offer free support or feedback on training custom datasets. I'll leave the issue open for community feedback.

@glenn-jocher
Member

glenn-jocher commented Feb 22, 2020

@BBuf two other thoughts: you may want to try better tiny derivatives like https://github.com/ultralytics/yolov3/blob/master/cfg/yolov3-tiny3.cfg, and you should also test your darknet-trained models here to get an apples-to-apples mAP comparison:

python3 test.py --data ... --weights ... --cfg ...

And lastly, you need to look at your results.png for training feedback.

@BBuf
Author

BBuf commented Feb 23, 2020

OK, I have now set the batch_size of both projects to be the same and retrained YOLOv3-tiny. The results are as follows:

In AlexeyAB DarkNet:

Train loss and mAP:

[image]

Testing the best weights model with your code:

[image]

In your project:

results.png:

[image]

Testing the best .pt model with your code:

[image]

Apparently, the mAP and F1 scores of the AlexeyAB-trained model are higher than those from your project. I want to know why. In addition, when I tested, --conf-thres was set to 0.1.

@glenn-jocher
Member

@BBuf ah ok, this is a lot more info now. Yes, the darknet training is working better for you. The biggest problem I see with the ultralytics results is that the classification loss is about 10X larger than the GIoU and objectness losses. The three should all be roughly in balance, so you probably want to reduce your hyp['cls'] by up to 5X or 10X to bring it in line:

'cls': 37.4, # cls loss gain

This is probably because the hyperparameters are tuned for an 80-class dataset, while you have a 2-class dataset. This is very interesting; maybe there's a way to automate this balancing in the future, perhaps using the mean losses after the first few epochs to adjust the hyps.
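A minimal sketch of that idea, purely illustrative and not code from this repo (the helper name and the mean-loss inputs are hypothetical): rescale the obj and cls gains so the observed mean loss components line up with the GIoU component.

# Hypothetical helper: rebalance loss gains from observed mean losses.
# mean_losses = running means of the (GIoU, obj, cls) loss components
# over the first few epochs; hyp is the hyperparameter dict in train.py.
def rebalance_gains(hyp, mean_losses):
    giou_mean, obj_mean, cls_mean = mean_losses
    target = giou_mean                          # use GIoU as the reference scale
    hyp['obj'] *= target / max(obj_mean, 1e-9)  # guard against division by zero
    hyp['cls'] *= target / max(cls_mean, 1e-9)
    return hyp

# Example: cls loss running ~10x hotter than GIoU -> cut hyp['cls'] by ~10x
hyp = {'giou': 3.54, 'obj': 64.3, 'cls': 37.4}  # 'cls' as quoted above; other values illustrative
hyp = rebalance_gains(hyp, mean_losses=(1.0, 1.1, 10.0))
print(hyp['cls'])  # ~3.74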

I would also cut the training time here down by half, as it looks like after 100 epochs you've already reached a steady state solution. Can you retrain with those two changes and see if it helps?

@glenn-jocher
Member

glenn-jocher commented Feb 23, 2020

@BBuf about the batch size, I recommend --batch-size 64 --accum 1 if possible, or if you run out of CUDA memory you can reduce the batch size, i.e.

python3 train.py --batch-size 64 --accum 1
python3 train.py --batch-size 32 --accum 2
python3 train.py --batch-size 16 --accum 4

are all roughly equivalent, but use less GPU memory as you go down the list. --accum is the number of batches whose gradients are accumulated before each optimizer update.
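For reference, here is a minimal sketch of what gradient accumulation does inside a training loop (a toy model and toy data for illustration, not this repo's actual train.py):

import torch
import torch.nn as nn

model = nn.Linear(10, 1)                  # toy model for illustration
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()
accum = 4                                 # e.g. --batch-size 16 --accum 4 ~ effective batch 64

optimizer.zero_grad()
for i in range(32):                       # pretend these come from a DataLoader
    x, y = torch.randn(16, 10), torch.randn(16, 1)
    loss = loss_fn(model(x), y) / accum   # scale so accumulated gradients average correctly
    loss.backward()                       # gradients add up across batches
    if (i + 1) % accum == 0:              # optimizer update every `accum` batches
        optimizer.step()
        optimizer.zero_grad()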

@BBuf
Author

BBuf commented Feb 24, 2020

OK, I will try it, thank you.

@BBuf
Author

BBuf commented Feb 24, 2020

I changed 'cls': 37.4 to 'cls': 5.0 and got the following results:

[image]

[image]

It looks like this hasn't improved significantly.

@glenn-jocher
Member

@BBuf yes, the losses are much better balanced now. You should reduce cls by half again to about 2.5, and depending on your dataset, you may also want to train with --multi, which turns on multi-scale training. This is how we train COCO, and I think darknet has it on by default as well.

The other thing is that you are training too long. As you can see, all of your results have already converged by epoch 100, so you should use --epochs 100:

python3 train.py --epochs 100 --multi ...
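For context, --multi (multi-scale training) simply resizes each batch to a randomly chosen image size on the 32-pixel grid stride; a rough sketch of the idea, not this repo's exact implementation:

import random
import torch
import torch.nn.functional as F

grid = 32                                      # YOLO grid stride
img_size = 416                                 # nominal training size
imgs = torch.randn(16, 3, img_size, img_size)  # dummy batch for illustration

# Pick a new size roughly in the +/-50% range, rounded to the grid stride
lo, hi = img_size // 2 // grid, img_size * 3 // 2 // grid
new_size = random.randint(lo, hi) * grid       # e.g. 192..608 in steps of 32
if new_size != img_size:
    imgs = F.interpolate(imgs, size=new_size, mode='bilinear', align_corners=False)
print(imgs.shape)                              # e.g. torch.Size([16, 3, 320, 320])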

One last thing you could try is a cosine LR scheduler, which shows improvement in COCO training. See #238 (comment). You do this by commenting out the current scheduler and uncommenting L139-140:

yolov3/train.py, lines 139 to 142 at a3671bd:

# lf = lambda x: 0.5 * (1 + math.cos(x * math.pi / epochs)) # cosine https://arxiv.org/pdf/1812.01187.pdf
# scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lf)
# scheduler = lr_scheduler.MultiStepLR(optimizer, milestones=range(59, 70, 1), gamma=0.8) # gradual fall to 0.1*lr0
scheduler = lr_scheduler.MultiStepLR(optimizer, milestones=[round(epochs * x) for x in [0.8, 0.9]], gamma=0.1)
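For illustration, here is a self-contained sketch of the cosine schedule those commented lines define, using a dummy SGD optimizer rather than this repo's training loop:

import math
import torch
from torch.optim import lr_scheduler

epochs, lr0 = 100, 0.01
optimizer = torch.optim.SGD([torch.zeros(1, requires_grad=True)], lr=lr0)

# Same lambda as above: LR falls from lr0 toward 0 over `epochs` on a half-cosine
lf = lambda x: 0.5 * (1 + math.cos(x * math.pi / epochs))
scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lf)

for epoch in range(epochs):
    # ... train for one epoch here ...
    optimizer.step()                       # placeholder step for the dummy optimizer
    scheduler.step()
    if epoch % 20 == 0:
        print(epoch, round(optimizer.param_groups[0]['lr'], 5))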

@BBuf
Author

BBuf commented Feb 24, 2020

I used all the improvements you mentioned above and got the following results:

[image]

[image]

@glenn-jocher
Member

@BBuf then you are all up to date with this repo. Class 2 exceeds the darknet mAP, but class 1 does not, and neither does the overall mAP. I'd use the darknet-trained results for your custom dataset.

@BBuf
Author

BBuf commented Feb 25, 2020

OK, thank you for your patience. I will use the AlexeyAB DarkNet results as the results for my custom dataset.

@BBuf BBuf closed this as completed Feb 25, 2020
@nanhui69

nanhui69 commented Sep 7, 2020

(Quoting @BBuf's original comment above.)

Which repo did you use: the yolov3 darknet repo or the yolov4 darknet repo?

@glenn-jocher
Member

@nanhui69 hi there! It seems you are comparing the performance of YOLOv3 trained on your private datasets with Ultralytics YOLOv3 versus AlexeyAB's DarkNet. I maintain the YOLOv3 repo at Ultralytics, and we appreciate the comparison. It's great to hear that you achieved excellent mAP values with AlexeyAB DarkNet.

It's important to note that each repository may have different default configurations, including the hyperparameters, architecture, and training settings, which can affect the training results. We continually strive to provide optimal default settings for a wide range of use cases, but there may be specific adjustments needed for individual scenarios.

If you'd like, we can investigate your specific case further to help optimize the training process. Additionally, utilizing the latest YOLOv4 repository may also provide improved results, as it incorporates various advancements over YOLOv3.

Let me know if you need any assistance, and we're here to help!
