RTX 3090 Comparison
A server equipped with RTX 3090s arrived at our lab, so I trained yolov5 on it and compared it against the GTX 1080 Ti. Experimental setup: batch 64, 1 epoch, Data Parallel across 4 GPUs.
RTX 3090 server (RTX 3090 × 4)
Evaluating pycocotools mAP... saving runs/train/exp15/_predictions.json...
loading annotations into memory...
Done (t=0.38s)
creating index...
index created!
Loading and preparing results...
DONE (t=8.12s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=67.78s).
Accumulating evaluation results...
DONE (t=17.63s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.003
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.007
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.001
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.001
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.004
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.003
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.018
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.041
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.046
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.007
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.040
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.075

real    16m15.569s
user    68m18.520s
sys     5m14.403s
1080 Ti server (1080 Ti × 4)
Evaluating pycocotools mAP... saving runs/train/exp11/_predictions.json...
loading annotations into memory...
Done (t=0.64s)
creating index...
index created!
Loading and preparing results...
DONE (t=9.76s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=108.04s).
Accumulating evaluation results...
DONE (t=23.68s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.002
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.006
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.001
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.001
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.003
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.003
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.020
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.043
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.048
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.006
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.038
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.081

real    24m17.456s
user    105m34.200s
sys     10m56.112s
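The RTX 3090 server is considerably faster: the whole run took 16m16s of wall-clock time versus 24m17s on the 1080 Ti server, roughly 1.5× faster.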
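To make the comparison concrete, here is a minimal sketch that converts the `real` values from the two logs above into seconds and computes the ratio. The helper name `parse_time` is my own, and it assumes the `XmY.YYYs` format that the `time` output shows. Note that the `real` time covers the entire run, including the pycocotools evaluation, not just the training loop.

```python
import re


def parse_time(value: str) -> float:
    """Convert a `time` value such as '16m15.569s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognized time value: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# "real" (wall-clock) values copied from the two batch-64 logs.
rtx3090_real = parse_time("16m15.569s")
gtx1080ti_real = parse_time("24m17.456s")

# How many times faster the RTX 3090 server finished the same job.
speedup = gtx1080ti_real / rtx3090_real

print(f"RTX 3090 x4: {rtx3090_real:.1f} s")
print(f"1080 Ti x4:  {gtx1080ti_real:.1f} s")
print(f"speedup:     {speedup:.2f}x")
```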
For reference, I also ran the RTX 3090 server with batch 320.
Evaluating pycocotools mAP... saving runs/train/exp12/_predictions.json...
loading annotations into memory...
Done (t=1.05s)
creating index...
index created!
Loading and preparing results...
DONE (t=8.51s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=73.40s).
Accumulating evaluation results...
DONE (t=20.28s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.002
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.005
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.001
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.001
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.002
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.002
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.015
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.031
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.037
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.005
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.033
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.061

real    11m47.344s
user    66m23.625s
sys     6m45.271s
That is the brute force of 24 GB of memory.
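For a rough sense of what batch 320 means per card, the sketch below assumes the batch is split evenly across the four GPUs (an assumption on my part; the logs do not show how the batch was actually distributed). It also compares the two RTX 3090 wall-clock times. The function name `per_gpu_batch` is hypothetical, used only for illustration.

```python
NUM_GPUS = 4  # from the experimental setup


def per_gpu_batch(total_batch: int, num_gpus: int = NUM_GPUS) -> int:
    """Images per GPU per step, assuming an even split (an assumption)."""
    if total_batch % num_gpus:
        raise ValueError("batch does not divide evenly across GPUs")
    return total_batch // num_gpus


# Wall-clock ("real") times in seconds, converted by hand from the logs:
# 16m15.569s at batch 64 and 11m47.344s at batch 320, both on RTX 3090 x4.
real_seconds = {64: 16 * 60 + 15.569, 320: 11 * 60 + 47.344}

for batch, seconds in real_seconds.items():
    print(f"batch {batch}: {per_gpu_batch(batch)} images/GPU, {seconds:.1f} s")

# How much faster the batch-320 run finished than the batch-64 run.
ratio = real_seconds[64] / real_seconds[320]
print(f"batch 320 vs batch 64: {ratio:.2f}x")
```

This is only arithmetic on the reported numbers; it says nothing about actual memory usage on each card.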
P.S. Spending time in the server room dries out my eyes.