Research Background
As required by our instructor, we adopt the Faster R-CNN algorithm, train models on the VOC2007 dataset and the competition dataset, and then test and verify on images.
Paper Notes
Overall architecture
Faster R-CNN principles and explanations of the related concepts
Learning References
tf-faster-rcnn configuration and using your own data
Differences between CPU and GPU, how they work, and installing tensorflow-gpu
Installing TensorFlow-GPU on Windows 10
Notes on running Faster-RCNN-TF on GPU (preparing your own dataset)
Environment Setup
GitHub code
Configuration reference
Ubuntu 16.04 LTS
anaconda3
tensorflow1.2.1
python3.6.6
PyCharm Community Edition 2016.3
The CPU environment from conda list is as follows:
henry@henry-Rev-1-0:~$ source activate tensorflow
(tensorflow) henry@henry-Rev-1-0:~$ conda list
# packages in environment at /home/henry/anaconda3/envs/tensorflow:
#
# Name Version Build Channel
_tflow_180_select 3.0 eigen defaults
absl-py 0.2.2 py36_0 defaults
astor 0.6.2 py36_0 defaults
backports.weakref 1.0rc1 <pip>
blas 1.0 mkl defaults
bleach 1.5.0 py36_0 defaults
bzip2 1.0.6 h14c3975_5 defaults
ca-certificates 2018.03.07 0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
cairo 1.14.12 h7636065_2 defaults
certifi 2018.4.16 py36_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
cffi 1.11.5 py36h9745a5d_0 defaults
cudatoolkit 9.0 h13b8566_0 defaults
cudnn 7.1.2 cuda9.0_0 defaults
cycler 0.10.0 py36h93f1223_0 defaults
Cython 0.28.4 <pip>
dbus 1.13.2 h714fa37_1 defaults
easydict 1.6 <pip>
expat 2.2.5 he0dffb1_0 defaults
ffmpeg 4.0 h04d0a96_0 defaults
fontconfig 2.12.6 h49f89f6_0 defaults
freetype 2.8 hab7d2ae_1 defaults
gast 0.2.0 py36_0 defaults
glib 2.56.1 h000015b_0 defaults
graphite2 1.3.11 h16798f4_2 defaults
grpcio 1.12.1 py36hdbcaa40_0 defaults
gst-plugins-base 1.14.0 hbbd80ab_1 defaults
gstreamer 1.14.0 hb453b48_1 defaults
h5py 2.8.0 py36ha1f6525_0 defaults
harfbuzz 1.7.6 h5f0a787_1 defaults
hdf5 1.10.2 hba1933b_1 defaults
html5lib 0.9999999 py36_0 defaults
icu 58.2 h9c2bf20_1 defaults
intel-openmp 2018.0.3 0 defaults
jasper 1.900.1 hd497a04_4 defaults
jpeg 9b h024ee3a_2 defaults
keras 2.2.0 0 defaults
keras-applications 1.0.2 py36_0 defaults
keras-base 2.2.0 py36_0 defaults
keras-preprocessing 1.0.1 py36_0 defaults
kiwisolver 1.0.1 py36h764f252_0 defaults
libedit 3.1.20170329 h6b74fdf_2 defaults
libffi 3.2.1 hd88cf55_4 defaults
libgcc-ng 7.2.0 hdf63c60_3 defaults
libgfortran-ng 7.2.0 hdf63c60_3 defaults
libopencv 3.4.1 h1a3b859_1 defaults
libopus 1.2.1 hb9ed12e_0 defaults
libpng 1.6.34 hb9fc6fc_0 defaults
libprotobuf 3.5.2 h6f1eeef_0 defaults
libstdcxx-ng 7.2.0 hdf63c60_3 defaults
libtiff 4.0.9 he85c1e1_1 defaults
libvpx 1.7.0 h439df22_0 defaults
libxcb 1.13 h1bed415_1 defaults
libxml2 2.9.8 h26e45fe_1 defaults
libxslt 1.1.32 h1312cb7_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
lxml 4.2.2 py36hf71bdeb_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
markdown 2.6.11 py36_0 defaults
matplotlib 2.2.2 py36h0e671d2_1 defaults
mkl 2018.0.3 1 defaults
mkl_fft 1.0.1 py36h3010b51_0 defaults
mkl_random 1.0.1 py36h629b387_0 defaults
nccl 1.3.5 cuda9.0_0 defaults
ncurses 6.1 hf484d3e_0 defaults
ninja 1.8.2 py36h6bb024c_1 defaults
numpy 1.14.5 <pip>
numpy 1.14.5 py36hcd700cb_3 defaults
numpy-base 1.14.5 py36hdbf6ddf_3 defaults
opencv 3.4.1 py36h6fd60c2_2 defaults
opencv-python 3.4.1.15 <pip>
openssl 1.0.2o h20670df_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
pcre 8.42 h439df22_0 defaults
Pillow 5.2.0 <pip>
pip 10.0.1 py36_0 defaults
pixman 0.34.0 hceecf20_3 defaults
protobuf 3.5.2 py36hf484d3e_0 defaults
py-opencv 3.4.1 py36h0676e08_1 defaults
pycparser 2.18 py36hf9f622e_1 defaults
pyparsing 2.2.0 py36hee85983_1 defaults
pyqt 5.9.2 py36h751905a_0 defaults
python 3.6.6 hc3d631a_0 defaults
python-dateutil 2.7.3 py36_0 defaults
pytorch 0.4.0 py36hdf912b8_0 defaults
pytz 2018.5 py36_0 defaults
pyyaml 3.12 py36hafb9ca4_1 defaults
qt 5.9.5 h7e424d6_0 defaults
readline 7.0 ha6073c6_4 defaults
scipy 1.1.0 py36hfc37229_0 defaults
setuptools 39.2.0 py36_0 defaults
sip 4.19.8 py36hf484d3e_0 defaults
six 1.11.0 py36h372c433_1 defaults
sqlite 3.24.0 h84994c4_0 defaults
tensorboard 1.8.0 py36hf484d3e_0 defaults
tensorflow 1.2.1 <pip>
tensorflow 1.8.0 h57681fa_0 defaults
tensorflow-base 1.8.0 py36h5f64886_0 defaults
termcolor 1.1.0 py36_1 defaults
tk 8.6.7 hc745277_3 defaults
tornado 5.0.2 py36_0 defaults
werkzeug 0.14.1 py36_0 defaults
wheel 0.31.1 py36_0 defaults
xz 5.2.4 h14c3975_4 defaults
yaml 0.1.7 had09818_2 defaults
zlib 1.2.11 ha838bed_2 defaults
The GPU environment from conda list is as follows:
(py36) ouc@ouc-yzb:~/LiuHongzhi/tf-faster-rcnn$ conda list
# packages in environment at /home/ouc/anaconda3/envs/py36:
#
# Name Version Build Channel
_tflow_180_select 3.0 eigen
absl-py 0.2.2 py36_0
astor 0.6.2 py36_1
backports 1.0 py36_1
backports.weakref 1.0rc1 py36_0
binutils_impl_linux-64 2.28.1 had2808c_3
binutils_linux-64 7.2.0 had2808c_27
blas 1.0 mkl
bleach 1.5.0 py36_0
ca-certificates 2018.03.07 0
certifi 2018.4.16 py36_0
cudatoolkit 8.0 3
cudnn 6.0.21 cuda8.0_0
cycler 0.10.0 py36_0
cython 0.28.3 py36h14c3975_0
dbus 1.13.2 h714fa37_1
easydict 1.6 <pip>
enum34 1.1.6 <pip>
expat 2.2.5 he0dffb1_0
fontconfig 2.13.0 h9420a91_0
freetype 2.9.1 h8a8886c_0
gast 0.2.0 py36_0
gcc_impl_linux-64 7.2.0 habb00fd_3
gcc_linux-64 7.2.0 h550dcbe_27
glib 2.56.1 h000015b_0
grpcio 1.12.1 py36hdbcaa40_0
gst-plugins-base 1.14.0 hbbd80ab_1
gstreamer 1.14.0 hb453b48_1
gxx_impl_linux-64 7.2.0 hdf63c60_3
gxx_linux-64 7.2.0 h550dcbe_27
h5py 2.8.0 py36h8d01980_0
hdf5 1.10.2 hba1933b_1
html5lib 0.9999999 py36_0
icu 58.2 h9c2bf20_1
intel-openmp 2018.0.3 0
jpeg 9b h024ee3a_2
Keras 2.1.2 <pip>
keras-applications 1.0.2 py36_0
keras-base 2.2.0 py36_0
keras-preprocessing 1.0.1 py36_0
kiwisolver 1.0.1 py36hf484d3e_0
libedit 3.1.20170329 h6b74fdf_2
libffi 3.2.1 hd88cf55_4
libgcc 7.2.0 h69d50b8_2
libgcc-ng 7.2.0 hdf63c60_3
libgfortran-ng 7.2.0 hdf63c60_3
libgpuarray 0.7.6 h14c3975_0
libpng 1.6.34 hb9fc6fc_0
libprotobuf 3.5.2 h6f1eeef_0
libstdcxx-ng 7.2.0 hdf63c60_3
libtiff 4.0.9 he85c1e1_1
libuuid 1.0.3 h1bed415_2
libxcb 1.13 h1bed415_1
libxml2 2.9.8 h26e45fe_1
mako 1.0.7 py36_0
markdown 2.6.11 py36_0
markupsafe 1.0 py36h14c3975_1
matplotlib 2.2.2 py36hb69df0a_2
mkl 2018.0.3 1
mkl-service 1.1.2 py36h651fb7a_4
mkl_fft 1.0.2 py36h651fb7a_0
mkl_random 1.0.1 py36h4414c95_1
ncurses 6.1 hf484d3e_0
numpy 1.14.5 py36h1b885b7_4
numpy-base 1.14.5 py36hdbf6ddf_4
olefile 0.45.1 py36_0
opencv3 3.1.0 py36_0 menpo
openssl 1.0.2o h20670df_0
pcre 8.42 h439df22_0
pillow 5.1.0 py36heded4f4_0
pip 10.0.1 py36_0
pip 18.0 <pip>
protobuf 3.5.2 py36hf484d3e_1
pygpu 0.7.6 py36h035aef0_0
pyparsing 2.2.0 py36_1
pyqt 5.9.2 py36h22d08a2_0
python 3.6.6 hc3d631a_0
python-dateutil 2.7.3 py36_0
pytz 2018.5 py36_0
pyyaml 3.12 py36h14c3975_1
qt 5.9.6 h52aff34_0
readline 7.0 ha6073c6_4
scipy 1.1.0 py36hc49cb51_0
setuptools 39.2.0 py36_0
setuptools 39.1.0 <pip>
sip 4.19.8 py36hf484d3e_0
six 1.11.0 py36_1
sqlite 3.24.0 h84994c4_0
tensorflow-gpu 1.4.0 <pip>
tensorflow-tensorboard 0.4.0 <pip>
termcolor 1.1.0 py36_1
theano 1.0.2 py36h6bb024c_0
tk 8.6.7 hc745277_3
tornado 5.0.2 py36h14c3975_0
werkzeug 0.14.1 py36_0
wheel 0.31.1 py36_0
xz 5.2.4 h14c3975_4
yaml 0.1.7 had09818_2
zlib 1.2.11 ha838bed_2
Install CUDA 8.0 in the anaconda virtual environment:
conda install cudatoolkit=8.0 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/linux-64/
Install cuDNN in the anaconda virtual environment:
conda install cudnn=7.0.5 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/linux-64/
Reference: creating a conda virtual environment on Ubuntu and installing cuda, cudnn, pytorch
1. Anaconda
Official download page
Environment migration
Anaconda beginner's guide
Recommended version: Anaconda 5.2 for Linux Installer
Python 3.6 version
- Move the downloaded .sh installer script to the target directory and run it from there:
bash ./Anaconda3-5.2.0-Linux-x86_64.sh
When the installer asks whether to add anaconda's bin directory to your PATH, answer yes. Installation is then complete.
- Create an environment with the following command; "tensorflow" is the environment name and can be anything you like:
conda create -n tensorflow python=3.6
- Activate the conda environment ("tensorflow" is the environment name):
source activate tensorflow
- To check the TensorFlow version inside the tensorflow environment:
python
import tensorflow as tf
tf.__version__
- List the installed packages in the tensorflow environment:
conda list
- Install a package (e.g. matplotlib) in the tensorflow environment:
conda install matplotlib
- Update a package (e.g. matplotlib) in the tensorflow environment:
conda update matplotlib
- Remove a package (e.g. matplotlib) from the tensorflow environment:
conda remove matplotlib
- Install CUDA and cuDNN inside conda:
conda install cudatoolkit=8.0 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/linux-64/
conda install cudnn=7.0.5 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/linux-64/
Reference: creating a conda virtual environment on Ubuntu and installing cuda, cudnn, pytorch
2. TensorFlow
- Anaconda mirror usage: TUNA also provides mirrors of the Anaconda repositories; run the following commands:
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --set show_channel_urls yes
- TensorFlow mirror usage
TensorFlow mirror - CUDA 8.0 download page
CUDA 8.0
Running the Demo
- Install the specific TensorFlow version; the code supports 1.2:
pip install -I tensorflow==1.2.1
- Download the tf-faster-rcnn code
Setting up Git and GitHub
Using GitHub on Ubuntu
Running the Demo on CPU
- Modify tf-faster-rcnn/lib/model/nms_wrapper.py:
from model.config import cfg
# from nms.gpu_nms import gpu_nms
from nms.cpu_nms import cpu_nms

def nms(dets, thresh, force_cpu=False):
    """Dispatch to either CPU or GPU NMS implementations."""
    if dets.shape[0] == 0:
        return []
    return cpu_nms(dets, thresh)
    # if cfg.USE_GPU_NMS and not force_cpu:
    #     return gpu_nms(dets, thresh, device_id=0)
    # else:
    #     return cpu_nms(dets, thresh)
- Set USE_GPU_NMS to False in tf-faster-rcnn/lib/model/config.py:
__C.USE_GPU_NMS = False
- Comment out the GPU build code in tf-faster-rcnn/lib/setup.py:
CUDA = locate_cuda()
...
self.src_extensions.append('.cu')
...
Extension('nms.gpu_nms',
          ['nms/nms_kernel.cu', 'nms/gpu_nms.pyx'],
          library_dirs=[CUDA['lib64']],
          libraries=['cudart'],
          language='c++',
          runtime_library_dirs=[CUDA['lib64']],
          # this syntax is specific to this build system
          # we're only going to use certain compiler args with nvcc and not with gcc
          # the implementation of this trick is in customize_compiler() below
          extra_compile_args={'gcc': ["-Wno-unused-function"],
                              'nvcc': ['-arch=sm_52',
                                       '--ptxas-options=-v',
                                       '-c',
                                       '--compiler-options',
                                       "'-fPIC'"]},
          include_dirs=[numpy_include, CUDA['include']])
- Build the Cython modules under tf-faster-rcnn/lib; if the demo later fails, rebuild from here:
cd tf-faster-rcnn/lib
make clean
make
cd ..
Modify the parameter settings in the tf-faster-rcnn/lib/setup.py code:
extra_compile_args={'gcc': ["-Wno-unused-function"],
                    'nvcc': ['-arch=sm_61',  # change this line
                             '--ptxas-options=-v',
                             '-c',
                             '--compiler-options',
                             "'-fPIC'"]},
include_dirs=[numpy_include, CUDA['include']]
- Install the Python COCO API:
cd data
git clone https://github.com/pdollar/coco.git
cd coco/PythonAPI
make
cd ../../..
- Download the pre-trained model voc_0712_80k-110k.tgz (unpacking yields 4 files):
./data/scripts/fetch_faster_rcnn_models.sh
It is saved under tf-faster-rcnn/output/vgg16/voc_2007_trainval+voc_2012_trainval/default
- Run the demo to test with the pre-trained model:
./tools/demo.py
Debugging with PyCharm is recommended; fix missing packages and other errors as they appear.
After it runs, you can see the detection results on the test images.
Training the Model on the Server GPU
- First, set the compute capability (architecture) according to the GPU model: the lab server uses a GTX 1080, so change sm_52 to sm_61. (NVIDIA's official site provides the compute capability corresponding to each model; the figure originally embedded here showed that table.)
- Build the Cython modules under tf-faster-rcnn/lib; if the demo later fails, rebuild from here:
cd tf-faster-rcnn/lib
make clean
make
cd ..
- Install the Python COCO API:
cd data
git clone https://github.com/pdollar/coco.git
cd coco/PythonAPI
make
cd ../../..
- Download the pre-trained model
VGG16 model
Put it under data/imagenet_weights; run the following from the /tf-faster-rcnn directory:
mkdir -p data/imagenet_weights
cd data/imagenet_weights
wget -v http://download.tensorflow.org/models/vgg_16_2016_08_28.tar.gz
tar -xzvf vgg_16_2016_08_28.tar.gz
mv vgg_16.ckpt vgg16.ckpt
cd ../..
- Prepare the training data
The dataset must follow the VOC2007 format:
JPEGImages: the raw training images. Images must be named with 6-digit numbers, e.g. 000034.jpg, must be in JPEG/JPG format, and their aspect ratio (width/height) must lie between 0.462 and 6.828;
Annotations: the coordinates of the objects in each raw image; every training image has a same-named XML file under Annotations;
ImageSets/Main: specifies the image IDs used for train, trainval, val, and test. Since the VOC dataset supports many CV tasks (object detection, semantic segmentation, edge detection, etc.), ImageSets has several subfolders (Layout, Main, Segmentation); edit the files under Main (train.txt, trainval.txt, val.txt, test.txt) and list the IDs of the images for your task.
Put the dataset under tf-faster-rcnn/data/VOCdevkit2007/VOC2007, replacing the original VOC2007 JPEGImages, ImageSets, and Annotations (you can also simply rename your folders to match). A small sketch for generating the ImageSets files is given below.
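Here is a minimal sketch (my own convenience script, not part of the repo) that writes the ImageSets/Main txt files from the contents of JPEGImages; the 80/10/10 split ratios are an arbitrary assumption:

# Minimal sketch: write ImageSets/Main/{train,val,trainval,test}.txt
# from the image IDs found in JPEGImages. Split ratios are arbitrary.
import os
import random

voc = 'data/VOCdevkit2007/VOC2007'
ids = sorted(os.path.splitext(f)[0]
             for f in os.listdir(os.path.join(voc, 'JPEGImages'))
             if f.endswith('.jpg'))
random.seed(0)                      # reproducible split
random.shuffle(ids)
n = len(ids)
splits = {'train': ids[:int(0.8 * n)],
          'val': ids[int(0.8 * n):int(0.9 * n)],
          'test': ids[int(0.9 * n):]}
splits['trainval'] = splits['train'] + splits['val']
out_dir = os.path.join(voc, 'ImageSets', 'Main')
os.makedirs(out_dir, exist_ok=True)
for name, subset in splits.items():
    with open(os.path.join(out_dir, name + '.txt'), 'w') as f:
        f.write('\n'.join(sorted(subset)) + '\n')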
VOC2007 dataset download links:
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
Commands to unpack the dataset; extracting in the current folder automatically creates the VOCdevkit folder:
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
- Train the model:
./experiments/scripts/train_faster_rcnn.sh [GPU_ID] [DATASET] [NET]
# GPU_ID is the GPU you want to test on
# NET in {vgg16, res50, res101, res152} is the network arch to use
# DATASET {pascal_voc, pascal_voc_0712, coco} is defined in train_faster_rcnn.sh
# Examples:
./experiments/scripts/train_faster_rcnn.sh 0 pascal_voc vgg16
./experiments/scripts/train_faster_rcnn.sh 1 coco res101
- Check convergence with TensorBoard:
tensorboard --logdir=tensorboard/vgg16/voc_2007_trainval/ --port=7001
- The trained model's 4 files are saved under tf-faster-rcnn/output/vgg16/voc_2007_trainval+voc_2012_trainval/default, i.e.
output/[NET]/[DATASET]/default/
Swap in your trained model and run the demo to see the results.
To train on your own dataset, make sure JPEGImages, Annotations, and ImageSets/Main match the VOC07 layout.
Modify tf-faster-rcnn/lib/datasets/pascal_voc.py so that the classes match your dataset; each single-quoted string is an object class to detect:
self._classes = ('__background__',  # always index 0
                 'aeroplane', 'bicycle', 'bird', 'boat',
                 'bottle', 'bus', 'car', 'cat', 'chair',
                 'cow', 'diningtable', 'dog', 'horse',
                 'motorbike', 'person', 'pottedplant',
                 'sheep', 'sofa', 'train', 'tvmonitor')
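For the URPC competition classes used later in these notes, for example, the tuple becomes:

self._classes = ('__background__',  # always index 0
                 'holothurian', 'echinus', 'scallop', 'starfish')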
Before every training run, delete the tf-faster-rcnn/data/cache and tf-faster-rcnn/output folders (output holds the trained models and does not exist until you train).
tf-faster-rcnn Testing Procedure
1. Run demo2.py to iterate over the test images and draw boxes around the detected objects.
Test images live under /home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/data/demo/*.jpg
The model lives under /home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/output/vgg16/voc_2007_trainval+voc_2012_trainval/default/, which contains 4 files.
Output images are written to /home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/testfigs/*.jpg; note that the testfigs folder must be created before running.
2. Run demo3.py to iterate over the test images and output the detection table.
Test images live under /home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/data/demo/*.jpg.
The list of images to test lives at /home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/data/VOCdevkit2007/contest/test.txt.
The model lives under /home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/output/vgg16/voc_2007_trainval+voc_2012_trainval/default/, which contains 4 files.
Results are written to /home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/result.txt.
The output format is:
<image_id> <class_id> <confidence> <xmin> <ymin> <xmax> <ymax>
1 1 0.377665907145 115.43637085 410.561065674 402.517791748 479.0
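For reference, a minimal sketch of reading such a line back into typed fields (assuming the whitespace-separated format above):

# Parse one whitespace-separated line of result.txt into typed fields
line = '1 1 0.377665907145 115.43637085 410.561065674 402.517791748 479.0'
image_id, class_id, confidence, xmin, ymin, xmax, ymax = line.split()
print(int(image_id), int(class_id), float(confidence),
      float(xmin), float(ymin), float(xmax), float(ymax))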
A brief tour of the tf-faster-rcnn project layout:
data: datasets, plus the cache of parsed files;
experiments: configuration files and run logs;
lib: the Python modules;
output: where trained models are stored (does not exist until you train);
tensorboard: visualization files;
tools: the Python entry points for training and testing.
Reference: saving faster-rcnn detection results to txt
Reference: saving faster-rcnn detection results to txt, then converting to xml
Problems Encountered During Training
- 1. Error when training on my own dataset:
File "/home/hope/jhson/caffe/py-faster-rcnn2/tools/../lib/datasets/imdb.py", line 67, in roidb
self._roidb = self.roidb_handler()
File "/home/hope/jhson/caffe/py-faster-rcnn2/tools/../lib/datasets/pascal_voc.py", line 103, in gt_roidb
for index in self.image_index]
File "/home/hope/jhson/caffe/py-faster-rcnn2/tools/../lib/datasets/pascal_voc.py", line 208, in _load_pascal_annotation
cls = self._class_to_ind[obj.find('name').text.lower().strip()]
KeyError: 'chair'
First, check the self._classes contents in tf-faster-rcnn/lib/datasets/pascal_voc.py.
Then look for code like
objs = diff_objs  (or non_diff_objs)
and add below it:
cls_objs = [obj for obj in objs if obj.find('name').text in self._classes]
objs = cls_objs
This usually resolves the problem.
- 2. Error when training on my own dataset:
File "/py-faster-rcnn/tools/../lib/datasets/imdb.py", line 108, in append_flipped_images
assert (boxes[:, 2] >= boxes[:, 0]).all()
AssertionError
Inspecting the data shows that a top-left coordinate (x, y) may be 0, or a labeled region may extend beyond the image. Faster R-CNN subtracts 1 from Xmin, Ymin, Xmax, Ymax, and the boxes are stored as uint16, so an Xmin of 0 wraps around to 65535 after the subtraction.
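A small numpy illustration of the wraparound:

import numpy as np

boxes = np.array([[0, 10, 50, 60]], dtype=np.uint16)  # xmin is 0
boxes[:, 0] = boxes[:, 0] - 1   # 0 - 1 wraps around in uint16
print(boxes[0, 0])              # prints 65535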
a. Modify the append_flipped_images() function in lib/datasets/imdb.py:
in the data-fixup step, below the line
boxes[:, 2] = widths[i] - oldx1 - 1
add:
for b in range(len(boxes)):
    if boxes[b][2] < boxes[b][0]:
        boxes[b][0] = 0
b. Modify the _load_pascal_annotation() function in lib/datasets/pascal_voc.py:
remove the -1 applied to Xmin, Ymin, Xmax, Ymax, so the loop becomes:
for ix, obj in enumerate(objs):
    bbox = obj.find('bndbox')
    # Keep the pixel indexes as-is (the original code subtracted 1 here)
    x1 = float(bbox.find('xmin').text)
    y1 = float(bbox.find('ymin').text)
    x2 = float(bbox.find('xmax').text)
    y2 = float(bbox.find('ymax').text)
    cls = self._class_to_ind[obj.find('name').text.lower().strip()]
- 3. Visualizing results with TensorBoard
TensorBoard is TensorFlow's visualization tool: it can display the whole network graph and the summary data collected during training, including scalars, images, audio, the computation graph, data distributions, histograms, and embeddings.
Run in a terminal:
tensorboard --logdir=tensorboard/vgg16/voc_2007_trainval/ --port=6006
(tensorflow) henry@henry-Rev-1-0:~$ tensorboard --logdir=tensorboard/vgg16/voc_2007_trainval/ --port=6006
Starting TensorBoard b'54' at http://henry-Rev-1-0:6006
(Press CTRL+C to quit)
WARNING:tensorflow:Found more than one graph event per run, or there was a metagraph containing a graph_def, as well as one or more graph events. Overwriting the graph with the newest event.
WARNING:tensorflow:Found more than one metagraph event per run. Overwriting the metagraph with the newest event.
In this project my tensorboard files are saved under /home/henry/tensorboard; as long as the file structure is correct, opening http://henry-Rev-1-0:6006 in a browser brings up the dashboards.
- 4. File structure of the URPC competition dataset
- Annotation
  - train
    - G0024172: 1800 files (000000.xml-001799.xml)
    - G0024173: 1800 files (000000.xml-001799.xml)
    - G0024174: 1800 files (000000.xml-001799.xml)
    - YDXJ0003: 7755 files (000000.xml-007754.xml)
    - YDXJ0013: 4500 files (000000.xml-004499.xml)
  - test
    - YDXJ0012: 1327 files (000000.xml-001326.xml)
- ImageSets
  - Layout
    - test.txt: 1327 IDs in ascending order
    - train.txt: 17655 IDs in ascending order
    - val.txt: same as test.txt
- JPEGImages (*.jpg)
  - G0024172: 1800 files (000000.jpg-001799.jpg)
  - G0024173: 1800 files (000000.jpg-001799.jpg)
  - G0024174: 1800 files (000000.jpg-001799.jpg)
  - YDXJ0003: 7755 files (000000.jpg-007754.jpg)
  - YDXJ0013: 4500 files (000000.jpg-004499.jpg)
- 5. Training with your own dataset
See the reference "tf-faster-rcnn configuration and using your own data" above.
- 6. Running ./tools/demo.py fails with:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Process finished with exit code 134 (interrupted by signal 6: SIGABRT)
Cause analysis:
This error occurs when the program handles too much data at runtime: the code frequently allocates arrays with new / calls malloc(), so available memory keeps shrinking until no new allocation is possible and the program aborts.
Approach:
free -m  # check available memory
relaybot@ubuntu:~/swap$ free -m
total used free shared buffers cached
Mem: 7916 7459 456 95 20 1404
-/+ buffers/cache: 6034 1881
Swap: 0 0 0
After such an error, reboot the machine and then run only PyCharm, or run demo.py from a terminal; this resolves the problem.
Reference: solutions for program termination due to insufficient memory
- 7. After switching datasets, parts of demo.py were left unchanged, causing an error.
The error:
Traceback (most recent call last):
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call
return fn(*args)
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn
status, run_metadata)
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/contextlib.py", line 88, in __exit__
next(self.gen)
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [84] rhs shape= [16]
[[Node: save/Assign = Assign[T=DT_FLOAT, _class=["loc:@vgg_16/bbox_pred/biases"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](vgg_16/bbox_pred/biases, save/RestoreV2)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/henry/File/tf-faster-rcnn-contest/tools/demo.py", line 153, in <module>
saver.restore(sess, tfmodel)
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1548, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [84] rhs shape= [16]
[[Node: save/Assign = Assign[T=DT_FLOAT, _class=["loc:@vgg_16/bbox_pred/biases"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](vgg_16/bbox_pred/biases, save/RestoreV2)]]
Caused by op 'save/Assign', defined at:
File "/home/henry/File/tf-faster-rcnn-contest/tools/demo.py", line 152, in <module>
saver = tf.train.Saver()
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1139, in __init__
self.build()
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1170, in build
restore_sequentially=self._restore_sequentially)
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 691, in build
restore_sequentially, reshape)
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 419, in _AddRestoreOps
assign_ops.append(saveable.restore(tensors, shapes))
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 155, in restore
self.op.get_shape().is_fully_defined())
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 271, in assign
validate_shape=validate_shape)
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 45, in assign
use_locking=use_locking, name=name)
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/henry/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [84] rhs shape= [16]
[[Node: save/Assign = Assign[T=DT_FLOAT, _class=["loc:@vgg_16/bbox_pred/biases"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](vgg_16/bbox_pred/biases, save/RestoreV2)]]
Process finished with exit code 1
Cause analysis:
net.create_architecture("TEST", 21, tag='default', anchor_scales=[8, 16, 32])
21 is VOC's 20 classes plus background, but my dataset has only 3 classes, so the saved model and the test-time parameters do not match. In general the second argument is len(CLASSES) including '__background__' (the competition demo.py in item 14 below passes 5 for its four classes). Change it to:
net.create_architecture("TEST", 4, tag='default', anchor_scales=[8, 16, 32])
With that, testing works normally and the output is:
Loaded network output/vgg16/voc_2007_trainval+voc_2012_trainval/default/vgg16_faster_rcnn_iter_70000.ckpt
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Demo for data/demo/000337.jpg
Detection took 29.147s for 300 object proposals
Process finished with exit code 0
- 8. Added OpenCV code that opens the webcam and runs detection on the frames.
# im_names = ['000456.jpg', '000542.jpg', '001150.jpg',
#             '001763.jpg', '004545.jpg']  # default
# im_names = ['000023.jpg']
# for im_name in im_names:
#     print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')
#     print('Demo for data/demo/{}'.format(im_name))
#     demo(sess, net, im_name)
videoCapture = cv2.VideoCapture(0)
print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')
while True:
    ret, im = videoCapture.read()
    cv2.imshow("capture", im)
    # print('Demo for data/demo/{}'.format(im))
    demo(sess, net, im)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
videoCapture.release()
cv2.destroyAllWindows()
plt.show()
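Note that this loop passes the captured frame (a numpy array) straight to demo(); it assumes demo() has been adapted to accept an image array, whereas the demo() in item 14 below takes an image name and loads the file itself.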
- 9. Training my own model fails with ZeroDivisionError.
Fix VGG16 layers..
Fixed.
Traceback (most recent call last):
File "./tools/trainval_net.py", line 139, in <module>
max_iters=args.max_iters)
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest-release/tools/../lib/model/train_val.py", line 377, in train_net
sw.train_model(sess, max_iters)
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest-release/tools/../lib/model/train_val.py", line 278, in train_model
blobs = self.data_layer.forward()
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest-release/tools/../lib/roi_data_layer/layer.py", line 87, in forward
blobs = self._get_next_minibatch()
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest-release/tools/../lib/roi_data_layer/layer.py", line 83, in _get_next_minibatch
return get_minibatch(minibatch_db, self._num_classes)
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest-release/tools/../lib/roi_data_layer/minibatch.py", line 27, in get_minibatch
assert(cfg.TRAIN.BATCH_SIZE % num_images == 0), \
ZeroDivisionError: integer division or modulo by zero
Command exited with non-zero status 1
14.62user 2.53system 0:17.01elapsed 100%CPU (0avgtext+0avgdata 2721756maxresident)k
0inputs+9504outputs (0major+1190329minor)pagefaults 0swaps
Resolution:
First check that train.txt and test.txt under ./data/VOCdevkit2007/VOC2007/ImageSets/Main are not empty.
Delete the cache files: data/VOCdevkit/cache and data/cache/.
get zero division errors #160
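A minimal sketch (with the paths assumed as above) to confirm the split files exist and are non-empty before training:

# Sanity-check the ImageSets/Main split files
import os

main_dir = 'data/VOCdevkit2007/VOC2007/ImageSets/Main'
for name in ('train.txt', 'val.txt', 'trainval.txt', 'test.txt'):
    path = os.path.join(main_dir, name)
    if not os.path.isfile(path):
        print(name, 'is MISSING')
        continue
    with open(path) as f:
        entries = [line for line in f if line.strip()]
    print(name, 'has', len(entries), 'entries')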
- 10. Training my own model fails with AttributeError.
This is usually caused by the *.xml files under /home/ouc/LiuHongzhi/tf-faster-rcnn-contest/data/VOCdevkit2007/VOC2007/Annotations/ not conforming to the VOC2007 format; fix the xml files until they do.
Appending horizontally-flipped training examples...
Traceback (most recent call last):
File "./tools/trainval_net.py", line 105, in <module>
imdb, roidb = combined_roidb(args.imdb_name)
File "./tools/trainval_net.py", line 76, in combined_roidb
roidbs = [get_roidb(s) for s in imdb_names.split('+')]
File "./tools/trainval_net.py", line 76, in <listcomp>
roidbs = [get_roidb(s) for s in imdb_names.split('+')]
File "./tools/trainval_net.py", line 73, in get_roidb
roidb = get_training_roidb(imdb)
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest (copy)/tools/../lib/model/train_val.py", line 328, in get_training_roidb
imdb.append_flipped_images()
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest (copy)/tools/../lib/datasets/imdb.py", line 113, in append_flipped_images
boxes = self.roidb[i]['boxes'].copy()
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest (copy)/tools/../lib/datasets/imdb.py", line 74, in roidb
self._roidb = self.roidb_handler()
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest (copy)/tools/../lib/datasets/pascal_voc.py", line 111, in gt_roidb
for index in self.image_index]
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest (copy)/tools/../lib/datasets/pascal_voc.py", line 111, in <listcomp>
for index in self.image_index]
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest (copy)/tools/../lib/datasets/pascal_voc.py", line 148, in _load_pascal_annotation
obj for obj in objs if int(obj.find('difficult').text) == 0]
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest (copy)/tools/../lib/datasets/pascal_voc.py", line 148, in <listcomp>
obj for obj in objs if int(obj.find('difficult').text) == 0]
AttributeError: 'NoneType' object has no attribute 'text'
Command exited with non-zero status 1
1.50user 0.14system 0:01.64elapsed 99%CPU (0avgtext+0avgdata 251932maxresident)k
0inputs+24outputs (0major+51834minor)pagefaults 0swaps
Fix: comment out the following code:
non_diff_objs = [
    obj for obj in objs if int(obj.find('difficult').text) == 0]
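Instead of deleting the filter outright, a gentler variant (my own assumption, not from the upstream repo) treats a missing <difficult> tag as not difficult:

# Treat a missing <difficult> tag as 0 instead of raising AttributeError
non_diff_objs = [
    obj for obj in objs
    if obj.find('difficult') is None or int(obj.find('difficult').text) == 0]
objs = non_diff_objs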
- 11. Training my own model fails with KeyError.
'USE_GPU_NMS': True}
Loaded dataset `voc_2007_trainval` for training
Set proposal method: gt
Appending horizontally-flipped training examples...
Traceback (most recent call last):
File "./tools/trainval_net.py", line 105, in <module>
imdb, roidb = combined_roidb(args.imdb_name)
File "./tools/trainval_net.py", line 76, in combined_roidb
roidbs = [get_roidb(s) for s in imdb_names.split('+')]
File "./tools/trainval_net.py", line 76, in <listcomp>
roidbs = [get_roidb(s) for s in imdb_names.split('+')]
File "./tools/trainval_net.py", line 73, in get_roidb
roidb = get_training_roidb(imdb)
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest (copy)/tools/../lib/model/train_val.py", line 328, in get_training_roidb
imdb.append_flipped_images()
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest (copy)/tools/../lib/datasets/imdb.py", line 113, in append_flipped_images
boxes = self.roidb[i]['boxes'].copy()
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest (copy)/tools/../lib/datasets/imdb.py", line 74, in roidb
self._roidb = self.roidb_handler()
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest (copy)/tools/../lib/datasets/pascal_voc.py", line 111, in gt_roidb
for index in self.image_index]
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest (copy)/tools/../lib/datasets/pascal_voc.py", line 111, in <listcomp>
for index in self.image_index]
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest (copy)/tools/../lib/datasets/pascal_voc.py", line 175, in _load_pascal_annotation
cls = self._class_to_ind[obj.find('name').text.lower().strip()]
KeyError: '"scallop"'
Command exited with non-zero status 1
1.54user 0.22system 0:01.81elapsed 97%CPU (0avgtext+0avgdata 251004maxresident)k
0inputs+0outputs (0major+51792minor)pagefaults 0swaps
Delete the py-faster-rcnn/data/VOCdevkit2007/annotations_cache folder;
delete the py-faster-rcnn/data/cache folder.
The xml files probably contain a class that is not in self._classes; note that the failing key is '"scallop"' with literal quotation marks, which suggests the <name> text in the xml includes quotes.
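A minimal sketch (directory path assumed) that scans the annotations for class names outside the expected set, which quickly exposes problems like the stray quotes above:

# Scan Annotations/*.xml for <name> values outside the expected classes
import os
import xml.etree.ElementTree as ET

CLASSES = {'holothurian', 'echinus', 'scallop', 'starfish'}
ann_dir = 'data/VOCdevkit2007/VOC2007/Annotations'  # adjust to your layout
unknown = set()
for fn in os.listdir(ann_dir):
    if not fn.endswith('.xml'):
        continue
    for obj in ET.parse(os.path.join(ann_dir, fn)).findall('object'):
        name = obj.find('name').text.strip()
        if name.lower() not in CLASSES:
            unknown.add(name)
print(unknown)  # e.g. {'"scallop"'} reveals stray quotation marks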
- 12. Training my own model fails with AttributeError.
AttributeError: 'NoneType' object has no attribute 'astype'
Check the demo code for misspelled test-image file names, especially the extension, e.g. .jpeg written as .jepg.
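A quick hypothetical check over the demo folder:

# Flag files in data/demo whose extension looks misspelled
import os

for fn in os.listdir('data/demo'):
    if not fn.lower().endswith(('.jpg', '.jpeg', '.png')):
        print('suspicious file name:', fn)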
- 13. Testing my own model fails with TypeError.
Saving cached annotations to /home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/data/VOCdevkit2007/VOC2007/ImageSets/Main/test.txt_annots.pkl
Traceback (most recent call last):
File "./tools/test_net.py", line 120, in <module>
test_net(sess, net, imdb, filename, max_per_image=args.max_per_image)
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/tools/../lib/model/test.py", line 196, in test_net
imdb.evaluate_detections(all_boxes, output_dir)
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/tools/../lib/datasets/pascal_voc.py", line 285, in evaluate_detections
self._do_python_eval(output_dir)
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/tools/../lib/datasets/pascal_voc.py", line 248, in _do_python_eval
use_07_metric=use_07_metric, use_diff=self.config['use_diff'])
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/tools/../lib/datasets/voc_eval.py", line 122, in voc_eval
pickle.dump(recs, f)
TypeError: write() argument must be str, not bytes
Command exited with non-zero status 1
At first I tried modifying /tf-faster-rcnn-contest -2018/tools/../lib/datasets/voc_eval.py, changing
with open(cachefile, 'w') as f:
to
with open(cachefile, 'wb') as f:
which produced a new error:
Evaluating detections
Writing holothurian VOC results file
Writing echinus VOC results file
Writing scallop VOC results file
Writing starfish VOC results file
VOC07 metric? Yes
Traceback (most recent call last):
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/tools/../lib/datasets/voc_eval.py", line 128, in voc_eval
recs = pickle.load(f)
EOFError: Ran out of input
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./tools/test_net.py", line 120, in <module>
test_net(sess, net, imdb, filename, max_per_image=args.max_per_image)
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/tools/../lib/model/test.py", line 196, in test_net
imdb.evaluate_detections(all_boxes, output_dir)
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/tools/../lib/datasets/pascal_voc.py", line 285, in evaluate_detections
self._do_python_eval(output_dir)
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/tools/../lib/datasets/pascal_voc.py", line 248, in _do_python_eval
use_07_metric=use_07_metric, use_diff=self.config['use_diff'])
File "/home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/tools/../lib/datasets/voc_eval.py", line 130, in voc_eval
recs = pickle.load(f, encoding='bytes')
EOFError: Ran out of input
Command exited with non-zero status 1
See EOFError: Ran out of input #171.
In tf-faster-rcnn-contest -2018/tools/../lib/datasets/voc_eval.py, find
cachefile = os.path.join(cachedir, '%s_annots.pkl' % imagesetfile)
print('Saving cached annotations to {:s}'.format(cachefile))
with open(cachefile, 'w') as f:
    pickle.dump(recs, f)
and change it to the following (pickle writes bytes, so the cache file must be opened in binary mode for both dump and load):
cachefile = os.path.join(cachedir, ('%s_annots.pkl' % 'imagesetfile'))
# cachefile = os.path.join(cachedir, '%s_annots.pkl' % imagesetfile.split("/")[-1].split(".")[0])
with open(cachefile, 'wb') as f:
    pickle.dump(recs, f)
- 14. Testing the dataset: a demo.py that reads test_list, runs detection on the images in demo, and writes a txt results file in the format the competition requires.
#!/usr/bin/env python
# --------------------------------------------------------
# Tensorflow Faster R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Xinlei Chen, based on code from Ross Girshick
# --------------------------------------------------------
"""
Demo script showing detections in sample images.
See README.md for installation instructions before running.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import _init_paths
from model.config import cfg
from model.test import im_detect
from model.nms_wrapper import nms
from utils.timer import Timer
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import os, sys, cv2
import argparse
import scipy.io as sio
from nets.vgg16 import vgg16
from nets.resnet_v1 import resnetv1
from PIL import Image  # the Image module
from pylab import *    # for savetxt
CLASSES = ('__background__',
'holothurian', 'echinus', 'scallop', 'starfish')
NETS = {'vgg16': ('vgg16_faster_rcnn_iter_70000.ckpt',),'res101': ('res101_faster_rcnn_iter_110000.ckpt',)}
DATASETS= {'pascal_voc': ('voc_2007_trainval',),'pascal_voc_0712': ('voc_2007_trainval+voc_2012_trainval',)}
def vis_detections(im, class_name, dets, thresh=0.5):
    """Draw detected bounding boxes."""
    inds = np.where(dets[:, -1] >= thresh)[0]
    if len(inds) == 0:
        return
    # im = im[:, :, (2, 1, 0)]
    # fig, ax = plt.subplots(figsize=(12, 12))
    # ax.imshow(im, aspect='equal')
    for i in inds:
        bbox = dets[i, :4]
        score = dets[i, -1]
        if class_name == '__background__':
            fw = open('result.txt', 'a')  # the final txt is written here; change the path in every branch below too
            fw.write(str(im_name[1]) + ' ' + class_name + ' ' + str(score) + ' ' + str(int(bbox[0])) + ' ' + str(int(bbox[1])) + ' ' + str(int(bbox[2])) + ' ' + str(int(bbox[3])) + '\n')
            fw.close()
        elif class_name == 'holothurian':
            fw = open('result.txt', 'a')
            fw.write(str(im_name[1]) + ' ' + str(1) + ' ' + str(score) + ' ' + str(int(bbox[0])) + ' ' + str(int(bbox[1])) + ' ' + str(int(bbox[2])) + ' ' + str(int(bbox[3])) + '\n')
            fw.close()
        elif class_name == 'echinus':
            fw = open('result.txt', 'a')
            fw.write(str(im_name[1]) + ' ' + str(2) + ' ' + str(score) + ' ' + str(int(bbox[0])) + ' ' + str(int(bbox[1])) + ' ' + str(int(bbox[2])) + ' ' + str(int(bbox[3])) + '\n')
            fw.close()
        elif class_name == 'scallop':
            fw = open('result.txt', 'a')
            fw.write(str(im_name[1]) + ' ' + str(3) + ' ' + str(score) + ' ' + str(int(bbox[0])) + ' ' + str(int(bbox[1])) + ' ' + str(int(bbox[2])) + ' ' + str(int(bbox[3])) + '\n')
            fw.close()
        elif class_name == 'starfish':
            fw = open('result.txt', 'a')
            fw.write(str(im_name[1]) + ' ' + str(4) + ' ' + str(score) + ' ' + str(int(bbox[0])) + ' ' + str(int(bbox[1])) + ' ' + str(int(bbox[2])) + ' ' + str(int(bbox[3])) + '\n')
            fw.close()
def demo(sess, net, image_name):
    """Detect object classes in an image using pre-computed object proposals."""
    # Load the demo image
    all_name = image_name + '.jpg'
    im_file = os.path.join(cfg.DATA_DIR, 'demo', all_name)
    im = cv2.imread(im_file)

    # Detect all object classes and regress object bounds
    timer = Timer()
    timer.tic()
    scores, boxes = im_detect(sess, net, im)
    timer.toc()
    print('Detection took {:.3f}s for {:d} object proposals'.format(timer.total_time, boxes.shape[0]))
    # save_jpg = os.path.join('/data/test', im_name)

    # Visualize detections for each class
    CONF_THRESH = 0.8
    NMS_THRESH = 0.3
    # im = im[:, :, (2, 1, 0)]
    # fig, ax = plt.subplots(figsize=(12, 12))
    # ax.imshow(im, aspect='equal')
    for cls_ind, cls in enumerate(CLASSES[1:]):
        cls_ind += 1  # because we skipped background
        cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]
        cls_scores = scores[:, cls_ind]
        dets = np.hstack((cls_boxes,
                          cls_scores[:, np.newaxis])).astype(np.float32)
        keep = nms(dets, NMS_THRESH)
        dets = dets[keep, :]
        vis_detections(im, cls, dets, thresh=CONF_THRESH)
def parse_args():
    """Parse input arguments."""
    parser = argparse.ArgumentParser(description='Tensorflow Faster R-CNN demo')
    # parser.add_argument('--net', dest='demo_net', help='Network to use [vgg16 res101]',
    #                     choices=NETS.keys(), default='res101')  # default
    parser.add_argument('--net', dest='demo_net', help='Network to use [vgg16 res101]',
                        choices=NETS.keys(), default='vgg16')
    parser.add_argument('--dataset', dest='dataset', help='Trained dataset [pascal_voc pascal_voc_0712]',
                        choices=DATASETS.keys(), default='pascal_voc_0712')
    args = parser.parse_args()
    return args
if __name__ == '__main__':
    cfg.TEST.HAS_RPN = True  # Use RPN for proposals
    args = parse_args()
    cfg.USE_GPU_NMS = False

    # model path
    demonet = args.demo_net
    dataset = args.dataset
    tfmodel = os.path.join('output', demonet, DATASETS[dataset][0], 'default',
                           NETS[demonet][0])
    if not os.path.isfile(tfmodel + '.meta'):
        raise IOError(('{:s} not found.\nDid you download the proper networks from '
                       'our server and place them properly?').format(tfmodel + '.meta'))

    # set config
    tfconfig = tf.ConfigProto(allow_soft_placement=True)
    tfconfig.gpu_options.allow_growth = True
    # init session
    sess = tf.Session(config=tfconfig)
    # load network
    if demonet == 'vgg16':
        net = vgg16()
    elif demonet == 'res101':
        net = resnetv1(num_layers=101)
    else:
        raise NotImplementedError
    net.create_architecture("TEST", 5,
                            tag='default', anchor_scales=[8, 16, 32])
    saver = tf.train.Saver()
    saver.restore(sess, tfmodel)
    print('Loaded network {:s}'.format(tfmodel))

    # im_names = ['000456.jpg', '000542.jpg', '001150.jpg',
    #             '001763.jpg', '004545.jpg']  # default
    # Warm up the network on a dummy gray image before timing real detections
    im = 128 * np.ones((300, 500, 3), dtype=np.uint8)
    for i in range(2):
        _, _ = im_detect(sess, net, im)

    # im_names = get_imlist(r"/home/henry/Files/tf-faster-rcnn-contest/data/demo")
    fr = open('/home/ouc/LiuHongzhi/tf-faster-rcnn-contest -2018/data/VOCdevkit2007/test_list.txt', 'r')
    for im_name in fr:
        # path = "/home/henry/Files/URPC2018/VOC/VOC2007/JPEGImages/G0024172/*.jpg"
        # filelist = os.listdir(path)
        # for im_name in path:
        im_name = im_name.strip()
        im_name = im_name.split(' ')
        print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')
        print('mainDemo for data/demo/{}{}'.format(im_name[0], '.jpg'))
        print('mainDemo for data/demo/{}{}'.format(im_name[1], '.jpg'))
        demo(sess, net, im_name[0])
    # plt.show()
    fr.close()
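With the argparse defaults above, the script can then be run from the repo root as, e.g., python tools/demo.py --net vgg16 --dataset pascal_voc_0712, and result.txt accumulates one line per detection in the competition format.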
- 15. Built a VOC-format replica training set; training reports RuntimeWarning and nan loss.
/home/ouc/LiuHongzhi/tf-faster-rcnn-contest-2018/tools/../lib/model/bbox_transform.py:27: RuntimeWarning: invalid value encountered in log
targets_dw = np.log(gt_widths / ex_widths)
iter: 100 / 70000, total loss: nan
>>> rpn_loss_cls: 0.668627
>>> rpn_loss_box: nan
>>> loss_cls: 0.009253
>>> loss_box: 0.000000
>>> lr: 0.001000
speed: 0.342s / iter
iter: 120 / 70000, total loss: nan
>>> rpn_loss_cls: 0.657523
>>> rpn_loss_box: nan
>>> loss_cls: 0.001831
>>> loss_box: 0.000000
>>> lr: 0.001000
Cause analysis: some xml files under Annotation contain bounding-box coordinates that exceed the image bounds (the figure originally shown here gave an example).
After correcting the offending xmin values, training proceeds normally (a sketch for catching such boxes follows below).
Related reference: solutions for loss=nan during faster rcnn training
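A minimal sketch (paths assumed) that flags annotations whose boxes fall outside the image, which is what triggers the invalid-value log above:

# Flag bounding boxes that fall outside the image bounds
import os
import xml.etree.ElementTree as ET

ann_dir = 'data/VOCdevkit2007/VOC2007/Annotations'  # adjust to your layout
for fn in sorted(os.listdir(ann_dir)):
    if not fn.endswith('.xml'):
        continue
    root = ET.parse(os.path.join(ann_dir, fn)).getroot()
    w = int(root.find('size/width').text)
    h = int(root.find('size/height').text)
    for obj in root.findall('object'):
        b = obj.find('bndbox')
        x1, y1 = float(b.find('xmin').text), float(b.find('ymin').text)
        x2, y2 = float(b.find('xmax').text), float(b.find('ymax').text)
        if x1 < 1 or y1 < 1 or x2 > w or y2 > h or x2 <= x1 or y2 <= y1:
            print(fn, (x1, y1, x2, y2), 'image size', (w, h))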
- 16. KeyError when testing after training finishes:
File "/home/hyzhan/py-faster-rcnn/tools/../lib/datasets/voc_eval.py", line 126, in voc_eval
R = [obj for obj in recs[imagename] if obj['name'] == classname]
KeyError: '000002'
Fix: delete the annotations_cache folder under data/VOCdevkit2007.
Reference: training faster-rcnn on your own dataset
- 17. If running the program shows that duplicate boxes are not being removed, the problem is most likely in NMS: choose the compute capability (architecture) appropriate for your GPU model in setup.py, then rebuild under tf-faster-rcnn/lib.
- 18. When minibatch_db is empty
The error:
assert(cfg.TRAIN.BATCH_SIZE % num_images ==0)
ZeroDivisionError: integer division or modulo by zero
minibatch_db: [{'boxes': array([[164, 103, 280, 237],
[524, 232, 687, 385]], dtype=uint16),
'gt_classes': array([2, 2], dtype=int32),
'gt_overlaps': <2x5 sparse matrix of type '<class 'numpy.float32'>'with 2 stored elements in Compressed Sparse Row format>,
'flipped': False, 'seg_areas': array([15795., 25256.], dtype=float32),
'image': '/home/henry/Files/URPC2019/faster-rcnn-contest/data/VOCdevkit2007/VOC2007/JPEGImages/YN030001_3285.jpg',
'width': 720,
'height': 405,
'max_classes': array([2, 2]),
'max_overlaps': array([1., 1.], dtype=float32)}]
Fix: add the marked lines to _get_next_minibatch_inds() in /lib/roi_data_layer/layer.py:
def _get_next_minibatch_inds(self):
    """Return the roidb indices for the next minibatch."""
    if self._cur + cfg.TRAIN.IMS_PER_BATCH >= len(self._roidb):
        self._shuffle_roidb_inds()
    db_inds = self._perm[self._cur:self._cur + cfg.TRAIN.IMS_PER_BATCH]
    self._cur += cfg.TRAIN.IMS_PER_BATCH
    if self._cur == self._perm.size:  # add
        self._cur = 0                 # add
    # print("\n db_inds: ", db_inds)              # db_inds: [5138]
    # print("\n self._perm: ", self._perm.shape)  # self._perm: (7388,)
    # print("\n self._cur: ", self._cur)          # self._cur: 27
    return db_inds