Running TensorFlow
After CUDA is installed, you still need to set a few environment variables. Open a terminal and enter the following command:
export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
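The ${PATH:+:${PATH}} part of the command expands to a colon plus the old value only when the variable is already set and non-empty, so no stray colon is left behind. A minimal sketch of that behaviour, using a hypothetical variable DEMO_PATH (and a made-up second directory /opt/tools/bin) so that the real PATH is not touched:

```shell
# DEMO_PATH is a hypothetical stand-in for PATH; /opt/tools/bin is a made-up directory.
unset DEMO_PATH

# First use: DEMO_PATH is unset, so the ":+" expansion produces nothing.
DEMO_PATH=/usr/local/cuda-9.0/bin${DEMO_PATH:+:${DEMO_PATH}}
FIRST_VALUE="$DEMO_PATH"
echo "$FIRST_VALUE"

# Second use: DEMO_PATH is now non-empty, so ":<old value>" is appended.
DEMO_PATH=/opt/tools/bin${DEMO_PATH:+:${DEMO_PATH}}
SECOND_VALUE="$DEMO_PATH"
echo "$SECOND_VALUE"
```

The same idiom is used for LD_LIBRARY_PATH, which is often empty on a fresh system.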
On a 64-bit system, enter:
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
On a 32-bit system, enter:
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
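If you are not sure which of the two cases applies, the choice can be scripted. This is only a sketch: it assumes that getconf LONG_BIT is available on your system and reports the word size, and CUDA_DIR and CUDA_LIB_DIR are hypothetical variable names introduced for illustration.

```shell
# CUDA_DIR and CUDA_LIB_DIR are hypothetical names; the path is the one used in the text.
CUDA_DIR=/usr/local/cuda-9.0

# Assumption: getconf LONG_BIT prints 64 on a 64-bit system and 32 on a 32-bit one.
if [ "$(getconf LONG_BIT)" = "64" ]; then
    CUDA_LIB_DIR="$CUDA_DIR/lib64"
else
    CUDA_LIB_DIR="$CUDA_DIR/lib"
fi

# Prepend the chosen library directory, keeping any existing value.
export LD_LIBRARY_PATH="$CUDA_LIB_DIR${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
echo "Using CUDA libraries from $CUDA_LIB_DIR"
```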
If you need the libraries bundled with the models repository, also add it to PYTHONPATH:
export PYTHONPATH=$PYTHONPATH:/home/time/ImageNet/models-master
Running ResNet
The ResNet program is in the official/resnet directory.
Assume the ImageNet data is stored in
/media/time/20162AC5162A9BB2/Thunder/ImageNet_TF
Then, from that directory, run:
python imagenet_main.py --data_dir='/media/time/20162AC5162A9BB2/Thunder/ImageNet_TF' --batch_size=16 --model_dir='./model_101Res/' --resnet_size=101
The commands above can be collected into a shell script (this version trains ResNet-18 with a batch size of 256):
export PYTHONPATH=$PYTHONPATH:/home/time/ImageNet/models-master
export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
python imagenet_main.py --data_dir='/media/time/20162AC5162A9BB2/Thunder/ImageNet_TF' --batch_size=256 --model_dir='./modelChkPt2/' --resnet_size=18
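To make such a script easier to adjust, the settings can be pulled out into variables. The sketch below is only an illustration: DATA_DIR, MODEL_DIR, BATCH_SIZE, RESNET_SIZE and TRAIN_CMD are hypothetical names, and it prints the training command instead of running it, so it can be checked before starting a long job.

```shell
# Hypothetical variable names; the values are the ones used in the script above.
DATA_DIR='/media/time/20162AC5162A9BB2/Thunder/ImageNet_TF'
MODEL_DIR='./modelChkPt2/'
BATCH_SIZE=256
RESNET_SIZE=18

# Assemble the training command from the variables.
TRAIN_CMD="python imagenet_main.py --data_dir=$DATA_DIR --batch_size=$BATCH_SIZE --model_dir=$MODEL_DIR --resnet_size=$RESNET_SIZE"

# Dry run: print the command for inspection rather than executing it.
echo "$TRAIN_CMD"
```

Once the printed command looks right, run it from the official/resnet directory.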
Note the parameters that imagenet_main.py accepts:
flags:
imagenet_main.py:
-bs,--batch_size:
Batch size for training and evaluation. When using multiple gpus, this is the
global batch size for all devices. For example, if the batch size is 32 and
there are 4 GPUs, each GPU will get 8 examples on each step.
(default: '32')
(an integer)
--[no]clean:
If set, model_dir will be removed if it exists.
(default: 'false')
-dd,--data_dir:
The location of the input data.
(default: '/tmp')
-df,--data_format: <channels_first|channels_last>:
A flag to override the data format used in the model. channels_first provides a
performance boost on GPU but is not always compatible with CPU. If left
unspecified, the data format will be chosen automatically based on whether
TensorFlow was built for CPU or GPU.
-ebe,--epochs_between_evals:
The number of training epochs to run between evaluations.
(default: '1')
(an integer)
-ed,--export_dir:
If set, a SavedModel serialization of the model will be exported to this
directory at the end of training. See the README for more details and relevant
links.
-hk,--hooks:
A list of (case insensitive) strings to specify the names of training hooks.
Hook:
profilerhook
loggingtensorhook
examplespersecondhook
loggingmetrichook
Example: `--hooks ProfilerHook,ExamplesPerSecondHook`
See official.utils.logs.hooks_helper for details.
(default: 'LoggingTensorHook')
(a comma separated list)
-md,--model_dir:
The location of the model checkpoint files.
(default: '/tmp')
-rs,--resnet_size: <18|34|50|101|152|200>:
The size of the ResNet model to use.
(default: '50')
-rv,--resnet_version: <1|2>:
Version of ResNet. (1 or 2) See README.md for details.
(default: '2')
-te,--train_epochs:
The number of epochs used to train.
(default: '100')
(an integer)
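The batch_size description is worth a quick calculation, because the value you pass is split across all GPUs. A small sketch of the arithmetic from the help text; GLOBAL_BATCH_SIZE, NUM_GPUS and PER_GPU_BATCH are hypothetical names, and the divisibility warning is my own precaution rather than something stated in the help output:

```shell
# Hypothetical variable names; the numbers are the example from the help text.
GLOBAL_BATCH_SIZE=32
NUM_GPUS=4

# Precaution (an assumption, not from the help text): warn if the split is uneven.
if [ $((GLOBAL_BATCH_SIZE % NUM_GPUS)) -ne 0 ]; then
    echo "batch size $GLOBAL_BATCH_SIZE does not divide evenly across $NUM_GPUS GPUs" >&2
fi

# The global batch size divided by the number of GPUs gives the per-GPU share.
PER_GPU_BATCH=$((GLOBAL_BATCH_SIZE / NUM_GPUS))
echo "Each GPU processes $PER_GPU_BATCH examples per step"
```

With the batch size of 256 used in the script earlier, the same calculation gives 64 examples per GPU on 4 GPUs.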
Using TensorBoard
tensorboard --logdir=/home/time/ImageNet/models-master/official/resnet/model_101Res
This starts TensorBoard so that you can monitor the training run.
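The log directory is simply the model_dir from the training command, resolved against the official/resnet directory. A minimal sketch of deriving it, assuming the directory layout used in this post; RESNET_DIR, MODEL_DIR and LOGDIR are hypothetical names, and the command is printed rather than launched:

```shell
# Hypothetical variable names; the paths are the ones used earlier in the text.
RESNET_DIR=/home/time/ImageNet/models-master/official/resnet
MODEL_DIR=./model_101Res/

# basename drops the leading "./" and the trailing "/" from the relative model_dir.
LOGDIR="$RESNET_DIR/$(basename "$MODEL_DIR")"

# Print the command instead of starting TensorBoard.
echo "tensorboard --logdir=$LOGDIR"
```

By default TensorBoard serves its pages on port 6006, so once it is running you can open http://localhost:6006 in a browser.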