Reference: "Benchmarking State-of-the-Art Deep Learning Software Tools"
Software: Caffe, CNTK, TensorFlow, and Torch.
Software comparison:
Benchmark code: http://www.comp.hkbu.edu.hk/~chxw/dlbench/index.html
Caffe: Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, “Caffe: Convolutional architecture for fast feature embedding,” in Proceedings of the 22nd ACM International Conference on Multimedia, 2014, pp. 675–678.
CNTK: D. Yu, A. Eversole, M. Seltzer, K. Yao, Z. Huang, B. Guenter, O. Kuchaiev, Y. Zhang, F. Seide, H. Wang et al., “An introduction to computational networks and the computational network toolkit,” Tech. Rep. MSR, Microsoft Research, 2014. research.microsoft.com/apps/pubs
TensorFlow: M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin et al., “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015. Software available from tensorflow.org.
Torch: R. Collobert, K. Kavukcuoglu, and C. Farabet, “Torch7: A Matlab-like environment for machine learning,” in BigLearn, NIPS Workshop, no. EPFL-CONF-192376, 2011.
Matrix computation:
TensorFlow -> Eigen
Caffe, CNTK, Torch -> OpenBLAS
Conclusions:
(1) In general, none of the tools scales well on many-core CPUs: performance with 16 CPU cores is only slightly better than with 4 CPU cores.
(2) For FCNs and CNNs, all tools can achieve significant speedup by using contemporary GPUs. With GPUs, Caffe performs the best on FCNs while TensorFlow performs the best on CNNs.
(3) For RNNs, Torch and TensorFlow achieve much better performance than CNTK on GPU; on CPU, however, CNTK performs much better than Torch and TensorFlow.
(4) Among the three GPU platforms, GTX1080 always performs the best.
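Conclusion (1) is a statement about scaling, which is usually quantified as speedup and parallel efficiency. The following is a minimal sketch of that arithmetic, assuming you have measured the time per mini-batch at two core counts yourself. The function names are hypothetical, and the two timings are made-up placeholders, not results from the paper.

```python
def speedup(base_seconds: float, new_seconds: float) -> float:
    """Return how many times faster the new configuration is than the baseline."""
    if base_seconds <= 0 or new_seconds <= 0:
        raise ValueError("timings must be positive")
    return base_seconds / new_seconds


def parallel_efficiency(base_seconds: float, new_seconds: float,
                        base_cores: int, new_cores: int) -> float:
    """Return the measured speedup divided by the ideal speedup.

    The ideal speedup is the ratio of core counts, so a value of 1.0
    means perfect linear scaling and values near 0 mean poor scaling.
    """
    ideal = new_cores / base_cores
    return speedup(base_seconds, new_seconds) / ideal


# Hypothetical per-mini-batch times in seconds (placeholders, NOT from the paper).
t_4_cores = 1.00
t_16_cores = 0.80

s = speedup(t_4_cores, t_16_cores)
e = parallel_efficiency(t_4_cores, t_16_cores, base_cores=4, new_cores=16)
print(f"speedup: {s:.2f}x, parallel efficiency: {e:.1%}")
```

An efficiency far below 1.0 when going from 4 to 16 cores would correspond to the "only slightly better" behaviour described in conclusion (1).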