Machine Learning (1): Intelligent Pet Pedigree Recognition with TensorFlow

  • Hello TensorFlow
  • TensorFlow C library
  • TensorFlow Go binding

Hello TensorFlow

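People like to rank everything in a pecking order, and pets are no exception. Generally speaking, owning a purebred pet puts the owner at the top of that order, looking down on mixed-breed or stray animals. Professional certification bodies have even emerged that issue pedigree certificates. Yet consider the usual methods of purebred assessment: eye size and color, the shape of the nose, body length, tail characteristics, coat, and so on, along with some more mystical traits such as the personality and temperament of the pet's family line. Setting the "black magic" aside, since the assessment is based on physical appearance, deciding whether a pet is purebred is essentially an image recognition task.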

TensorFlow is not a machine-learning-specific library; rather, it is a general-purpose computation library that represents computations as graphs.
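To make the "computations as graphs" idea concrete, here is a minimal sketch in plain Go. It does not use the TensorFlow API at all: the Node type and the Const, Add, Mul, and Eval helpers are hypothetical names invented for this illustration. The point is only that the graph is built first and evaluated in a separate step.

```go
package main

import "fmt"

// Node is one vertex of a toy dataflow graph: either a constant
// or an operation applied to the values of its input nodes.
type Node struct {
	Op     string  // "const", "add", or "mul"
	Value  float64 // used only when Op == "const"
	Inputs []*Node
}

// Const, Add, and Mul only build graph nodes; nothing is computed yet.
func Const(v float64) *Node { return &Node{Op: "const", Value: v} }
func Add(a, b *Node) *Node  { return &Node{Op: "add", Inputs: []*Node{a, b}} }
func Mul(a, b *Node) *Node  { return &Node{Op: "mul", Inputs: []*Node{a, b}} }

// Eval walks the graph recursively and computes the value of n.
func Eval(n *Node) (float64, error) {
	switch n.Op {
	case "const":
		return n.Value, nil
	case "add", "mul":
		a, err := Eval(n.Inputs[0])
		if err != nil {
			return 0, err
		}
		b, err := Eval(n.Inputs[1])
		if err != nil {
			return 0, err
		}
		if n.Op == "add" {
			return a + b, nil
		}
		return a * b, nil
	}
	return 0, fmt.Errorf("unknown op %q", n.Op)
}

func main() {
	// Build (2 + 3) * 4 first, then evaluate it in a separate step.
	g := Mul(Add(Const(2), Const(3)), Const(4))
	result, err := Eval(g)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(result) // prints 20
}
```

In TensorFlow the same separation shows up as building a graph and then running it in a session.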

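TensorFlow is an open-source software library (Apache 2.0 license) originally developed by the Google Brain team. It provides a collection of algorithms, models, and programming interfaces that let us quickly build an intelligent service based on machine learning. Developers currently have four programming interfaces to choose from:

  • C++ source code: the TensorFlow core is written in C++ and supports operations at every level, from high to low;
  • Python bindings & Python library: the counterpart to the C++ implementation, allowing Python to call the C++ functions;
  • Java bindings;
  • Go binding.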

Here is a simple example:

[Image unavailable: upload failed]

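Environment Setup

  • Install the TensorFlow C library, which consists of a header file, c_api.h, and libtensorflow.so
wget https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-cpu-linux-x86_64-1.5.0.tar.gz

## Alternatively, parameterize the download and unpack it in one step
TF_TYPE="cpu" # Change to "gpu" for GPU support
TF_VERSION='1.5.0'
TARGET_DIRECTORY='/usr/local' # Assumed install prefix; adjust as needed
curl -L \
  "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-$(go env GOOS)-x86_64-${TF_VERSION}.tar.gz" |
sudo tar -C "${TARGET_DIRECTORY}" -xz
sudo ldconfig # Refresh the linker cache so libtensorflow.so can be found
## Install the Go binding
go get github.com/tensorflow/tensorflow/tensorflow/go
go get github.com/tensorflow/tensorflow/tensorflow/go/op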
  • Download the demo model, which contains a label file (imagenet_comp_graph_label_strings.txt) and a graph file (tensorflow_inception_graph.pb)
mkdir model
wget https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip -O model/inception.zip
unzip model/inception.zip -d model
chmod -R 777 model

TensorFlow Model Function

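// Imports used by loadModel
import (
  "bufio"
  "io/ioutil"
  "os"

  tf "github.com/tensorflow/tensorflow/tensorflow/go"
)

// Package-level state shared with the request handler
// (declarations assumed; the original listing omits them)
var (
  graph  *tf.Graph
  labels []string
)

// loadModel reads the Inception graph and its label file from ./model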
func loadModel() error {
  // Load inception model
  model, err := ioutil.ReadFile("./model/tensorflow_inception_graph.pb")
  if err != nil {
    return err
  }
  graph = tf.NewGraph()
  if err := graph.Import(model, ""); err != nil {
    return err
  }
  // Load labels
  labelsFile, err := os.Open("./model/imagenet_comp_graph_label_strings.txt")
  if err != nil {
    return err
  }
  defer labelsFile.Close()
  scanner := bufio.NewScanner(labelsFile)
  // Labels are separated by newlines
  for scanner.Scan() {
    labels = append(labels, scanner.Text())
  }
  if err := scanner.Err(); err != nil {
    return err
  }
  return nil
}

Classification Workflow

Image recognition with a TensorFlow model follows three main steps:

  • Image conversion (convert to tensor)
  • Image normalization (normalize)
  • Image classification (classify)
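func recognizeHandler(w http.ResponseWriter, r *http.Request, _ httprouter.Params) {
  // Read image
  imageFile, header, err := r.FormFile("image")
  // Check the error before touching header, which is nil on failure
  if err != nil {
    responseError(w, "Could not read image", http.StatusBadRequest)
    return
  }
  defer imageFile.Close()
  // Will contain filename and extension
  imageName := strings.Split(header.Filename, ".")
  var imageBuffer bytes.Buffer
  // Copy image data to a buffer
  if _, err := io.Copy(&imageBuffer, imageFile); err != nil {
    responseError(w, "Could not read image", http.StatusBadRequest)
    return
  }

  // ...

  // Pass the extension (the last element), which selects the PNG or JPEG decoder
  imageFormat := strings.ToLower(imageName[len(imageName)-1])
  tensor, err := makeTensorFromImage(&imageBuffer, imageFormat)
  if err != nil {
    responseError(w, "Invalid image", http.StatusBadRequest)
    return
  }

  // ...
}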
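Since makeTransformImageGraph() chooses its decoder by comparing the format string with "png", the handler needs the file extension rather than the base name. As a minimal sketch, the extension could also be extracted with path/filepath instead of splitting on dots. The imageFormat helper below is a hypothetical name and is not part of the original service.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// imageFormat returns the lower-cased file extension without the dot,
// or "" when the name has no extension. (Hypothetical helper.)
func imageFormat(filename string) string {
	ext := filepath.Ext(filename) // ".png", ".JPG", or ""
	return strings.ToLower(strings.TrimPrefix(ext, "."))
}

func main() {
	for _, name := range []string{"IMG_3560.png", "holiday.final.JPG", "noext"} {
		fmt.Printf("%q -> %q\n", name, imageFormat(name))
	}
}
```

A filename is only a hint supplied by the client, so a stricter service might inspect the file contents instead; that is outside the scope of this sketch.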

The function makeTensorFromImage() runs an image tensor through the normalization graph.

func makeTensorFromImage(imageBuffer *bytes.Buffer, imageFormat string) (*tf.Tensor, error) {
  tensor, err := tf.NewTensor(imageBuffer.String())
  if err != nil {
    return nil, err
  }
  graph, input, output, err := makeTransformImageGraph(imageFormat)
  if err != nil {
    return nil, err
  }
  session, err := tf.NewSession(graph, nil)
  if err != nil {
    return nil, err
  }
  defer session.Close()
  normalized, err := session.Run(
    map[tf.Output]*tf.Tensor{input: tensor},
    []tf.Output{output},
    nil)
  if err != nil {
    return nil, err
  }
  return normalized[0], nil
}

The function makeTransformImageGraph() builds a graph that resizes the image to 224x224 pixels and normalizes its pixel values to match the model's input requirements.

func makeTransformImageGraph(imageFormat string) (graph *tf.Graph, input, output tf.Output, err error) {
  const (
    H, W  = 224, 224
    Mean  = float32(117)
    Scale = float32(1)
  )
  s := op.NewScope()
  input = op.Placeholder(s, tf.String)
  // Decode PNG or JPEG
  var decode tf.Output
  if imageFormat == "png" {
    decode = op.DecodePng(s, input, op.DecodePngChannels(3))
  } else {
    decode = op.DecodeJpeg(s, input, op.DecodeJpegChannels(3))
  }
  // Div and Sub perform (value-Mean)/Scale for each pixel
  output = op.Div(s,
    op.Sub(s,
      // Resize to 224x224 with bilinear interpolation
      op.ResizeBilinear(s,
        // Create a batch containing a single image
        op.ExpandDims(s,
          // Use decoded pixel values
          op.Cast(s, decode, tf.Float),
          op.Const(s.SubScope("make_batch"), int32(0))),
        op.Const(s.SubScope("size"), []int32{H, W})),
      op.Const(s.SubScope("mean"), Mean)),
    op.Const(s.SubScope("scale"), Scale))
  graph, err = s.Finalize()
  return graph, input, output, err
}
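The arithmetic performed by the Sub and Div ops is simple enough to check by hand. The sketch below reproduces (value - Mean) / Scale in plain Go with Mean = 117 and Scale = 1, as in the graph above. The normalize function and the sample pixel values are made up for illustration, and the resize step is not reproduced. In the real graph the same operation is applied element-wise to a tensor of shape [1, 224, 224, 3].

```go
package main

import "fmt"

// normalize applies (value - mean) / scale to every pixel value,
// mirroring what the Sub and Div ops in the graph do element-wise.
func normalize(pixels []float32, mean, scale float32) []float32 {
	out := make([]float32, len(pixels))
	for i, p := range pixels {
		out[i] = (p - mean) / scale
	}
	return out
}

func main() {
	// One RGB pixel with made-up channel values in the 0-255 range.
	pixel := []float32{0, 117, 255}
	fmt.Println(normalize(pixel, 117, 1)) // prints [-117 0 138]
}
```

With Scale = 1 the division is a no-op, so the graph effectively only subtracts the mean from each channel value.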

Finally, the normalized image tensor is fed into the Inception model graph for inference.

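session, err := tf.NewSession(graph, nil)
if err != nil {
  // Report the failure to the client; log.Fatal here would stop the whole server
  responseError(w, "Could not create session", http.StatusInternalServerError)
  return
}
defer session.Close()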
output, err := session.Run(
  map[tf.Output]*tf.Tensor{
    graph.Operation("input").Output(0): tensor,
  },
  []tf.Output{
    graph.Operation("output").Output(0),
  },
  nil)
if err != nil {
  responseError(w, "Could not run inference", http.StatusInternalServerError)
  return
}
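The listing stops after the inference call, so the step that turns raw probabilities into the ranked labels of the JSON response is not shown. Below is a minimal sketch of how that ranking might look, assuming the model output has already been converted to a []float32 whose entries line up with the labels slice. The LabelResult type and the topK function are hypothetical names, not taken from the original code.

```go
package main

import (
	"fmt"
	"sort"
)

// LabelResult pairs a label with its probability. The field names are
// chosen to resemble the JSON response, but the type is hypothetical.
type LabelResult struct {
	Label       string
	Probability float32
	Percent     string
}

// topK returns the k most probable labels, highest first.
// probabilities[i] is assumed to belong to labels[i].
func topK(probabilities []float32, labels []string, k int) []LabelResult {
	n := len(probabilities)
	if len(labels) < n {
		n = len(labels)
	}
	results := make([]LabelResult, 0, n)
	for i := 0; i < n; i++ {
		p := probabilities[i]
		results = append(results, LabelResult{
			Label:       labels[i],
			Probability: p,
			Percent:     fmt.Sprintf("%.2f%%", p*100),
		})
	}
	sort.Slice(results, func(a, b int) bool {
		return results[a].Probability > results[b].Probability
	})
	if k < 0 {
		k = 0
	}
	if k > len(results) {
		k = len(results)
	}
	return results[:k]
}

func main() {
	labels := []string{"redshank", "black swan", "oystercatcher"}
	probs := []float32{0.0010183558, 0.98746836, 0.0040768473}
	for _, r := range topK(probs, labels, 2) {
		fmt.Printf("%s %s\n", r.Label, r.Percent)
	}
}
```

Sorting the full result list is simple but does more work than needed when only five labels are returned; for roughly a thousand labels the difference is negligible.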

Testing

func main() {
  // Load the model and labels once at startup; the handler depends on them
  if err := loadModel(); err != nil {
    log.Fatal(err)
  }
  r := httprouter.New()
  r.POST("/recognize", recognizeHandler)
  // ListenAndServe only returns when the server fails
  log.Fatal(http.ListenAndServe(":8080", r))
}
Recognition example: black swan
$ curl localhost:8080/recognize -F 'image=@../data/IMG_3560.png'
{
  "filename":"IMG_3560.png",
  "labels":[
    {"label":"black swan","probability":0.98746836,"Percent":"98.75%"},
    {"label":"oystercatcher","probability":0.0040768473,"Percent":"0.41%"},
    {"label":"American coot","probability":0.002185003,"Percent":"0.22%"},
    {"label":"black stork","probability":0.0011524856,"Percent":"0.12%"},
    {"label":"redshank","probability":0.0010183558,"Percent":"0.10%"}]
}
[Image: IMG_3560.png]
[Image: IMG_3608.png]

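From the example above we can see that the service assigns the black swan image a probability of 98.75%, which is very accurate. For the other two pet dog images, however, the highest probability was only about 30%. Although they were not misidentified as cats or wolves, the result is still some way from what a usable service requires (ignoring for now the complexity of the species themselves). The main reason is that we are still using a very "raw" model. To serve a niche domain (pets, or anything else), the model needs to be improved through training (Training Models), given a richer set of labels, or replaced with a more suitable model. Training has its own pitfalls, of course: poor-quality samples, mislabeled samples, and various kinds of noise all reduce accuracy.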

Further Reading

We know that label 866 (military uniform) should be the top label for the Admiral Hopper image.

Further reading: The Machine Learning Master
