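Overview
The NLC (Natural Language Classifier) service uses machine learning algorithms to return the best-matching predefined classes for short text input. You create and train a classifier that links predefined classes to example texts, so that the service can then apply those classes to new input.
Authentication
The service uses HTTP Basic Authentication, i.e. a username and password.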
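As a rough illustration of what `curl -u` does with those credentials, the sketch below builds the same `Authorization` header by hand using only the Python standard library. The helper name `basic_auth_header` is made up for this note, and `USERNAME` / `PASSWORD` are placeholders, not real credentials.

```python
import base64

def basic_auth_header(username: str, password: str) -> dict:
    """Build the header that `curl -u username:password` would send."""
    # Basic auth is just base64("username:password") prefixed with "Basic ".
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}

# Placeholder credentials, matching the curl examples in this note.
headers = basic_auth_header("USERNAME", "PASSWORD")
print(headers)
```

Because base64 is an encoding and not encryption, the credentials are only protected by the HTTPS connection itself.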
Creating a classifier
CURL command
curl -u "USERNAME":"PASSWORD" ^
-F training_data=@weather_data_train.csv ^
-F training_metadata="{\"language\":\"en\",\"name\":\"atp-weather\"}" ^
"https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers"
Response
{
"classifier_id" : "359f3fx202-nlc-223328",
"name" : "atp-weather",
"language" : "en",
"created" : "2017-07-25T03:20:16.451Z",
"url" : "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328",
"status" : "Training",
"status_description" : "The classifier instance is in its training phase, not yet ready to accept classify requests"
}
**Note: the classifier's status is Training at this point, so it cannot be used yet. Its status can be checked with a command.**
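Since training takes a while, a client would typically poll the status until it is no longer `Training`. The following is only a sketch of that loop: `get_status` is a hypothetical callable that would wrap the "get classifier information" request and return the `status` field, and the interval and retry limit are arbitrary assumptions, not values recommended by the service.

```python
import time
from typing import Callable

def wait_until_trained(get_status: Callable[[], str],
                       interval_s: float = 30.0,
                       max_checks: int = 20) -> str:
    """Poll until the status leaves "Training", or give up after max_checks."""
    status = get_status()
    checks = 1
    while status == "Training" and checks < max_checks:
        time.sleep(interval_s)
        status = get_status()
        checks += 1
    return status

# Stand-in for real HTTP calls: two "Training" answers, then "Available".
fake_statuses = iter(["Training", "Training", "Available"])
final = wait_until_trained(lambda: next(fake_statuses), interval_s=0)
print(final)
```

Passing the status lookup in as a function keeps the loop independent of any particular HTTP client.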
Listing classifiers
CURL command
curl -u "USERNAME":"PASSWORD" ^
"https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers"
Response
{
"classifiers" : [ {
"classifier_id" : "359f3fx202-nlc-223328",
"url" : "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328",
"name" : "atp-weather",
"language" : "en",
"created" : "2017-07-25T03:20:16.451Z"
} ]
}
Getting classifier information
CURL command
curl -u "USERNAME":"PASSWORD" ^
"https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328"
Response
{
"classifier_id" : "359f3fx202-nlc-223328",
"name" : "atp-weather",
"language" : "en",
"created" : "2017-07-25T03:20:16.451Z",
"url" : "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328",
"status" : "Available",
"status_description" : "The classifier instance is now available and is ready to take classifier requests."
}
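A classifier can be in one of the following five states:
- 1 Non Existent: the classifier does not exist
- 2 Training: the classifier is being trained
- 3 Failed: training failed
- 4 Available: ready to classify
- 5 Unavailable: not usable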
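The five states above can be modelled as an enumeration so that client code checks the status before sending classify requests. This is a minimal sketch: the names `ClassifierStatus` and `can_classify` are invented for this note, and the string values are taken from the `status` fields and the list shown above.

```python
from enum import Enum

class ClassifierStatus(Enum):
    NON_EXISTENT = "Non Existent"
    TRAINING = "Training"
    FAILED = "Failed"
    AVAILABLE = "Available"
    UNAVAILABLE = "Unavailable"

def can_classify(status_text: str) -> bool:
    """Return True only when the reported status is "Available".

    Raises ValueError if the text is not one of the five known states.
    """
    return ClassifierStatus(status_text) is ClassifierStatus.AVAILABLE

print(can_classify("Training"))
print(can_classify("Available"))
```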
Classifying text with the classifier
CURL command
- Classifying with the GET method: How hot will it be today?
curl -G -u "USERNAME":"PASSWORD" ^
"https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328/classify?text=How%20hot%20will%20it%20be%20today%3F"
- Classifying with the POST method: How hot will it be today?
curl -X POST -u "USERNAME":"PASSWORD" ^
-H "Content-Type:application/json" ^
-d "{\"text\":\"How hot will it be today?\"}" ^
"https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328/classify"
Response
{
"classifier_id" : "359f3fx202-nlc-223328",
"url" : "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328",
"text" : "How hot will it be today?",
"top_class" : "temperature",
"classes" : [ {
"class_name" : "temperature",
"confidence" : 0.9929586035651006
}, {
"class_name" : "conditions",
"confidence" : 0.007041396434899482
} ]
}
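For reference, the percent-encoded query string in the GET request and the interesting fields of the response can both be handled with the Python standard library. The sketch below makes no network call; `classify_url` and `top_class` are helper names made up for this note, and the sample body is a trimmed copy of the response above.

```python
import json
from urllib.parse import quote

BASE = "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers"

def classify_url(classifier_id: str, text: str) -> str:
    """Build the GET classify URL with the text percent-encoded."""
    # safe="" so that every reserved character, including "?", is encoded.
    return f"{BASE}/{classifier_id}/classify?text={quote(text, safe='')}"

def top_class(response_body: str) -> tuple:
    """Return (class_name, confidence) of the highest-confidence class."""
    data = json.loads(response_body)
    best = max(data["classes"], key=lambda c: c["confidence"])
    return best["class_name"], best["confidence"]

# Trimmed copy of the response shown above.
sample = """{
  "text": "How hot will it be today?",
  "top_class": "temperature",
  "classes": [
    {"class_name": "temperature", "confidence": 0.9929586035651006},
    {"class_name": "conditions", "confidence": 0.007041396434899482}
  ]
}"""

print(classify_url("359f3fx202-nlc-223328", "How hot will it be today?"))
print(top_class(sample))
```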
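Using a word that does not appear in the classifier's training data ("sleet").
The request deliberately uses the sentence pattern found in the temperature class: How xxx will it be today?
The classifier still correctly assigned it to the conditions class.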
curl -X POST -u "username":"password" ^
-H "Content-Type:application/json" ^
-d "{\"text\":\"How sleet will it be today?\"}" ^
"https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328/classify"
Response
{
"classifier_id" : "359f3fx202-nlc-223328",
"url" : "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328",
"text" : "How sleet will it be today?",
"top_class" : "conditions",
"classes" : [ {
"class_name" : "conditions",
"confidence" : 0.89688785244637
}, {
"class_name" : "temperature",
"confidence" : 0.10311214755363002
} ]
}
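Using text that is completely unrelated to the classifier: it is atp's notebook?
The result is far from ideal: the confidence for the temperature class is more than 82%.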
curl -X POST -u "USERNAME":"PASSWORD" ^
-H "Content-Type:application/json" ^
-d "{\"text\":\"it is atp's notebook?\"}" ^
"https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328/classify"
Response
{
"classifier_id" : "359f3fx202-nlc-223328",
"url" : "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328",
"text" : "it is atp's notebook?",
"top_class" : "temperature",
"classes" : [ {
"class_name" : "temperature",
"confidence" : 0.8255246180698945
}, {
"class_name" : "conditions",
"confidence" : 0.1744753819301055
} ]
}
Deleting a classifier
CURL command
curl -X DELETE -u "{username}":"{password}" ^
"https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/10D41B-nlc-1"
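Key points
- The confidence value represents a percentage (it is returned as a decimal between 0 and 1); the higher the value, the higher the confidence. A response contains at most 10 classes.
- If the training data contains fewer than 10 classes, the confidence values sum to 100%. For example, if only two classes are defined, only two classes can be returned.
- One of the sample questions contains a word ("foggy") that the classifier was not trained on. You do not need to do any extra work to identify such "missing" words; the classifier still scores well on them. Try other questions that contain words not in the training data, such as "sleet" or "storm".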
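The claim that the confidences sum to 100% when fewer than 10 classes exist can be checked against the three responses recorded in this note. The sketch below copies the confidence values from those responses; the helper name `sums_to_one` and the tolerance are assumptions chosen only to absorb floating-point rounding.

```python
import math

# Confidence values copied from the three classify responses in this note.
responses = {
    "How hot will it be today?": [0.9929586035651006, 0.007041396434899482],
    "How sleet will it be today?": [0.89688785244637, 0.10311214755363002],
    "it is atp's notebook?": [0.8255246180698945, 0.1744753819301055],
}

def sums_to_one(confidences, tol=1e-9):
    """True if the confidences add up to 1.0 within a small tolerance."""
    return math.isclose(sum(confidences), 1.0, abs_tol=tol)

for text, confs in responses.items():
    print(text, sums_to_one(confs))
```

This also illustrates why the unrelated "notebook" question still gets a high score: with only two classes, the classifier has to split the full 100% between them.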
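Open questions
- 1 Which languages are supported besides en?
- 2 Is the training data format fixed to CSV? Is the layout of the CSV itself also fixed?
- 3 Can training data be added after the classifier has been created?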