1.0 Theory
Entropy
Conditional entropy
Information gain
Information gain ratio
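To make the four terms above concrete, here is a minimal sketch in plain Python. It assumes the usual textbook definitions with base-2 logarithms: H(Y) = -sum p*log2(p), H(Y|X) = sum P(X=x)*H(Y|X=x), gain = H(Y) - H(Y|X), and gain ratio = gain / H(X). The function names and the toy data are made up for illustration and are not part of sklearn.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Empirical entropy H(Y) in bits of a sequence of labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def conditional_entropy(feature, labels):
    """H(Y|X): entropy of the labels within each feature value, weighted by P(X=x)."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    return sum(len(ys) / n * entropy(ys) for ys in groups.values())


def information_gain(feature, labels):
    """Reduction in label entropy obtained by splitting on the feature."""
    return entropy(labels) - conditional_entropy(feature, labels)


def information_gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0  # the feature has a single value, so it cannot split anything
    return information_gain(feature, labels) / split_info


# Hypothetical toy data: one categorical feature and a binary label.
feature = ["a", "a", "b", "b"]
labels = ["no", "yes", "yes", "yes"]

print("entropy:", entropy(labels))
print("conditional entropy:", conditional_entropy(feature, labels))
print("information gain:", information_gain(feature, labels))
print("information gain ratio:", information_gain_ratio(feature, labels))
```

Note that `DecisionTreeClassifier` uses the Gini criterion by default; passing `criterion="entropy"` makes it split on an entropy-based measure instead.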
2.0 sklearn.tree
First of all, the getting-started code given on http://scikit-learn.org is broken...
from sklearn.datasets import load_iris
from sklearn import tree
from io import StringIO  # the docs imported this from sklearn.externals.six, which newer scikit-learn releases no longer ship
import pydot
dot_data = StringIO()
iris = load_iris()
clf = tree.DecisionTreeClassifier()
clf = clf.fit(iris.data, iris.target)
tree.export_graphviz(clf, out_file=dot_data)
graph = pydot.graph_from_dot_data(dot_data.getvalue())
graph.write_pdf("iris.pdf")
Pasting it in as-is, the first error reported is:
AttributeError: 'list' object has no attribute 'write_pdf'
I couldn't help sinking into deep thought...
Then Stack Overflow told me that pydot has been upgraded and that I should use the plus version...
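The error message itself hints at what changed: `graph_from_dot_data` apparently handed back a list rather than a single graph, which is what newer pydot releases do as far as I know. For anyone who would rather stay on pydot, a minimal sketch of a workaround is to take the first element. The helper name `first_graph` is made up for illustration.

```python
def first_graph(result):
    """Return one graph whether the parser gave back a single object or a list."""
    if isinstance(result, (list, tuple)):
        if not result:
            raise ValueError("no graph was parsed from the DOT data")
        return result[0]
    return result


# Intended use with the listing above (assumes pydot is installed):
#   graph = first_graph(pydot.graph_from_dot_data(dot_data.getvalue()))
#   graph.write_pdf("iris.pdf")
```

The isinstance check keeps the same call working on versions that return a single object, so the script does not depend on which behaviour is installed.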
So, without further ado, pydotplus it is!
Sure enough, the error changed! (I knew it wouldn't go that smoothly...)
InvocationException: GraphViz's executables not found
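Back to Google in a hurry, and this time Stack Overflow told me: kid, you either haven't installed GraphViz or haven't set up your environment!
Oh, so that's it! Time to install GraphViz~
Search for a GraphViz installation walkthrough, install it, and restart the IDE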
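Before rerunning everything, it can save a round trip to check whether the GraphViz `dot` executable is actually visible from Python. A minimal sketch using only the standard library; the helper name `find_graphviz_dot` is made up for illustration.

```python
import shutil


def find_graphviz_dot():
    """Return the full path of the Graphviz `dot` executable, or None if it is not on PATH."""
    return shutil.which("dot")


dot_path = find_graphviz_dot()
if dot_path is None:
    print("dot was not found on PATH; GraphViz is missing or its bin directory is not configured")
else:
    print(f"GraphViz dot found at {dot_path}")
```

If the check still fails after installing, one option is to append the GraphViz `bin` directory to `os.environ["PATH"]` at the top of the script; the exact directory depends on where GraphViz was installed.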
from sklearn.datasets import load_iris
from sklearn import tree
from io import StringIO  # sklearn.externals.six is no longer shipped with newer scikit-learn releases
from IPython.display import Image
import numpy as np
import pandas as pd
import os
import pydotplus
# Fit a decision tree on the iris data
iris = load_iris()
test = tree.DecisionTreeClassifier()
test = test.fit(iris.data, iris.target)
# Export the fitted tree as DOT text into an in-memory buffer
dot_data = StringIO()
tree.export_graphviz(test, out_file=dot_data)
# Parse the DOT text into a pydotplus graph object
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())
Perfect~
Next up: figuring out how to actually produce the image....
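One possible way to get an image out, as a minimal sketch rather than the definitive answer: hand the DOT text straight to the GraphViz `dot` command line tool. This swaps pydotplus for a plain subprocess call and assumes `dot` is on PATH. The helper name `render_dot` and the tiny DOT string are made up for illustration.

```python
import os
import shutil
import subprocess
import tempfile


def render_dot(dot_source, out_path, fmt="svg"):
    """Render DOT text to out_path with the Graphviz CLI; return None if `dot` is missing."""
    dot_exe = shutil.which("dot")
    if dot_exe is None:
        return None
    # `dot` reads the graph from stdin when no input file is given.
    subprocess.run(
        [dot_exe, f"-T{fmt}", "-o", out_path],
        input=dot_source,
        text=True,
        check=True,
    )
    return out_path


# Hypothetical stand-in for the exported tree; in the listing above
# the DOT text would be dot_data.getvalue().
example_dot = "digraph G { root -> left; root -> right; }"
target = os.path.join(tempfile.mkdtemp(), "tree.svg")
rendered = render_dot(example_dot, target)
print("rendered to:", rendered)
```

With the pydotplus graph built above, the analogous call would be `graph.write_pdf("iris.pdf")`, as in the first listing, which needs the same GraphViz executables to be installed.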