1.4. Support Vector Machines
Support vector machines (SVMs) are a set of supervised learning methods used for classification, regression and outlier detection.
The advantages of support vector machines are:
Effective in high-dimensional spaces.
Still effective in cases where the number of dimensions is greater than the number of samples.
Uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.
Versatile: different kernel functions can be specified for the decision function. Common kernels are provided, but it is also possible to specify custom kernels (see the sketch after this list).
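As a rough illustration of the kernel flexibility mentioned above, the sketch below fits SVC once with a built-in kernel selected by name and once with a custom kernel passed as a callable; the iris data and the linear Gram-matrix callable are placeholders for illustration only, not recommended choices:

    import numpy as np
    from sklearn import datasets
    from sklearn.svm import SVC

    X, y = datasets.load_iris(return_X_y=True)

    # Built-in kernel selected by name.
    rbf_clf = SVC(kernel="rbf").fit(X, y)

    # Custom kernel: any callable returning the Gram matrix of shape
    # (n_samples_X, n_samples_Y) can be passed instead of a kernel name.
    def linear_gram(X, Y):
        return np.dot(X, Y.T)

    custom_clf = SVC(kernel=linear_gram).fit(X, y)
    print(rbf_clf.score(X, y), custom_clf.score(X, y))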
The disadvantages of support vector machines include:
If the number of features is much greater than the number of samples, the method is likely to give poor performances.
SVMs do not directly provide probability estimates; these are calculated using an expensive five-fold cross-validation (see Scores and probabilities, below, and the sketch after this list).
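To make the probability point above concrete, the sketch below (again using the iris data purely as a placeholder) enables the cross-validated probability calibration with probability=True; fitting therefore becomes noticeably more expensive than relying on the uncalibrated decision_function:

    from sklearn import datasets
    from sklearn.svm import SVC

    X, y = datasets.load_iris(return_X_y=True)

    # probability=True turns on the internal cross-validation that
    # calibrates predict_proba, at a significant extra fitting cost.
    clf = SVC(probability=True).fit(X, y)

    print(clf.predict_proba(X[:2]))      # calibrated class probabilities
    print(clf.decision_function(X[:2]))  # raw margin scores, always available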
The support vector machines in scikit-learn support both dense (numpy.ndarray and convertible to that by numpy.asarray) and sparse (any scipy.sparse) sample vectors as input. However, to use an SVM to make predictions for sparse data, it must have been fit on such data. For optimal performance, use C-ordered numpy.ndarray (dense) or scipy.sparse.csr_matrix (sparse) with dtype=float64.
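A minimal sketch of the dense and sparse input handling described above, using random data only for illustration; note that predicting on sparse data requires a model that was also fit on sparse data:

    import numpy as np
    from scipy import sparse
    from sklearn.svm import SVC

    rng = np.random.RandomState(0)

    # C-ordered float64 ndarray (dense input).
    X_dense = np.ascontiguousarray(rng.rand(20, 5), dtype=np.float64)
    y = rng.randint(0, 2, size=20)

    # CSR matrix (sparse input).
    X_sparse = sparse.csr_matrix(X_dense)

    clf_dense = SVC().fit(X_dense, y)
    clf_sparse = SVC().fit(X_sparse, y)

    print(clf_dense.predict(X_dense[:3]))
    print(clf_sparse.predict(X_sparse[:3]))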
1.4.1. Classification