1.4. Support Vector Machines
Support vector machines (SVMs) are a set of supervised learning methods used for classification, regression and outlier detection.
The advantages of support vector machines are:
Effective in high dimensional spaces.
Still effective in cases where number of dimensions is greater than the number of samples.
Uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.
Versatile: different kernel functions can be specified for the decision function. Common kernels are provided, but it is also possible to specify custom kernels (see the sketch after this list).
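For illustration, a minimal sketch of both options follows: selecting a built-in kernel by name, and passing a custom kernel as a callable that returns the Gram matrix between two sets of samples. The toy data and the linear_kernel helper are hypothetical and serve only as an example.

    >>> import numpy as np
    >>> from sklearn import svm
    >>> X = np.array([[0., 0.], [1., 1.]])
    >>> y = [0, 1]
    >>> # Built-in kernel selected by name.
    >>> clf = svm.SVC(kernel='rbf').fit(X, y)
    >>> # Custom kernel: a callable that returns the Gram matrix.
    >>> def linear_kernel(A, B):
    ...     return np.dot(A, B.T)
    ...
    >>> clf_custom = svm.SVC(kernel=linear_kernel).fit(X, y)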
The disadvantages of support vector machines include:
If the number of features is much greater than the number of samples, the method is likely to give poor performance.
SVMs do not directly provide probability estimates; these are calculated using an expensive five-fold cross-validation (see Scores and probabilities, below, and the sketch after this list).
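To illustrate the point above, probability estimates can be requested with the probability constructor parameter, which triggers the extra cross-validation described above at fit time; the use of the iris dataset below is only an example.

    >>> from sklearn import datasets, svm
    >>> X, y = datasets.load_iris(return_X_y=True)
    >>> # probability=True enables the extra (costly) estimation step at fit time.
    >>> clf = svm.SVC(probability=True).fit(X, y)
    >>> proba = clf.predict_proba(X[:1])  # per-class probability estimates
    >>> proba.shape
    (1, 3)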
The support vector machines in scikit-learn support both dense (numpy.ndarray and convertible to that by numpy.asarray) and sparse (any scipy.sparse) sample vectors as input. However, to use an SVM to make predictions for sparse data, it must have been fit on such data. For optimal performance, use C-ordered numpy.ndarray (dense) or scipy.sparse.csr_matrix (sparse) with dtype=float64.
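A minimal sketch with purely illustrative toy data: one model is fit on a C-ordered float64 array, and another is both fit and queried with a scipy.sparse.csr_matrix, since predictions on sparse data require a model fit on sparse data.

    >>> import numpy as np
    >>> import scipy.sparse as sp
    >>> from sklearn import svm
    >>> X_dense = np.array([[0, 0], [1, 1], [2, 0], [3, 1]], dtype=np.float64, order='C')
    >>> y = [0, 1, 0, 1]
    >>> clf_dense = svm.SVC().fit(X_dense, y)
    >>> # A model that will predict on sparse data should also be fit on sparse data.
    >>> X_sparse = sp.csr_matrix(X_dense)
    >>> clf_sparse = svm.SVC().fit(X_sparse, y)
    >>> predictions = clf_sparse.predict(X_sparse[:2])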
1.4.1. Classification