relative entropy
Measures how one probability distribution diverges from another (a second, reference distribution).
for discrete probability distributions P and Q defined on the same sample space:

    D_{\mathrm{KL}}(P \| Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}
for continuous random variables with density functions p and q:

    D_{\mathrm{KL}}(P \| Q) = \int_{-\infty}^{\infty} p(x) \log \frac{p(x)}{q(x)} \, dx
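A quick worked example with made-up numbers (not from the original note): take P = (1/2, 1/2) and Q = (3/4, 1/4) on a two-point space, then

    D_{\mathrm{KL}}(P \| Q) = \tfrac{1}{2}\log\tfrac{1/2}{3/4} + \tfrac{1}{2}\log\tfrac{1/2}{1/4} = \tfrac{1}{2}\log\tfrac{2}{3} + \tfrac{1}{2}\log 2 \approx 0.144 \ \text{nats}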
Literally it sounds like a kind of distance, but it is not a distance in the usual sense. A distance (metric) as we normally understand it has these properties:
1. Non-negativity: a distance is a magnitude, so it is never negative.
2. Symmetry: the distance from A to B equals the distance from B to A.
3. Triangle inequality: the sum of any two sides is greater than the third side.
KL divergence only satisfies the first property, non-negativity; it is neither symmetric nor does it obey the triangle inequality (see the check below).
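To see the asymmetry concretely, swap the two made-up distributions from the worked example above:

    D_{\mathrm{KL}}(Q \| P) = \tfrac{3}{4}\log\tfrac{3/4}{1/2} + \tfrac{1}{4}\log\tfrac{1/4}{1/2} \approx 0.131 \ \text{nats} \neq 0.144 \ \text{nats} \approx D_{\mathrm{KL}}(P \| Q)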
# KL divergence (and any other such measure) expects the input data to sum to 1
1. Implement it directly with NumPy:

    import numpy as np

    def KL(a, b):
        # D_KL(a || b) in nats; terms where a == 0 contribute 0 by convention
        a = np.asarray(a, dtype=float)
        b = np.asarray(b, dtype=float)
        return np.sum(np.where(a != 0, a * np.log(a / b), 0))

    # To guard against zeros in b, one can use np.log(a / (b + np.spacing(1))) instead;
    # np.spacing(1) is the machine epsilon (~2.2e-16), not inf -- it only keeps the
    # denominator from being exactly zero.
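A usage sketch, reusing the made-up distributions from above with the KL function just defined:

    p = [0.5, 0.5]
    q = [0.75, 0.25]
    print(KL(p, q))  # ~0.1438 nats
    print(KL(q, p))  # ~0.1308 nats -- again showing KL is not symmetric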
2. scipy.stats.entropy(pk, qk=None, base=None)
    When qk is not None, it computes the KL divergence D_KL(pk || qk) instead of the entropy of pk.
    It automatically normalizes pk and qk so that each sums to 1.
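A minimal usage sketch (the raw counts below are made up; SciPy rescales them to probabilities before computing the divergence):

    from scipy.stats import entropy

    pk = [2, 2]   # normalized internally to [0.5, 0.5]
    qk = [3, 1]   # normalized internally to [0.75, 0.25]
    print(entropy(pk, qk))          # ~0.1438, D_KL(pk || qk) in nats
    print(entropy(pk, qk, base=2))  # ~0.2075, the same divergence in bits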
applications:
Text similarity: count word frequencies for each text first, then compute the KL divergence between the two frequency distributions (a rough sketch follows below).
User profiling.
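A rough sketch of the text-similarity idea, under simple assumptions added here: whitespace tokenization, a shared vocabulary, and add-one smoothing so the divergence stays finite when a word appears in only one text (the sample sentences and helper name are made up for illustration):

    from collections import Counter
    import numpy as np
    from scipy.stats import entropy

    def word_distribution(text, vocab):
        # smoothed word-frequency distribution over a shared vocabulary
        counts = Counter(text.lower().split())
        freqs = np.array([counts[w] + 1 for w in vocab], dtype=float)  # add-one smoothing
        return freqs / freqs.sum()

    doc_a = "the cat sat on the mat"
    doc_b = "the dog sat on the log"
    vocab = sorted(set(doc_a.split()) | set(doc_b.split()))

    p = word_distribution(doc_a, vocab)
    q = word_distribution(doc_b, vocab)
    print(entropy(p, q))  # KL divergence in nats: smaller value -> more similar word usage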
references:
https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence
http://www.cnblogs.com/charlotte77/p/5392052.html