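Why don't we include 0 matches when calculating the Jaccard distance between binary numbers?

I am working on a program based on Jaccard distance, and I need to calculate the Jaccard distance between two binary bit vectors. I came across the following on the net: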
If p1 = 10111 and p2 = 10011,
The total number of attributes for each combination of values in p1 and p2:
M11 = total number of attributes where p1 & p2 have a value 1,
M01 = total number of attributes where p1 has a value 0 & p2 has a value 1,
M10 = total number of attributes where p1 has a value 1 & p2 has a value 0,
M00 = total number of attributes where p1 & p2 have a value 0.
Jaccard similarity coefficient = J =
intersection/union = M11/(M01 + M10 + M11)
= 3/(0 + 1 + 3) = 3/4,
Jaccard distance = J' = 1 - J = 1 - 3/4 = 1/4,
Or J' = 1 - (M11/(M01 + M10 + M11)) = (M01 + M10)/(M01 + M10 + M11)
= (0 + 1)/(0 + 1 + 3) = 1/4
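
To check the arithmetic above, here is a minimal Python sketch that counts the four combinations and computes the distance from them. The function names `jaccard_counts` and `jaccard_distance` are hypothetical, the bit vectors are assumed to be equal-length strings of '0' and '1', and returning 0.0 for two all-zero vectors is an assumed convention, not something the quoted snippet specifies.

```python
def jaccard_counts(p1: str, p2: str) -> tuple[int, int, int, int]:
    """Return (M11, M01, M10, M00) for two equal-length bit strings."""
    if len(p1) != len(p2):
        raise ValueError("bit vectors must have the same length")
    m11 = m01 = m10 = m00 = 0
    for a, b in zip(p1, p2):
        if a not in "01" or b not in "01":
            raise ValueError("bit vectors may only contain '0' and '1'")
        if a == "1" and b == "1":
            m11 += 1
        elif a == "0" and b == "1":
            m01 += 1
        elif a == "1" and b == "0":
            m10 += 1
        else:
            m00 += 1
    return m11, m01, m10, m00


def jaccard_distance(p1: str, p2: str) -> float:
    """Jaccard distance (M01 + M10) / (M01 + M10 + M11); M00 is not used."""
    m11, m01, m10, _m00 = jaccard_counts(p1, p2)
    union = m01 + m10 + m11
    if union == 0:
        # Assumed convention: two all-zero vectors are treated as identical.
        return 0.0
    return (m01 + m10) / union


print(jaccard_counts("10111", "10011"))    # (3, 0, 1, 1)
print(jaccard_distance("10111", "10011"))  # 0.25
```

For p1 = 10111 and p2 = 10011 this gives M00 = 1, but that count never enters the formula.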
Now, when calculating the coefficient, why is M00 not included in the denominator? Can anyone explain?
Not only can you come across this snippet on the net, it is also right here: http://stackoverflow.com/a/19969874/14955 – Thilo