2013-07-08 98 views
2

How can I create an adjacency matrix from raw data that is non-numeric in nature? An example of my input is given below:

User ID 1 --- Artist 5 
User ID 2 --- Artist 1 
User ID 3 --- Artist 7 
User ID 4 --- Artist 2 
User ID 5 --- Artist 3 
User ID 1 --- Artist 2 
User ID 3 --- Artist 1 

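The data above are listening records for the users of a music application.

I would like to generate an adjacency matrix like the example below: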

           ARTIST 1  ARTIST 2  ARTIST 3  ARTIST 4  ARTIST 5  ARTIST 6  ARTIST 7
USER ID 1         0         1         0         0         1         0         0
USER ID 2         1         0         0         0         0         0         0
USER ID 3         1         0         0         0         0         0         1
USER ID 4         0         1         0         0         0         0         0
USER ID 5         0         0         1         0         0         0         0

How can this be done in R? Any hints or pointers would be much appreciated.

Thank you in advance for your time and help.

+0

I suggest adding the "R" tag, which will reach the R experts – doctorlove

+0

Thanks, doctorlove ... will add the tag – Manus

+0

Just add a column of 1s to your data and use the answer above. – flodel

Answers

3

This works:

# get data into usable form
# (strip.white = TRUE guards against stray spaces around each field)
ContingencyTable <- read.table(text = gsub(pattern = " --- ", replacement = ",", "User ID 1 --- Artist 5
User ID 2 --- Artist 1
User ID 3 --- Artist 7
User ID 4 --- Artist 2
User ID 5 --- Artist 3
User ID 1 --- Artist 2
User ID 3 --- Artist 1"), sep = ",", stringsAsFactors = FALSE, strip.white = TRUE)
# add variable for match value
ContingencyTable$Val <- 1
# more or less lifted from Arun's answer linked by @Hong Ooi, above
adjMat <- reshape2::dcast(ContingencyTable, V1 ~ V2, value.var = "Val", fill = 0)
# move the user IDs from the first column into the row names
rownames(adjMat) <- adjMat[, 1]
adjMat <- adjMat[, 2:ncol(adjMat)]

adjMat 
          Artist 1 Artist 2 Artist 3 Artist 5 Artist 7
User ID 1        0        1        0        1        0
User ID 2        1        0        0        0        0
User ID 3        1        0        0        0        1
User ID 4        0        1        0        0        0
User ID 5        0        0        1        0        0
+0

Thanks, Tim .... your answer was very helpful. – Manus

+1

`table(ContingencyTable)` also seems to work – user20650
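The `table()` suggestion in the comment above can be sketched in base R as follows. This is a minimal illustration, assuming the cross-tabulation is built from only the two user/artist columns (that is, without the extra `Val` column); the names `dat`, `tab`, and `adjMat` are used here purely for illustration.

```r
# Two-column data frame holding the listening records from the question
dat <- data.frame(
  V1 = c("User ID 1", "User ID 2", "User ID 3", "User ID 4",
         "User ID 5", "User ID 1", "User ID 3"),
  V2 = c("Artist 5", "Artist 1", "Artist 7", "Artist 2",
         "Artist 3", "Artist 2", "Artist 1"),
  stringsAsFactors = FALSE
)

# Cross-tabulate users (rows) against artists (columns)
tab <- table(dat$V1, dat$V2)

# Drop the "table" class to get a plain integer matrix; dimnames are kept
adjMat <- unclass(tab)
adjMat
```

Because every user/artist pair occurs at most once in this data, the counts are already 0 or 1.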

2

The adjmat function in the qdap package can do this:

dat <- read.table(text=gsub(pattern = " --- ", replacement = ",", 
"User ID 1 --- Artist 5 
User ID 2 --- Artist 1 
User ID 3 --- Artist 7 
User ID 4 --- Artist 2 
User ID 5 --- Artist 3 
User ID 1 --- Artist 2 
User ID 3 --- Artist 1"),sep=",", stringsAsFactors = FALSE) 


library(qdap) 
x <- with(dat, termco(V1, V2, unique(V1))) 
adjmat(x)$boolean 

## > adjmat(x)$boolean
##           Artist 1 Artist 2 Artist 3 Artist 5 Artist 7
## User ID 1        0        1        0        1        0
## User ID 2        1        0        0        0        0
## User ID 3        1        0        0        0        1
## User ID 4        0        1        0        0        0
## User ID 5        0        0        1        0        0

PS: Tim Riffe, nice way to read in the data :)

+0

Also, I think this is called a Boolean matrix, not an adjacency matrix, but I could be wrong. –
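On the Boolean-matrix point: if a log contained repeated user/artist pairs, a plain cross-tabulation would return counts rather than 0/1 values. The sketch below uses a small hypothetical log (not the question's data) in which one pair appears twice, and shows one possible way to collapse counts to a 0/1 matrix; `counts` and `bool` are illustrative names.

```r
# Hypothetical log: User ID 1 listened to Artist 2 twice
dat <- data.frame(
  V1 = c("User ID 1", "User ID 1", "User ID 2"),
  V2 = c("Artist 2", "Artist 2", "Artist 1"),
  stringsAsFactors = FALSE
)

# Cross-tabulation gives counts; the repeated pair shows up as 2
counts <- unclass(table(dat$V1, dat$V2))

# Collapse any positive count to 1; dim and dimnames are preserved
bool <- (counts > 0) * 1L
bool
```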

+0

Thanks, Tyler, for pointing out the qdap package ... your answer was very useful. – Manus

4

If DF is the two-column data frame corresponding to the data in the question:

xtabs(data = DF) 

This gives:

           V2
V1          Artist 1 Artist 2 Artist 3 Artist 5 Artist 7
  User ID 1        0        1        0        1        0
  User ID 2        1        0        0        0        0
  User ID 3        1        0        0        0        1
  User ID 4        0        1        0        0        0
  User ID 5        0        0        1        0        0

Note: We used this as the input:

DF <- structure(list(V1 = structure(c(1L, 2L, 3L, 4L, 5L, 1L, 3L), .Label = c("User ID 1", 
"User ID 2", "User ID 3", "User ID 4", "User ID 5"), class = "factor"), 
    V2 = structure(c(4L, 1L, 5L, 2L, 3L, 2L, 1L), .Label = c("Artist 1", 
    "Artist 2", "Artist 3", "Artist 5", "Artist 7"), class = "factor")), .Names = c("V1", 
"V2"), class = "data.frame", row.names = c(NA, -7L)) 
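The matrix requested in the question also has all-zero columns for Artist 4 and Artist 6, which never appear in the data and so are absent from the result above. One possible way to keep them, sketched here under the assumption that the full artist list is known in advance to be Artist 1 through Artist 7, is to set the factor levels explicitly before tabulating. The names `all_artists` and `full` are illustrative.

```r
# Same records as in the question, as plain character columns
DF <- data.frame(
  V1 = c("User ID 1", "User ID 2", "User ID 3", "User ID 4",
         "User ID 5", "User ID 1", "User ID 3"),
  V2 = c("Artist 5", "Artist 1", "Artist 7", "Artist 2",
         "Artist 3", "Artist 2", "Artist 1"),
  stringsAsFactors = FALSE
)

# Assumption: the complete set of artists is Artist 1 ... Artist 7
all_artists <- paste("Artist", 1:7)
DF$V2 <- factor(DF$V2, levels = all_artists)

# xtabs keeps unused factor levels by default, so Artist 4 and
# Artist 6 appear as columns of zeros
full <- xtabs(~ V1 + V2, data = DF)
full
```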
+0

This is awesome, thank you! –

+0

Great answer! Thanks, Grothendieck! – Manus