Text mining with quanteda in R
I have a data set of Facebook posts (collected via Netvizz) and I am analysing it with the quanteda package in R. This is my R code:
# Load the relevant dictionary (relevant for analysis)
liwcdict <- dictionary(file = "D:/LIWC2001_English.dic", format = "LIWC")
# Read File
# Facebook posts can be exported with Netvizz
# https://apps.facebook.com/netvizz
# Load FB posts as .csv-file from .zip-file
fbpost <- read.csv("D:/FB-com.csv", sep=";")
# Define the relevant column(s)
fb_test <- as.character(fbpost$comment_message)  # one column with 2700 entries (read into fbpost above)
# Define as corpus
fb_corp <-corpus(fb_test)
class(fb_corp)
# LIWC Application
fb_liwc<-dfm(fb_corp, dictionary=liwcdict)
View(fb_liwc)
Everything works until this step:
> fb_liwc<-dfm(fb_corp, dictionary=liwcdict)
Creating a dfm from a corpus ...
... indexing 2,760 documents
... tokenizing texts, found 77,923 total tokens
... cleaning the tokens, 1584 removed entirely
... applying a dictionary consisting of 68 key entries
Error in `dimnames<-.data.frame`(`*tmp*`, value = list(docs = c("text1", :
invalid 'dimnames' given for data frame
How would you interpret this error message? Are there any suggestions for solving the problem?
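As a reading aid for the message itself: in base R, `dimnames<-.data.frame` raises exactly this error when the supplied list of names does not match the dimensions of the data frame. The sketch below reproduces the message with a toy data frame; the object `df` and its contents are invented for illustration, and whether the same mismatch is what happens inside `dfm()` here is only an assumption.

```r
# Toy data frame with 2 rows and 1 column (hypothetical, not the Facebook data)
df <- data.frame(count = c(3, 5))

# Three document names are supplied for two rows, so the assignment fails
msg <- tryCatch(
  {
    dimnames(df) <- list(docs = c("text1", "text2", "text3"),
                         features = "count")
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(msg)
```

Read this way, the error suggests that the number of document or feature names did not line up with the table built while the dictionary was being applied.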
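It is hard to say without your input text file, but what happens if you try 'dfm(inaugTexts, dictionary = liwcdict)'? I have the 'LIWC2001_English.dic' file, and the 'dfm' command works fine on 'inaugTexts', although it is slow and needs to be rewritten to optimise it (next on the list).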
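Since the input file is not available, one check that can be run locally is whether the text column contains missing or blank entries before the corpus is built. This is a minimal base-R sketch on a made-up vector standing in for `fb_test`; the names `is_blank` and `fb_clean` are hypothetical, and there is no evidence in the thread that blank comments are the cause of the error.

```r
# Hypothetical stand-in for fb_test; the real comment column is not available
fb_test <- c("Great post!", "", NA, "   ", "Thanks for sharing")

# Flag entries that are NA or contain nothing but whitespace
is_blank <- is.na(fb_test) | !nzchar(trimws(fb_test))

sum(is_blank)    # how many entries are unusable
which(is_blank)  # where they are

# Keep only the usable texts; these could then be passed to corpus()
fb_clean <- fb_test[!is_blank]
length(fb_clean)
```

If the real column has no such entries, the input is less likely to be the problem, and comparing against a known-good corpus such as 'inaugTexts' is the more informative test.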
This has now been fixed in the dev branch, which you can install by following the answer below.