2016-03-14 102 views

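Change column values based on nchar

I have the following problem: consider a data frame with a title column and a subtitle column of equal length. The title is fairly clean data; the subtitle is rather messy (wrong values, NAs, ...). However, when the subtitle is filled in correctly, it contains more information than my title variable.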

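I want to replace the value in the title column wherever the nchar of the subtitle observation exceeds the nchar of the title observation.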

At the moment, my failing code looks like this:

    #filter from real table 
    baseTable_sentiment <- filter(baseTable, theme_ecoFin == 1) 
    #with this code I try to do what I explained, while coping with the NA's in subtitle 
    baseTable_sentiment$title <- baseTable_sentiment$subtitle[nchar(baseTable_sentiment$subtitle , allowNA = TRUE , keepNA = TRUE) > nchar(baseTable_sentiment$title) , ] 

An alternative approach, to cope with the NAs for now:

    #filter from real table 
    baseTable_sentiment <- filter(baseTable, theme_ecoFin == 1) 
    #change NA to text value "na" 
    baseTable_sentiment$subtitle <- replace(baseTable_sentiment$subtitle,which(is.na(baseTable_sentiment$subtitle)),"na") 
    #same code as before 
    baseTable_sentiment$title <- baseTable_sentiment$subtitle[nchar(baseTable_sentiment$subtitle) > nchar(baseTable_sentiment$title) , ] 

When I run either of the two examples, I get the following error:

Error in baseTable_sentiment$subtitle[(nchar(baseTable_sentiment$subtitle, :
incorrect number of dimensions

However, when I check the lengths of everything involved:

    > length(baseTable_sentiment$subtitle)
    [1] 170206
    > length(baseTable_sentiment$title)
    [1] 170206
    > length(nchar(baseTable_sentiment$subtitle , allowNA = TRUE) > nchar(baseTable_sentiment$title))
    [1] 170206

How can I solve this, or do you have another way to perform this operation?

The following link contains a data example.

Thanks in advance,

Olivier


What is in 'baseTable'? – mtoto


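baseTable contains metadata about news coverage and filter variables such as theme_ecofin (news on economic and financial topics). baseTable does not contain any NA values except in the subtitle variable. It contains different data types, but I assure you that subtitle and title are both of type "character" (I just checked) –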


Please share a sample dataset to reproduce your problem. – mtoto

Answer


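I found the error: the dimensions of the title I was assigning to did not correspond to the new dimensions created by the condition. Now, with the same condition applied on both sides of the assignment:

    baseTable_sentiment$title[nchar(baseTable_sentiment$subtitle , allowNA = TRUE , keepNA = FALSE) > nchar(baseTable_sentiment$title) ] <- baseTable_sentiment$subtitle[nchar(baseTable_sentiment$subtitle , allowNA = TRUE , keepNA = FALSE) > nchar(baseTable_sentiment$title) ]

the dimensions match and the code runs perfectly.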
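
For anyone hitting the same message, here is a minimal sketch on made-up data (the `df` data frame and its values are hypothetical, not the original dataset). It illustrates two things: a character vector has no dimensions, so indexing it with a trailing `, ]` is what raises "incorrect number of dimensions", and computing the logical mask once makes it easy to treat NA comparisons as "do not replace". Base R only, no packages assumed.

```r
# Toy data, invented purely for illustration
df <- data.frame(
  title    = c("Short", "A fairly long title", "Mid"),
  subtitle = c("A much longer subtitle", "na", NA),
  stringsAsFactors = FALSE
)

# df$subtitle is a plain vector, so matrix-style indexing "[i, ]" fails
err <- tryCatch(
  df$subtitle[c(TRUE, FALSE, FALSE), ],
  error = function(e) e
)
print(conditionMessage(err))

# Build the mask once; rows where the comparison is NA are left untouched
longer <- nchar(df$subtitle) > nchar(df$title)
longer[is.na(longer)] <- FALSE

# Same mask on both sides, so the lengths on each side always agree
df$title[longer] <- df$subtitle[longer]
print(df$title)
```

On this toy data only the first title is replaced, because the second subtitle is shorter than its title and the third is NA. Setting NA entries of the mask to FALSE is a design choice of this sketch; it avoids depending on how `nchar` reports the length of a missing string.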