2015-05-20 63 views
-1

Here is my data, with the columns infoemployer, inforst and interrst. The data frame is called tyearb. I can't get this for loop with an if statement to work.

             infoemployer     inforst 
1               Comcast    Jeff Dunn 
6             Cummins, Inc.  Rebekah Smith 
38               DaVita  Andy Nielsen 
42              Deloitte   Chase Russell 
66            Duff & Phelps LLC  Tanner Anderson 
76             Frito-Lay Inc.  Tanner Anderson 
88            Intel Corporation   Jake Graff 
96  J.P. Morgan- (J.P. Morgan is part of JPMorgan Chase & Co)  Andy Nielsen 
97               Lenovo  Nelson Anievas 
98              PepsiCo  Tanner Anderson 
100            Procter & Gamble  Andee Flinders 
102 Sears Holdings Corporation, formerly Sears, Roebuck & Company  Tanner Anderson 
103          The Walt Disney Company Kylie Rothlisberger 
106          Union Pacific Railroad   Jake Graff 
116               USAA  Rebekah Smith 
117              Walmart  Chase Russell 
237                    <NA> 
238               Apple    <NA> 
239        Brandes Investment Partners L.P.    <NA> 
240      EY (formerly known as Ernst & Young) LLP    <NA> 
242           Grant Thornton LLP    <NA> 
243              KPMG LLP    <NA> 
245             Moss Adams    <NA> 
246           Pariveda Solutions    <NA> 
248        PwC (PricewaterhouseCoopers, LLC)    <NA> 
250               RCLCO    <NA> 
251          Strata Fund Services, LLC    <NA> 
       interrst 
1     <NA> 
6   Rebekah Smith 
38   Andy Nielsen 
42  Chase Russell 
66  Tanner Anderson 
76  Tanner Anderson 
88   Jake Graff 
96   Andy Nielsen 
97  Nelson Anievas 
98  Tanner Anderson 
100  Andee Flinders 
102  Tanner Anderson 
103 Kylie Rothlisberger 
106   Jake Graff 
116  Rebekah Smith 
117  Chase Russell 
237  Austin Pollard 
238  Brady Tengberg 
239   Jeff Dunn 
240  Rebekah Smith 
242   Jeff Dunn 
243  Andee Flinders 
245   Jake Graff 
246  Nelson Anievas 
248  Nelson Anievas 
250   Jake Graff 
251  Andy Nielsen 

My code is as follows:

levels(tyearb[,2]) <- c(levels(tyearb[,2]), levels(tyearb[,3]))

for (i in 1:length(tyearb)) {
    if (is.na(tyearb[i,2])) {
        tyearb[i,2] = tyearb[i,3]
    }
}

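I just want to keep every current value of inforst, unless it is <NA>, in which case I want to insert the value from interrst. I realize I could copy every value except the first from interrst into inforst, but I obviously can't do that with a larger data set, where more information would be lost.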

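I have looked at a lot of examples that combine for loops with if statements, and I just can't get it to work for me. Can someone explain?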

+1

'ifelse' will work faster here. Probably something along the lines of: 'tyearb[,2] = ifelse(is.na(tyearb[,2]), tyearb[,3], tyearb[,2])'

+0

Thanks! For some reason it inserts all of tyearb[,3] into tyearb[,2], but I think I can handle this.

+0

It was a bracket placement issue.
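For reference, here is a minimal sketch of the approaches discussed in the question and comments, using three rows copied from the data in the question. It assumes the columns are factors (the levels() call in the question suggests so) and converts them to character first, because ifelse() on factors typically returns the underlying integer codes rather than the labels. Note also that length() of a data frame is its number of columns, so 1:length(tyearb) only visits rows 1 to 3, whereas nrow() gives the row count. The names fill_loop, fill_vec and fill_ifelse are illustrative only.

```r
# Three rows copied from the question; stored as factors (an assumption)
tyearb <- data.frame(
  infoemployer = c("Comcast", "Apple", "KPMG LLP"),
  inforst      = c("Jeff Dunn", NA, NA),
  interrst     = c(NA, "Brady Tengberg", "Andee Flinders"),
  stringsAsFactors = TRUE
)

# Work on character vectors so that factor levels do not get in the way
tyearb$inforst  <- as.character(tyearb$inforst)
tyearb$interrst <- as.character(tyearb$interrst)

# Loop version: iterate over rows with nrow(), not length()
fill_loop <- tyearb
for (i in seq_len(nrow(fill_loop))) {
  if (is.na(fill_loop[i, "inforst"])) {
    fill_loop[i, "inforst"] <- fill_loop[i, "interrst"]
  }
}

# Vectorized version: index the missing entries directly
fill_vec <- tyearb
missing <- is.na(fill_vec$inforst)
fill_vec$inforst[missing] <- fill_vec$interrst[missing]

# ifelse version, as suggested in the comment (safe on character columns)
fill_ifelse <- ifelse(is.na(tyearb$inforst), tyearb$interrst, tyearb$inforst)
```

All three variants leave existing inforst values untouched and only fill the <NA> entries from interrst.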

Answers

2

A data.table solution (this will be very fast even for very large data sets):

library(data.table) 
DT[is.na(z), z := y] 

where z is the column you are testing for NA and y is the column you want to insert (although you can replace y with any expression).
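Applied to the question's column names, this could look like the sketch below. It assumes the data.table package is installed and that the columns are character rather than factor; the three rows are copied from the question, and DT is just an illustrative name.

```r
library(data.table)

# Three rows copied from the question, as character columns (an assumption)
DT <- data.table(
  infoemployer = c("Comcast", "Apple", "KPMG LLP"),
  inforst      = c("Jeff Dunn", NA, NA),
  interrst     = c(NA, "Brady Tengberg", "Andee Flinders")
)

# For rows where inforst is NA, overwrite it in place with interrst
DT[is.na(inforst), inforst := interrst]
```

The := operator modifies DT by reference, so no reassignment is needed, and rows where inforst already has a value are left as they are.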