2016-12-17

Julia DataFrame: filling NA with LOCF

Is there a quick way to replace the NA values in a DataFrame with the last observed value?

using DataFrames 

d = @data [1,NA,5,NA,NA] 
df = DataFrame(d=d) 

result = filled_with_locf(df) 

expected = [1,1,5,5,5] 
  • LOCF = last observation carried forward
  • result = d[cummax([i * !isna(d[i]) for i = 1:length(d)])]

Answer

Expanding on the one-liner from the comments, if we define locf as:

locf(v) = v[cummax([i*!isna(v[i]) for i=1:length(v)])] 

then,

nona_df = DataFrame(Any[locf(df[c]) for c in names(df)],names(df)) 

and,

julia> nona_df 
5×1 DataFrames.DataFrame 
│ Row │ d │ 
├─────┼───┤ 
│ 1 │ 1 │ 
│ 2 │ 1 │ 
│ 3 │ 5 │ 
│ 4 │ 5 │ 
│ 5 │ 5 │
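
To see why the one-liner works, it may help to run the three steps separately. The sketch below is not the code from the answer: it assumes Julia 1.x, where `missing` takes the place of `NA`, `ismissing` the place of `isna`, and `accumulate(max, ...)` the place of `cummax`. The name `locf_missing` is hypothetical.

```julia
# Assumption: Julia 1.x, Base only; `missing` stands in for NA.
v = [1, missing, 5, missing, missing]

# Step 1: keep the index where a value is present, 0 where it is missing.
marks = [i * !ismissing(v[i]) for i in eachindex(v)]   # [1, 0, 3, 0, 0]

# Step 2: a running maximum replaces each 0 with the index of the
# most recent present value.
idx = accumulate(max, marks)                            # [1, 1, 3, 3, 3]

# Step 3: index the vector with those positions.
filled = v[idx]                                         # 1, 1, 5, 5, 5

# The same steps as a single function (hypothetical name).
locf_missing(x) = x[accumulate(max, [i * !ismissing(x[i]) for i in eachindex(x)])]
```

One caveat with this index trick: if the first element is missing, the running maximum starts at 0 and the indexing step throws a BoundsError, so a leading gap would need separate handling.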