2016-09-11 46 views

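Accessing a list within a list with smbinning.gen() in a loop

Basically, I am trying to automate a score-modelling workflow, and I ran into trouble feeding in results that are produced by smbinning() inside a loop and therefore stored in a list. Each result is itself a list, so I end up with a list of lists. The problem arises when I try to add the results (the buckets for the continuous variables) to the data frame: I cannot work out the syntax needed to reach into the inner list level. I tried to get around this by referencing the column number and by passing the respective list name from the loop. The error I get is: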

Error in `[.data.frame`(df, , col_id) : undefined columns selected

My code is as follows:

colcnt <- ncol(e_mod) 
bucket_resultlist <- list() 
for (i in 2:colcnt) { 
    #curvar = paste0('z', i) 
    curresult = smbinning(df = e_mod, y = "Bankrupt", x = colnames(e_mod)[i], p = 0.05) 
    bucket_resultlist[[paste0('Bin_Result_', colnames(e_mod)[i])]] = curresult #paste0('binresult', colnames(e)[i]) = curresult 
} 

e_mod2 = e_mod 

for (i in 1:length(bucket_resultlist_trunc)) { 
e_mod2 = smbinning.genCUSTOM(e_mod, bucket_resultlist_trunc[[i]] , chrname = i) 
} 

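I even tried to define a custom version of the smbinning.gen() function, given that in its standard form it simply tries to append $ivtable to the list reference, whereas I need to go one level down into the resulting list and then run smbinning.gen() for each respective list inside it. Here is the custom code, with the original definitions left as comments: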

smbinning.genCUSTOM = function(df, ivout, chrname = "NewChar") { 
    df = cbind(df, tmpname = NA) 
    ncol = ncol(df) 
    col_id = paste0(ivout, '[[6]]', collapse = NULL) # Original: ivout$col_id 
    # Updated 20160130 
    b = paste0(ivout, '[[4]]', collapse = NULL) # Original: ivout$bands 
    df[, ncol][is.na(df[, col_id])] = 0 # Missing 
    df[, ncol][df[, col_id] <= b[2]] = 1 # First valid 
    # Loop goes from 2 to length(b)-2 if more than 1 cutpoint 
    if (length(b) > 3) { 
     for (i in 2:(length(b) - 2)) { 
      df[, ncol][df[, col_id] > b[i] & df[, col_id] <= b[i + 1]] = i 
     } 
    } 
    df[, ncol][df[, col_id] > b[length(b) - 1]] = length(b) - 1 # Last 
    df[, ncol] = as.factor(df[, ncol]) # Convert to factor for modeling 
    blab = c(paste("01 <=", b[2])) 
    if (length(b) > 3) { 
     for (i in 3:(length(b) - 1)) { 
      blab = c(blab, paste(sprintf("%02d", i - 1), "<=", b[i])) 
     } 
    } else { i = 2 } 
    blab = c(blab, paste(sprintf("%02d", i), ">", b[length(b) - 1])) 

    # Are there ANY missing values 
    # any(is.na(df[,col_id])) 

    if (any(is.na(df[, col_id]))) { 
     blab = c("00 Miss", blab) 
    } 
    df[, ncol] = factor(df[, ncol], labels = blab) 

    names(df)[names(df) == "tmpname"] = chrname 
    return(df) 
} 

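All help is much appreciated!

Here is the table structure: http://i.stack.imgur.com/iYau2.png

This is also posted in the Data Science section, but it has had little traffic there today.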


I think the crux of the problem most likely lies in passing the argument into the smbinning.genCUSTOM() function correctly.

Answer


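Thanks to this site for being my yellow rubber duck on this one, with only 5 views. The fix is to change how the incoming argument is handled: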

smbinning.genCUSTOM = function(df, ivout, chrname = "NewChar") {
    df = cbind(df, tmpname = NA)
    ncol = ncol(df)
    col_id = ivout[[6]] # paste0(ivout, '[[6]]', collapse = NULL) # Original: ivout$col_id
    # Updated 20160130
    b = ivout[[4]] # paste0(ivout, '[[4]]', collapse = NULL) # Original: ivout$bands
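To see why the paste0() version fails, here is a minimal sketch that does not need smbinning at all. The list `ivout` below is a hypothetical stand-in for one binning result; its element names and values are assumptions chosen only so that position 4 holds the bands and position 6 holds the column index, matching the "Original:" comments above. paste0() turns every element of the list into text and glues "[[6]]" onto it, so the result is a character vector, not the sixth element.

```r
# Hypothetical stand-in for a single binning result (names/values are assumptions).
ivout <- list(
  ivtable = "placeholder",
  iv      = 0.5,
  ctree   = "placeholder",
  bands   = c(0, 10, 20, 30),
  x       = "income",
  col_id  = 2
)

df <- data.frame(id = 1:3, income = c(5, 15, 25))

# paste0() coerces each list element to a string and appends "[[6]]".
bad_col_id <- paste0(ivout, "[[6]]")
print(bad_col_id)  # six strings, none of them a column name of df

# Using those strings as column names is what triggers the error.
err <- tryCatch(df[, bad_col_id], error = function(e) conditionMessage(e))
print(err)

# Indexing the list directly returns the stored values.
good_col_id <- ivout[[6]]   # the column index
bands       <- ivout[[4]]   # the cut points
print(df[, good_col_id])
```

Since `ivout` is already the inner list at this point, `ivout$col_id` and `ivout$bands` should work just as well as the positional form, provided the elements carry those names.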

And refer to the new data frame e_mod2 rather than e_mod:

for (i in 1:length(bucket_resultlist_trunc)) {
    e_mod2 = smbinning.genCUSTOM(e_mod2, bucket_resultlist_trunc[[i]], chrname = i)
}
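The reason the data frame argument matters can be shown with a small sketch. `add_flag()` is a hypothetical helper standing in for smbinning.genCUSTOM(); it only appends one column, which is enough to show the difference between restarting from the base data frame on every pass and feeding the previous result back in.

```r
# Hypothetical helper standing in for smbinning.genCUSTOM(): appends one column.
add_flag <- function(df, chrname) {
  df[[chrname]] <- seq_len(nrow(df))
  df
}

base <- data.frame(x = c(1.5, 2.5, 3.5))
new_names <- c("bin_a", "bin_b", "bin_c")

# Restarting from `base` each time: every pass overwrites the previous result,
# so only the column from the last iteration survives.
restarted <- base
for (nm in new_names) {
  restarted <- add_flag(base, nm)
}

# Passing the accumulated data frame back in: each pass keeps earlier columns.
accumulated <- base
for (nm in new_names) {
  accumulated <- add_flag(accumulated, nm)
}

print(names(restarted))    # "x" "bin_c"
print(names(accumulated))  # "x" "bin_a" "bin_b" "bin_c"
```

As a side note rather than part of the fix, passing `names(bucket_resultlist_trunc)[i]` as `chrname` instead of `i` would likely give more readable column names, and `seq_along()` is a safer loop range than `1:length()` when the list might be empty.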