2017-04-04 41 views

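R - Adding extrapolated (lm) values to a matrix of observations

I want to add a set of extrapolated "observations" to a matrix in R. I know how to do this with ordinary programming techniques (read: a bunch of nested loops and functions), but I feel it must be possible to do it more cleanly with built-in R functions.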

The code below illustrates the idea and marks the point where it breaks.

Many thanks for your help!

Kind regards,

Syl

library(dplyr) 
library(tidyr) # gather() and spread() come from tidyr 

# The idea is that i have a table of observations for e.g. x=5, 6, 7, 8, 9 and 10. The observations (in this example 2) 
# conform fairly decently to sets of 2nd order polynomials. 
# Now, I want to add an extrapolated value to this table (e.g. x=4). I know how to do this programmically 
# but I feel there must be a cleaner solution to do this. 

#generate dummy data table 
x <- 5:10 
myData <- tibble(x, a = x^2 * 2 + x * 3 + 4 + rnorm(1,0,0.01), b = x^2 * 3 + x * 4 + 5 + rnorm(1,0,0.01) ) 

#Gather (put in Data-Key format) 
myDataKeyFormat <- gather(myData,key = "someLabel", value = "myObservation", -x) 
fitted_models <- myDataKeyFormat %>% group_by(someLabel) %>% do(model = lm(myObservation ~ poly(x,2), data = .)) 
myExtrapolatedDataPointx <- tibble(x = 4) 

#Add the x=4 field 
fitted_points <- fitted_models %>% group_by(someLabel) %>% do(predict(.$model,myExtrapolatedDataPointx)) #R really doesnt like this bit 

#append the fitted_points to the myDataKeyFormat 
myDataKeyFormatWithExtrapolation <- union(myDataKeyFormat,fitted_points) 

#use spread to 
myDataWithExtrapolation <- myDataKeyFormatWithExtrapolation %>% spread(someLabel,myObservation) 

Answers


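Here is a solution in the tidyverse that uses purrr to build the different models. The idea is to nest the data (with tidyr::nest) and then use purrr::map to fit one model per group. I then add the new x values and use modelr::add_predictions to compute the predicted values. This keeps everything in one place: for each value of someLabel you get the training data, the model, the test data and the predictions. I also show a way to visualise the data. For more information, see R for Data Science by Hadley Wickham & Garrett Grolemund, in particular the section on models.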

library(dplyr) 
library(tibble) 
library(tidyr) 
library(purrr) 
library(modelr) 
library(ggplot2) 

set.seed(1) # For reproducibility 
x <- 5:10 
myData <- tibble(x, 
       a = x^2 * 2 + x * 3 + 4 + rnorm(1,0,0.01), 
       b = x^2 * 3 + x * 4 + 5 + rnorm(1,0,0.01)) 

#Gather (put in Data-Key format) 
myDataKeyFormat <- gather(myData,key = "someLabel", value = "myObservation", -x) 

myModels <- myDataKeyFormat %>% 
    nest(data = -someLabel) %>% # name the list-column explicitly; unnamed nest(-someLabel) warns in tidyr >= 1.0 
    mutate(model = map(data, ~lm(myObservation ~ poly(x,2), data = .x))) 

Here is the result at this point: you have one model for each value of someLabel.

# A tibble: 2 × 3 
    someLabel    data model 
     <chr>   <list> <list> 
1   a <tibble [6 × 2]> <S3: lm> 
2   b <tibble [6 × 2]> <S3: lm> 
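Each entry of the model column is an ordinary lm object, so it can be inspected with the usual tools. As a minimal sketch (not part of the original answer), the example below refits the same per-label models on a noise-free copy of the dummy data and uses poly(x, 2, raw = TRUE), which is an assumption made here so that the coefficients can be read directly as intercept, linear and quadratic terms. The answer's orthogonal poly(x, 2) gives the same predictions but coefficients on a different basis. The names long, models and coefs are illustrative.

```r
library(dplyr)
library(tidyr)
library(purrr)

x <- 5:10

# Noise-free version of the question's dummy data, in key/value format.
long <- tibble(x,
               a = x^2 * 2 + x * 3 + 4,
               b = x^2 * 3 + x * 4 + 5) %>%
  gather(key = "someLabel", value = "myObservation", -x)

# One quadratic fit per label; raw = TRUE keeps the plain 1, x, x^2 basis.
models <- long %>%
  nest(data = -someLabel) %>%
  mutate(model = map(data, ~ lm(myObservation ~ poly(x, 2, raw = TRUE), data = .x)))

# Pull the coefficients (intercept, x, x^2) out of every model.
coefs <- models %>%
  mutate(coefs = map(model, ~ unname(coef(.x)))) %>%
  select(someLabel, coefs)

coefs$coefs
```

With noise-free input the fits should recover the generating coefficients up to floating-point error, which is a quick way to check that the grouping did what was intended.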

I add some new data points in a new column (map creates one tibble for each row of the data frame).

# New data 
new_data <- myModels %>% 
    mutate(new = map(data, ~tibble(x = c(3, 4, 11, 12)))) 

I add the predictions: add_predictions takes a data frame and a model as arguments, so I use map2 to map over the new data and the models.

fitted_models <- new_data %>% 
    mutate(new = map2(new, model, ~add_predictions(.x, .y))) 
fitted_models 
# A tibble: 2 × 4 
    someLabel    data model    new 
     <chr>   <list> <list>   <list> 
1   a <tibble [6 × 2]> <S3: lm> <tibble [4 × 2]> 
2   b <tibble [6 × 2]> <S3: lm> <tibble [4 × 2]> 
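For a single model, add_predictions amounts to calling predict() on the new data and storing the result in a column (named pred by default). The sketch below checks that on one noise-free quadratic. The names obs, fit, new_x, with_pred and manual are illustrative and do not appear in the original answer.

```r
library(modelr)

x <- 5:10
obs <- data.frame(x = x, y = x^2 * 2 + x * 3 + 4)

# Same model form as in the answer, fitted to one series only.
fit <- lm(y ~ poly(x, 2), data = obs)

# x values outside the observed range 5..10, i.e. extrapolation.
new_x <- data.frame(x = c(3, 4, 11, 12))

# modelr: returns new_x with an extra 'pred' column.
with_pred <- add_predictions(new_x, fit)

# Base R: the same numbers as a plain vector.
manual <- unname(predict(fit, newdata = new_x))

all.equal(unname(with_pred$pred), manual)
```

Because the result is a data frame rather than a bare vector, it fits naturally into a list-column, which is what makes the map2 call above convenient.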

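There you go: for each label you have the data, the model trained on that data, and the new data with its predictions. To plot it, I use unnest to bring the data back into a plain data frame and bind the rows to combine the "old" data with the new values.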

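# Keep only the label and one list-column before unnesting, 
# so no leftover list-columns end up in the result. 
my_points <- bind_rows(fitted_models %>% select(someLabel, data) %>% unnest(data), 
      fitted_models %>% select(someLabel, new) %>% unnest(new)) 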

ggplot(my_points)+ 
    geom_point(aes(x = x, y = myObservation), color = "black") + 
    geom_point(aes(x = x, y = pred), color = "red")+ 
    facet_wrap(~someLabel) 

[Figure: Models]
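The question ends by spreading the data back into one column per label, which the answer above does not show. The following is a minimal sketch of one way to get there, assuming a single extrapolated point at x = 4 as in the question. It writes the predictions into a column called myObservation (through the var argument of add_predictions) so that observed and extrapolated rows line up before spread(). This is an illustration under those assumptions, not the answerer's code.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(modelr)

set.seed(1) # For reproducibility
x <- 5:10
myData <- tibble(x,
                 a = x^2 * 2 + x * 3 + 4 + rnorm(1, 0, 0.01),
                 b = x^2 * 3 + x * 4 + 5 + rnorm(1, 0, 0.01))

myDataKeyFormat <- gather(myData, key = "someLabel", value = "myObservation", -x)

# Fit one model per label, then predict at x = 4 into a 'myObservation' column.
fitted_models <- myDataKeyFormat %>%
  nest(data = -someLabel) %>%
  mutate(model = map(data, ~ lm(myObservation ~ poly(x, 2), data = .x)),
         new   = map(model, ~ add_predictions(tibble(x = 4), .x, var = "myObservation")))

# Stack observed and extrapolated rows, then go back to one column per label.
myDataWithExtrapolation <- bind_rows(
    fitted_models %>% select(someLabel, data) %>% unnest(data),
    fitted_models %>% select(someLabel, new) %>% unnest(new)) %>%
  spread(someLabel, myObservation) %>%
  arrange(x)

myDataWithExtrapolation
```

The result has the same shape as myData plus one extra row for x = 4. Note that the extrapolated row is no longer distinguishable from the observations, so a flag column may be worth adding if that matters.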


Many thanks @FlorianGD, this is exactly what I was trying to do! – Sylvain