2017-10-11

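My first question here :) Rolling regression prediction using dplyr and rollapply

My goal is: given a data frame of predictors (one predictor per column, one observation per row), fit a regression with lm and then predict over a rolling window, using the values of the last observation in each window.

The data frame looks like: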

> DfPredictor[1:40,] 
      Y   X1   X2     X3   X4    X5 
1   3.2860  192.5115 2.1275    83381   11.4360   8.7440 
2   3.2650  190.1462 2.0050    88720   11.4359   8.8971 
3   3.2213  192.9773 2.0500    74130   11.4623   8.8380 
4   3.1991  193.7058 2.1050    73930   11.3366   8.7536 
5   3.2224  193.5407 2.0275    80875   11.3534   8.7555 
6   3.2000  190.6049 2.0950    86606   11.3290   8.8555 
7   3.1939  191.1390 2.0975    91402   11.2960   8.8433 
8   3.1971  192.2921 2.2700    88181   11.2930   8.8681 
9   3.1873  194.9700 2.3300    115959   1.9477   8.5245 
10   3.2182  194.5396 2.4200    134754   11.3200   8.4990 
11   3.2409  194.5396 2.2025    136685   1.9649   8.4192 
12   3.2112  195.1362 2.1900    136316   1.9750   8.3752 
13   3.2231  193.3560 2.2475    140295   1.9691   8.3546 
14   3.2015  192.9649 2.2575    139474   1.9500   8.3116 
15   3.1744  194.0154 2.1900    146202   1.8476   8.2225 
16   3.1646  194.4423 2.2650    142983   1.8600   8.1948 
17   3.1708  194.9473 2.2425    141377   1.8522   8.2589 
18   3.1675  193.9788 2.2400    141377   1.8600   8.2600 
19   3.1744  194.2563 2.3000    149875   1.8718   8.2899 
20   3.1410  193.4316 2.2300    129561   1.8480   8.2395 
21   3.1266  191.2633 2.2550    122636   1.8440   8.2396 
22   3.1486  192.0354 2.3600    130996   1.8570   8.8640 
23   3.1282  194.3351 2.4825    92430   1.7849   8.1291 
24   3.1214  193.5196 2.4750    94814   1.7624   8.1991 
25   3.1230  193.2017 2.3725    87590   1.7660   8.2310 
26   3.1182  192.1642 2.4475    87715   1.6955   8.2414 
27   3.1203  191.3744 2.3775    89857   1.6539   8.2480 
28   3.1156  192.2646 2.3725    92159   1.5976   8.1676 
29   3.1270  192.7555 2.3675    97425   1.5896   8.1162 
30   3.1154  194.0375 2.3725    87598   1.5277   8.2640 
31   3.1104  192.0596 2.3850    93236   1.5132   7.9999 
32   3.0846  192.2792 2.2900    94608   1.4990   8.1600 
33   3.0569  193.2573 2.3050    84663   1.4715   8.2200 
34   3.0893  192.7632 2.2550    67149   1.4955   7.9590 
35   3.0991  192.1229 2.3050    75519   1.4280   7.9183 
36   3.0879  192.1229 2.3100    76756   1.3839   7.9133 
37   3.0965  192.0502 2.2175    61748   1.3130   7.8750 
38   3.0655  191.2274 2.2300    41490   1.2823   7.8656 
39   3.0636  191.6342 2.1925    51049   1.1492   7.7447 
40   3.1097  190.9312 2.2150    21934   1.1626   7.6895 

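For example, with a rolling window of width 10, the regression should be estimated on each window and "Y" then predicted from the corresponding X1, X2, ..., X5. This prediction should be stored in a new column 'Ypred'.

Is there a way to do this using rollapply + lm/predict + mutate?

Thanks a lot!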

Answer

Using the data shown in the Note at the end, and assuming that in each window of width 10 we want to predict the last Y (i.e. the 10th), then:

library(zoo) 

pred <- function(x) tail(fitted(lm(Y ~., as.data.frame(x))), 1) 
transform(DF, pred = rollapplyr(DF, 10, pred, by.column = FALSE, fill = NA)) 

giving:

 Y  X1  X2  X3  X4  X5  pred 
1 3.2860 192.5115 2.1275 83381 11.4360 8.7440  NA 
2 3.2650 190.1462 2.0050 88720 11.4359 8.8971  NA 
3 3.2213 192.9773 2.0500 74130 11.4623 8.8380  NA 
4 3.1991 193.7058 2.1050 73930 11.3366 8.7536  NA 
5 3.2224 193.5407 2.0275 80875 11.3534 8.7555  NA 
6 3.2000 190.6049 2.0950 86606 11.3290 8.8555  NA 
7 3.1939 191.1390 2.0975 91402 11.2960 8.8433  NA 
8 3.1971 192.2921 2.2700 88181 11.2930 8.8681  NA 
9 3.1873 194.9700 2.3300 115959 1.9477 8.5245  NA 
10 3.2182 194.5396 2.4200 134754 11.3200 8.4990 3.219764 
11 3.2409 194.5396 2.2025 136685 1.9649 8.4192 3.241614 
12 3.2112 195.1362 2.1900 136316 1.9750 8.3752 3.225423 
13 3.2231 193.3560 2.2475 140295 1.9691 8.3546 3.217797 
14 3.2015 192.9649 2.2575 139474 1.9500 8.3116 3.205856 
15 3.1744 194.0154 2.1900 146202 1.8476 8.2225 3.177928 
16 3.1646 194.4423 2.2650 142983 1.8600 8.1948 3.156405 
17 3.1708 194.9473 2.2425 141377 1.8522 8.2589 3.176243 
18 3.1675 193.9788 2.2400 141377 1.8600 8.2600 3.177165 
19 3.1744 194.2563 2.3000 149875 1.8718 8.2899 3.177211 
20 3.1410 193.4316 2.2300 129561 1.8480 8.2395 3.145533 
21 3.1266 191.2633 2.2550 122636 1.8440 8.2396 3.127410 
22 3.1486 192.0354 2.3600 130996 1.8570 8.8640 3.148792 
23 3.1282 194.3351 2.4825 92430 1.7849 8.1291 3.124913 
24 3.1214 193.5196 2.4750 94814 1.7624 8.1991 3.124992 
25 3.1230 193.2017 2.3725 87590 1.7660 8.2310 3.117981 
26 3.1182 192.1642 2.4475 87715 1.6955 8.2414 3.117679 
27 3.1203 191.3744 2.3775 89857 1.6539 8.2480 3.119898 
28 3.1156 192.2646 2.3725 92159 1.5976 8.1676 3.121039 
29 3.1270 192.7555 2.3675 97425 1.5896 8.1162 3.123903 
30 3.1154 194.0375 2.3725 87598 1.5277 8.2640 3.119438 
31 3.1104 192.0596 2.3850 93236 1.5132 7.9999 3.113963 
32 3.0846 192.2792 2.2900 94608 1.4990 8.1600 3.101229 
33 3.0569 193.2573 2.3050 84663 1.4715 8.2200 3.076817 
34 3.0893 192.7632 2.2550 67149 1.4955 7.9590 3.083266 
35 3.0991 192.1229 2.3050 75519 1.4280 7.9183 3.089377 
36 3.0879 192.1229 2.3100 76756 1.3839 7.9133 3.084225 
37 3.0965 192.0502 2.2175 61748 1.3130 7.8750 3.075252 
38 3.0655 191.2274 2.2300 41490 1.2823 7.8656 3.063025 
39 3.0636 191.6342 2.1925 51049 1.1492 7.7447 3.068808 
40 3.1097 190.9312 2.2150 21934 1.1626 7.6895 3.091819 
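
Since the question asks for mutate and a column named 'Ypred', the same rolling call can probably be placed inside dplyr's mutate. The following is a minimal sketch, not the answerer's code: it runs on synthetic stand-in data (the values and coefficients are assumptions for illustration only) and reuses the `pred` function from this answer unchanged.

```r
library(zoo)
library(dplyr)

# Synthetic stand-in for the question's data frame (assumed values,
# not the real data): a response Y and five predictors X1..X5.
set.seed(1)
n <- 40
DF <- data.frame(Y = NA_real_,
                 X1 = rnorm(n), X2 = rnorm(n), X3 = rnorm(n),
                 X4 = rnorm(n), X5 = rnorm(n))
DF$Y <- 3 + 0.5 * DF$X1 - 0.2 * DF$X2 + rnorm(n, sd = 0.1)

# Same function as in the answer: fit lm on the window and
# return the fitted value for the window's last row.
pred <- function(x) tail(fitted(lm(Y ~ ., as.data.frame(x))), 1)

# Roll over DF in windows of 10 rows; as.numeric() drops the names
# and any zoo attributes so that Ypred is a plain numeric column.
DF2 <- mutate(DF,
              Ypred = as.numeric(rollapplyr(DF, 10, pred,
                                            by.column = FALSE, fill = NA)))
```

The first nine entries of `Ypred` are NA because no complete window of 10 rows ends there, matching the `pred` column shown above.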

Note: The input DF in reproducible form is:

Lines <- "   Y   X1   X2     X3   X4    X5 
1   3.2860  192.5115 2.1275    83381   11.4360   8.7440 
2   3.2650  190.1462 2.0050    88720   11.4359   8.8971 
3   3.2213  192.9773 2.0500    74130   11.4623   8.8380 
4   3.1991  193.7058 2.1050    73930   11.3366   8.7536 
5   3.2224  193.5407 2.0275    80875   11.3534   8.7555 
6   3.2000  190.6049 2.0950    86606   11.3290   8.8555 
7   3.1939  191.1390 2.0975    91402   11.2960   8.8433 
8   3.1971  192.2921 2.2700    88181   11.2930   8.8681 
9   3.1873  194.9700 2.3300    115959   1.9477   8.5245 
10   3.2182  194.5396 2.4200    134754   11.3200   8.4990 
11   3.2409  194.5396 2.2025    136685   1.9649   8.4192 
12   3.2112  195.1362 2.1900    136316   1.9750   8.3752 
13   3.2231  193.3560 2.2475    140295   1.9691   8.3546 
14   3.2015  192.9649 2.2575    139474   1.9500   8.3116 
15   3.1744  194.0154 2.1900    146202   1.8476   8.2225 
16   3.1646  194.4423 2.2650    142983   1.8600   8.1948 
17   3.1708  194.9473 2.2425    141377   1.8522   8.2589 
18   3.1675  193.9788 2.2400    141377   1.8600   8.2600 
19   3.1744  194.2563 2.3000    149875   1.8718   8.2899 
20   3.1410  193.4316 2.2300    129561   1.8480   8.2395 
21   3.1266  191.2633 2.2550    122636   1.8440   8.2396 
22   3.1486  192.0354 2.3600    130996   1.8570   8.8640 
23   3.1282  194.3351 2.4825    92430   1.7849   8.1291 
24   3.1214  193.5196 2.4750    94814   1.7624   8.1991 
25   3.1230  193.2017 2.3725    87590   1.7660   8.2310 
26   3.1182  192.1642 2.4475    87715   1.6955   8.2414 
27   3.1203  191.3744 2.3775    89857   1.6539   8.2480 
28   3.1156  192.2646 2.3725    92159   1.5976   8.1676 
29   3.1270  192.7555 2.3675    97425   1.5896   8.1162 
30   3.1154  194.0375 2.3725    87598   1.5277   8.2640 
31   3.1104  192.0596 2.3850    93236   1.5132   7.9999 
32   3.0846  192.2792 2.2900    94608   1.4990   8.1600 
33   3.0569  193.2573 2.3050    84663   1.4715   8.2200 
34   3.0893  192.7632 2.2550    67149   1.4955   7.9590 
35   3.0991  192.1229 2.3050    75519   1.4280   7.9183 
36   3.0879  192.1229 2.3100    76756   1.3839   7.9133 
37   3.0965  192.0502 2.2175    61748   1.3130   7.8750 
38   3.0655  191.2274 2.2300    41490   1.2823   7.8656 
39   3.0636  191.6342 2.1925    51049   1.1492   7.7447 
40   3.1097  190.9312 2.2150    21934   1.1626   7.6895" 

DF <- read.table(text = Lines, header = TRUE) 
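
The `pred` function in this answer returns an in-sample fitted value, because the last row of each window is also used to fit the model. If an out-of-sample prediction is wanted instead, one possible variant fits on the first 9 rows of each window and predicts the 10th from its predictors. This is a sketch under that assumption, not part of the original answer: `pred_oos` is a hypothetical helper name, and the data are synthetic stand-ins for DF.

```r
library(zoo)

# Synthetic stand-in data (assumed values for illustration only);
# substitute the DF defined in the Note to use the real data.
set.seed(42)
n <- 40
sim <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
sim$Y <- 2 + 0.7 * sim$X1 - 0.3 * sim$X2 + rnorm(n, sd = 0.05)

# Hypothetical helper: fit on every row of the window except the last,
# then predict Y for the held-out last row from its predictors.
pred_oos <- function(x) {
  x <- as.data.frame(x)
  n_win <- nrow(x)
  fit <- lm(Y ~ ., data = x[-n_win, ])
  unname(predict(fit, newdata = x[n_win, , drop = FALSE]))
}

# The right-hand side is evaluated before Ypred is added to sim,
# so each window contains only X1, X2 and Y.
sim$Ypred <- as.numeric(rollapplyr(sim, 10, pred_oos,
                                   by.column = FALSE, fill = NA))
```

With five predictors plus an intercept, as in the question, a fit on 9 rows leaves only 3 residual degrees of freedom, so a wider window may be preferable for this variant.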

Hey Grothendieck! Works perfectly :) Thanks for the help.