    time_elapsed network_name                     daypart             day
 1:         4705 Laff TV                          2016-09-09 03:11:35 Friday
 2:         1800 CNN                              2016-09-10 08:00:00 Saturday
 3:           23 INSP                             2016-09-02 18:00:00 Friday
 4:          148 NBC                              2016-09-02 16:01:26 Friday
 5:          957 History Channel                  2016-09-07 14:44:03 Wednesday
 6:         1138 Nickelodeon/Nick-at-Nite         2016-09-09 16:00:00 Friday
 7:          120 Starz Edge                       2016-09-07 15:28:59 Wednesday
 8:          268 Starz Encore Westerns            2016-09-07 17:13:05 Wednesday
 9:            6 CBS                              2016-09-10 04:00:00 Saturday
10:           69 Independent                      2016-09-07 12:48:11 Wednesday
11:         4151 NBC                              2016-09-09 04:32:37 Friday
12:          570 PBS: Public Broadcasting Service 2016-09-07 16:17:58 Wednesday
13:         1421 NBCSN                            2016-09-03 15:22:23 Saturday
14:          466 Estrella TV (Broadcast)          2016-09-04 19:00:00 Sunday
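(Typically more than 2 million rows.)
I wrote the nested ifelse statement below a few months ago, when I was running my whole script over just a few million rows. Now that I'm running it at a much larger scale, I'd really like to find a way to make it faster.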
# wday() and hour() exist in both lubridate and data.table; by default wday() is 1 for Sunday and 7 for Saturday.
targets_random$daypart <- ifelse((wday(targets_random$daypart) == 1 |
                                  wday(targets_random$daypart) == 7), "W: Weekend",
                          ifelse(hour(targets_random$daypart) <= 2, "LP: Late Prime",
                          ifelse((hour(targets_random$daypart) >= 3 &
                                  hour(targets_random$daypart) <= 5), "O: Overnight",
                          ifelse((hour(targets_random$daypart) >= 6 &
                                  hour(targets_random$daypart) <= 9), "EM: Early Morning",
                          ifelse((hour(targets_random$daypart) >= 10 &
                                  hour(targets_random$daypart) <= 16), "D: Day",
                          ifelse((hour(targets_random$daypart) >= 17 &
                                  hour(targets_random$daypart) <= 20), "F: Fringe",
                          ifelse(hour(targets_random$daypart) >= 21, "P: Prime", NA)))))))
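I tried a data.table solution, but it was only very slightly faster, and it turned my data.table into a list. For the life of me, I can't figure out why. Undoing that added enough time that the savings weren't worth it.
Any suggestions would be greatly appreciated. What I have works, and if I have to stick with it, that's fine. The whole script currently takes about 3.5 hours to run. The biggest part is the SQL queries and writing the results to files, but if I can cut time wherever possible, that would be great!
(A little side note - it used to take nearly 8 hours, then I replaced tons of parts with data.table syntax, and now I'm a fanatic!)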
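You could probably use parLapply to run multiple rows at once – Rilcon42
See '?cut'. It looks like you could use something like 'cut(targets_random$daypart$hour, c(-Inf, 3, 6, 10, 17, 21, Inf), include.lowest = TRUE, right = FALSE)' but change the 'labels' argument to 'c("LP: Late Prime", "O: Overnight", etc...)' and, afterwards, replace with "W: Weekend" wherever '(targets_random$daypart$wday + 1) %in% c(1, 7)' –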
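For anyone who wants to try the cut idea from the comment above, here is a minimal base R sketch. It assumes daypart holds POSIXct timestamps; the helper name classify_daypart, the sample vector x, and the UTC time zone are illustrative assumptions, not part of the original post. It bins the hour once with left-closed intervals and then overwrites weekend rows, which should give the same labels as the nested ifelse.

```r
# Sketch of the cut()-based approach suggested in the comment, base R only.
# `classify_daypart` is a hypothetical helper name, not from the original post.
classify_daypart <- function(ts) {
  # POSIXlt exposes $hour (0-23) and $wday (0 = Sunday ... 6 = Saturday)
  lt <- as.POSIXlt(ts)

  # Left-closed bins: [0,3) [3,6) [6,10) [10,17) [17,21) [21,Inf)
  out <- as.character(cut(
    lt$hour,
    breaks = c(-Inf, 3, 6, 10, 17, 21, Inf),
    labels = c("LP: Late Prime", "O: Overnight", "EM: Early Morning",
               "D: Day", "F: Fringe", "P: Prime"),
    right = FALSE
  ))

  # Weekend overrides the hour-based label, like the first ifelse() test.
  out[lt$wday %in% c(0, 6)] <- "W: Weekend"
  out
}

# Sample timestamps taken from the table above (the UTC zone is an assumption).
x <- as.POSIXct(c("2016-09-09 03:11:35", "2016-09-10 08:00:00",
                  "2016-09-02 18:00:00", "2016-09-07 14:44:03"), tz = "UTC")
print(classify_daypart(x))
```

On the real data this would be applied as targets_random$daypart <- classify_daypart(targets_random$daypart). It converts the timestamps once instead of calling hour() and wday() inside every ifelse branch, but I have not benchmarked it at the scale described in the question.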