Assign values to a column from another column based on a condition

Say I have a list like this:
> desired <- c("10001", "10004")
And a sample data frame like this:
> desired_sample_df <- data.frame(geo = rep("other", 30), zip = c(rep(10001:10010, 2), 10011:10020), cbsa = c(rep("NY", 20), rep("CA", 10)))
> desired_sample_df
geo zip cbsa
1 other 10001 NY
2 other 10002 NY
3 other 10003 NY
4 other 10004 NY
5 other 10005 NY
6 other 10006 NY
7 other 10007 NY
8 other 10008 NY
9 other 10009 NY
10 other 10010 NY
11 other 10001 NY
12 other 10002 NY
13 other 10003 NY
14 other 10004 NY
15 other 10005 NY
16 other 10006 NY
17 other 10007 NY
18 other 10008 NY
19 other 10009 NY
20 other 10010 NY
21 other 10011 CA
22 other 10012 CA
23 other 10013 CA
24 other 10014 CA
25 other 10015 CA
26 other 10016 CA
27 other 10017 CA
28 other 10018 CA
29 other 10019 CA
30 other 10020 CA
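I want to overwrite the value in the geo column with the zip value, but only for rows where zip appears in the desired list defined at the top.
This is what I have tried: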
> desired_sample_df$geo[desired_sample_df$zip %in% desired] <- desired_sample_df$zip[which(desired_sample_df$zip %in% desired)]
Warning message:
In `[<-.factor`(`*tmp*`, desired_sample_df$zip %in% desired, value = c(NA, :
invalid factor level, NA generated
> desired_sample_df$geo[desired_sample_df$zip %in% desired] <- desired_sample_df$zip
Warning messages:
1: In `[<-.factor`(`*tmp*`, desired_sample_df$zip %in% desired, value = c(NA, :
invalid factor level, NA generated
2: In `[<-.factor`(`*tmp*`, desired_sample_df$zip %in% desired, value = c(NA, :
number of items to replace is not a multiple of replacement length
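For reference, here is a minimal sketch of one way to avoid both warnings. It assumes the first warning comes from geo being a factor whose only level is "other", and the second from the right-hand side not being subset to the matching rows. The name `hit` is just an illustrative helper, not something from the original code.

```r
desired <- c("10001", "10004")

# stringsAsFactors = FALSE keeps geo and cbsa as character vectors,
# so new values can be assigned without an "invalid factor level" warning
desired_sample_df <- data.frame(
  geo  = rep("other", 30),
  zip  = c(rep(10001:10010, 2), 10011:10020),
  cbsa = c(rep("NY", 20), rep("CA", 10)),
  stringsAsFactors = FALSE
)

# Logical index (hypothetical helper name): TRUE where zip is in desired.
# %in% coerces the integer zips to character for the comparison.
hit <- desired_sample_df$zip %in% desired

# Use the same index on both sides so the replacement length matches
desired_sample_df$geo[hit] <- as.character(desired_sample_df$zip[hit])
```

If the data frame already exists with geo stored as a factor, converting it first with `desired_sample_df$geo <- as.character(desired_sample_df$geo)` should have the same effect.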
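I added the 'stringsAsFactors = FALSE' part, because that is what was giving you the error. – Brouwer
Ah, that makes sense. I wasn't sure what was causing the error. I don't know how to rank the answers, but I gave it to @jhoward for brevity... – goldisfine
Unfortunately, you can only accept one answer; there is no silver medal ;-) – Brouwer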