2016-10-13 75 views

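In R, evaluating a function between two data frames

I'm trying to evaluate values from two data frames against each other and create a new data frame containing the results. I'm new to the power of R and am trying to avoid old coding habits. In other words, I'm desperately trying to avoid loops, but I can't figure out how to apply something like plyr in this case.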

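In the example below, I've created the airports, the pilots, and a function that computes distance in kilometers. My problem is determining which major airport each pilot is closest to, along with each pilot's distance to every airport.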

#Build Airports 
code <- c("IAH", "DFW", "Denver", "STL") 
lat <- c(29.97, 32.90, 39.75, 38.75) 
long <- c(95.35, 97.03, 104.87, 90.37) 
airports <- data.frame(code, lat, long) 

#Build Pilots 
names <- c("James", "Fiona", "Seamus") 
lat <- c(32.335131, 44.913223, 28.849631) 
long <- c(-84.989067, -97.151334, -96.917240) 
pilots <- data.frame(names, lat, long) 

#Create distance function 
distInKm <- function(lat1, long1, lat2, long2) { 
    dlat = (lat2 * 0.01745329) - (lat1 * 0.01745329) #pi/180 convert to radians 
    dlong = (long2 * 0.01745329) - (long1 * 0.01745329) 
    step1 = (sin(dlat/2))^2 + cos(lat1 * 0.01745329) * cos(lat2 * 0.01745329) * (sin(dlong/2))^2 #haversine term: cos(lat1) * cos(lat2) 
    step2 = 2 * atan2(sqrt(step1), sqrt(1 - step1)) 
    dist = 6372.798 * step2 #R is the radius of earth (40041.47/(2 * pi)) 
    dist 
} 

Thank you for your time.

Answers


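First, your airport longitudes are positive when they should be negative, which will throw off the results. Let's fix them so the results make more sense:

airports$long <- -airports$long 

Now you can use apply to evaluate every pilot against each airport. The geosphere package has several functions for calculating straight-line (great-circle or geodesic) distance, including distGeo and distHaversine. These functions expect coordinates in (longitude, latitude) order, which is why the code below selects columns 3:2: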

library(geosphere) 

pilots$closest_airport <- apply(pilots[, 3:2], 1, function(x){ 
    airports[which.min(distGeo(x, airports[, 3:2])), 'code'] 
}) 

pilots$airport_distance <- apply(pilots[, 3:2], 1, function(x){ 
    min(distGeo(x, airports[, 3:2]))/1000 # /1000 to convert m to km 
}) 

pilots 
##    names      lat      long closest_airport airport_distance
## 1  James 32.33513 -84.98907             STL         862.5394
## 2  Fiona 44.91322 -97.15133          Denver         855.8088
## 3 Seamus 28.84963 -96.91724             IAH         196.3559
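
If you would rather not depend on geosphere, a similar lookup can be sketched in base R with outer(). This is only a sketch under some assumptions: haversine_km is a hypothetical helper (it comes from neither the question nor any package) that implements the standard haversine formula with the question's Earth radius, and the airport longitudes are assumed to be already negated. Because haversine is a spherical approximation, its distances will differ slightly from the ellipsoidal distGeo results.

```r
# Data from the question, with airport longitudes already negated (assumption)
airports <- data.frame(
  code = c("IAH", "DFW", "Denver", "STL"),
  lat  = c(29.97, 32.90, 39.75, 38.75),
  long = c(-95.35, -97.03, -104.87, -90.37),
  stringsAsFactors = FALSE
)
pilots <- data.frame(
  names = c("James", "Fiona", "Seamus"),
  lat   = c(32.335131, 44.913223, 28.849631),
  long  = c(-84.989067, -97.151334, -96.917240),
  stringsAsFactors = FALSE
)

# Hypothetical helper: haversine distance in km, inputs in decimal degrees
haversine_km <- function(lat1, long1, lat2, long2, radius_km = 6372.798) {
  to_rad <- pi / 180
  dlat   <- (lat2 - lat1) * to_rad
  dlong  <- (long2 - long1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlong / 2)^2
  2 * radius_km * atan2(sqrt(a), sqrt(1 - a))
}

# outer() hands every (pilot index, airport index) pair to the function as
# two equal-length vectors, so the helper only needs to be vectorized
dist_km <- outer(
  seq_len(nrow(pilots)), seq_len(nrow(airports)),
  function(i, j) haversine_km(pilots$lat[i], pilots$long[i],
                              airports$lat[j], airports$long[j])
)
dimnames(dist_km) <- list(pilots$names, airports$code)

# Column index of the nearest airport for each pilot (one per row)
nearest <- apply(dist_km, 1, which.min)
pilots$closest_airport  <- airports$code[nearest]
pilots$airport_distance <- dist_km[cbind(seq_len(nrow(pilots)), nearest)]
pilots
```

The dist_km matrix has one row per pilot and one column per airport, so it also holds every pairwise distance, not just the minimum.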

Or, if you want all of the distances rather than just the smallest one, cbind the (transposed) matrix that apply returns:

pilots <- cbind(pilots, t(apply(pilots[, 3:2], 1, function(x){ 
    setNames(distGeo(x, airports[, 3:2])/1000, airports$code) 
}))) 

pilots 
##    names      lat      long closest_airport       IAH       DFW    Denver       STL
## 1  James 32.33513 -84.98907             STL 1021.6523 1131.2129 1965.6586  862.5394
## 2  Fiona 44.91322 -97.15133          Denver 1666.0359 1333.6842  855.8088  885.8480
## 3 Seamus 28.84963 -96.91724             IAH  196.3559  449.1838 1412.0664 1253.4874
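
The full pilot-by-airport matrix can probably also be built in one call with geosphere's distm(), which avoids apply altogether. The following is a minimal sketch rather than part of the original answer; it assumes the airport longitudes have already been negated, and the names dist_km, nearest, and result are made up for illustration.

```r
library(geosphere)

# Data from the question, with airport longitudes already negated (assumption)
airports <- data.frame(
  code = c("IAH", "DFW", "Denver", "STL"),
  lat  = c(29.97, 32.90, 39.75, 38.75),
  long = c(-95.35, -97.03, -104.87, -90.37),
  stringsAsFactors = FALSE
)
pilots <- data.frame(
  names = c("James", "Fiona", "Seamus"),
  lat   = c(32.335131, 44.913223, 28.849631),
  long  = c(-84.989067, -97.151334, -96.917240),
  stringsAsFactors = FALSE
)

# distm() takes two sets of (longitude, latitude) points and returns metres:
# one row per point in the first set, one column per point in the second
dist_km <- distm(as.matrix(pilots[, c("long", "lat")]),
                 as.matrix(airports[, c("long", "lat")]),
                 fun = distGeo) / 1000
dimnames(dist_km) <- list(pilots$names, airports$code)

# Pick the nearest airport per pilot and attach all distances as columns
nearest <- apply(dist_km, 1, which.min)
result <- data.frame(pilots,
                     closest_airport = airports$code[nearest],
                     dist_km,
                     row.names = NULL)
result
```

Selecting the coordinate columns by name instead of by position (3:2) keeps the call correct even after extra columns have been added to pilots.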

Translated to dplyr, the successor to plyr:

library(dplyr) 

pilots %>% rowwise() %>% 
     mutate(closest_airport = airports[which.min(distGeo(c(long, lat), airports[, 3:2])), 'code'], 
       airport_distance = min(distGeo(c(long, lat), airports[, 3:2]))/1000) 

## Source: local data frame [3 x 5] 
## Groups: <by row> 
## 
## # A tibble: 3 × 5 
##    names      lat      long closest_airport airport_distance
##   <fctr>    <dbl>     <dbl>          <fctr>            <dbl>
## 1  James 32.33513 -84.98907             STL         862.5394
## 2  Fiona 44.91322 -97.15133          Denver         855.8088
## 3 Seamus 28.84963 -96.91724             IAH         196.3559

Or, for all of the distances, use bind_cols with the approach above, or unnest a list column and reshape:

library(tidyverse) 

pilots %>% rowwise() %>% 
    mutate(closest_airport = airports[which.min(distGeo(c(long, lat), airports[, 3:2])), 'code'], 
      data = list(data_frame(airport = airports$code, 
            distance = distGeo(c(long, lat), airports[, 3:2])/1000))) %>% 
    unnest() %>% 
    spread(airport, distance) 

## # A tibble: 3 × 8 
##    names      lat      long closest_airport    Denver       DFW       IAH       STL
## * <fctr>    <dbl>     <dbl>          <fctr>     <dbl>     <dbl>     <dbl>     <dbl>
## 1  Fiona 44.91322 -97.15133          Denver  855.8088 1333.6842 1666.0359  885.8480
## 2  James 32.33513 -84.98907             STL 1965.6586 1131.2129 1021.6523  862.5394
## 3 Seamus 28.84963 -96.91724             IAH 1412.0664  449.1838  196.3559 1253.4874

Or, more directly but less readably:

pilots %>% rowwise() %>% 
    mutate(closest_airport = airports[which.min(distGeo(c(long, lat), airports[, 3:2])), 'code'], 
      data = (distGeo(c(long, lat), airports[, 3:2])/1000) %>% 
        setNames(airports$code) %>% t() %>% as_data_frame() %>% list()) %>% 
    unnest() 

## # A tibble: 3 × 8 
##    names      lat      long closest_airport       IAH       DFW    Denver       STL
##   <fctr>    <dbl>     <dbl>          <fctr>     <dbl>     <dbl>     <dbl>     <dbl>
## 1  James 32.33513 -84.98907             STL 1021.6523 1131.2129 1965.6586  862.5394
## 2  Fiona 44.91322 -97.15133          Denver 1666.0359 1333.6842  855.8088  885.8480
## 3 Seamus 28.84963 -96.91724             IAH  196.3559  449.1838 1412.0664 1253.4874
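
Another way to evaluate a function between two data frames is to pair every row of one with every row of the other, compute the distance once over all pairs, and then keep the best match per pilot. The sketch below is not part of the original answer: it assumes dplyr 1.1.0 or later (for cross_join), negated airport longitudes, and the illustrative suffixes "_pilot" and "_airport".

```r
library(dplyr)
library(geosphere)

# Data from the question, with airport longitudes already negated (assumption)
airports <- data.frame(
  code = c("IAH", "DFW", "Denver", "STL"),
  lat  = c(29.97, 32.90, 39.75, 38.75),
  long = c(-95.35, -97.03, -104.87, -90.37),
  stringsAsFactors = FALSE
)
pilots <- data.frame(
  names = c("James", "Fiona", "Seamus"),
  lat   = c(32.335131, 44.913223, 28.849631),
  long  = c(-84.989067, -97.151334, -96.917240),
  stringsAsFactors = FALSE
)

closest <- pilots %>%
  # Cartesian product: 3 pilots x 4 airports = 12 rows
  cross_join(airports, suffix = c("_pilot", "_airport")) %>%
  # distGeo() is vectorized over matching rows of two (lon, lat) matrices
  mutate(distance_km = distGeo(cbind(long_pilot, lat_pilot),
                               cbind(long_airport, lat_airport)) / 1000) %>%
  group_by(names) %>%
  # keep only the nearest airport for each pilot
  slice_min(distance_km, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(names, lat = lat_pilot, long = long_pilot,
         closest_airport = code, airport_distance = distance_km)

closest
```

Dropping the slice_min() step leaves the long table of all pilot and airport pairs, which can be reshaped to wide form if one column per airport is preferred. Note that group_by() returns the pilots in alphabetical order rather than in their original order.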
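
OP is trying to determine which major airport each pilot is closest to, not which pilot is closest to each airport – HubertL

@HubertL Whoops, read it backwards. Fixed. – alistaire

You read it backwards, which led you to write it backwards – HubertL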