2013-07-23 26 views
5

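Counting species occurrences in a grid with R

I have roughly 500,000 occurrence points for migratory bird species across the United States.

I am trying to overlay a grid on these points and then count the number of occurrences in each grid cell. Once the counts are done, I want to reference them to a grid cell ID.

In R, I use the over() function to get the points that fall inside the range map, which is a shapefile.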

#Read in occurrence data 
data=read.csv("data.csv", header=TRUE) 
coordinates(data)=c("LONGITUDE","LATITUDE") 

#Get shapefile of the species' range map 
range=readOGR(".",layer="data") 

proj4string(data)=proj4string(range) 

#Get points within the range map 
inside.range=!is.na(over(data,as(range,"SpatialPolygons"))) 

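The above works exactly as I hoped, but it does not solve my current problem: how to handle points stored as a SpatialPointsDataFrame together with a raster grid. Would you recommend polygonizing the raster grid and using the same approach I showed above? Or would another procedure be more efficient?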

+0

Which package are you using? –

+0

@HongOoi I believe it is 'sp'. – agstudy

+3

This might get you started: [Aggregating points to a grid using R](http://gis.stackexchange.com/a/48434/9803) – Ben

Answers

3

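First, your R code does not work as written. I suggest copy-pasting it into a clean session and, if it errors there too, fixing the syntax errors or loading the additional libraries until it runs.

That said, I assume you should end up with a data.frame of two-dimensional numeric coordinates. For binning and counting, any data of that shape will do, so I took the liberty of simulating such a dataset. Please correct me if this fails to capture the relevant aspects of your data.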

## Skip this line if you are the OP, and substitute the real data instead. 
data<-data.frame(LATITUDE=runif(100,1,100),LONGITUDE=runif(100,1,100)); 

## Add the latitudes and longitudes between which each observation is located 
## You can substitute any number of breaks you want. Or, a vector of fixed cutpoints 
## LATgrid and LONgrid are going to be factors. With ugly level names. 
data$LATgrid<-cut(data$LATITUDE,breaks=10,include.lowest=TRUE); 
data$LONgrid<-cut(data$LONGITUDE,breaks=10,include.lowest=TRUE); 

## Create a single factor that gives the lat,long of each observation. 
data$IDgrid<-with(data,interaction(LATgrid,LONgrid)); 

## Now, create another factor based on the above one, with shorter IDs and no empty levels 
data$IDNgrid<-factor(data$IDgrid); 
levels(data$IDNgrid)<-seq_along(levels(data$IDNgrid)); 

## If you want total grid-cell count repeated for each observation falling into that grid cell, do this: 
data$count<- ave(data$LATITUDE,data$IDNgrid,FUN=length); 
## You could have also used data$LONGITUDE, doesn't matter in this case 

## If you want just a table of counts at each grid-cell, do this: 
aggregate(data$LATITUDE,data[,c('LATgrid','LONgrid','IDNgrid')],FUN=length); 
## I included the LATgrid and LONgrid vectors so there would be some 
## sort of descriptive reference accompanying the anonymous numbers in IDNgrid, 
## but only IDNgrid is actually necessary 

## If you want a really minimalist table, you could do this: 
table(data$IDNgrid);
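
If you would rather have a fixed cell size and stable, numeric cell IDs (closer to how a raster numbers its cells) instead of the factor levels that cut() produces, here is a minimal base-R sketch. The helper name count_in_grid, the extent, and the 1-degree cell size are all assumptions I made up for illustration; they are not from the question. Cells are numbered row by row starting at the lower-left corner, which is also just a convention I picked.

```r
## Hypothetical helper (not from the original post): assign each point to a
## fixed-size grid cell and count the points per cell, using base R only.
count_in_grid <- function(lon, lat, xmin, xmax, ymin, ymax, cellsize) {
  n_col <- ceiling((xmax - xmin) / cellsize)
  n_row <- ceiling((ymax - ymin) / cellsize)
  ## 1-based column and row index; points exactly on the upper edge
  ## are placed in the last column or row
  col <- pmin(floor((lon - xmin) / cellsize) + 1, n_col)
  row <- pmin(floor((lat - ymin) / cellsize) + 1, n_row)
  ## Row-major cell ID, starting at the lower-left corner
  cell <- (row - 1) * n_col + col
  ## Points outside the extent get no cell
  cell[lon < xmin | lon > xmax | lat < ymin | lat > ymax] <- NA
  ## Count per cell ID, keeping empty cells as zero
  counts <- tabulate(cell[!is.na(cell)], nbins = n_row * n_col)
  list(cell = cell,
       counts = data.frame(cell = seq_len(n_row * n_col), count = counts))
}

## Simulated points over an assumed, roughly continental-US extent
set.seed(1)
pts <- data.frame(LONGITUDE = runif(1000, -125, -67),
                  LATITUDE  = runif(1000, 25, 49))
res <- count_in_grid(pts$LONGITUDE, pts$LATITUDE,
                     xmin = -125, xmax = -67, ymin = 25, ymax = 49,
                     cellsize = 1)
pts$cell <- res$cell                          # cell ID for every observation
head(res$counts[res$counts$count > 0, ])      # non-empty cells only
```

Because this is only arithmetic on the coordinate vectors, it should cope with 500,000 points without building any polygons. The per-cell counts can be joined back to the observations through the cell column.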