2017-03-09 20 views
1

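How do I reverse geocode a large number of points in Python?

Right now I have a GeoJSON file and the following function, which uses shapely.

It takes in a coordinate and returns the neighborhood name: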

from shapely.geometry import Point, shape

def get_neighb(lat, lon):
    """Input latitude and longitude, return the neighborhood name."""
    point = Point(lon, lat)  # shapely points are (x, y), i.e. (lon, lat)
    found = False
    # linear scan: every feature is rebuilt and tested for every point
    for feature in geo_data['features']:
        polygon = shape(feature['geometry'])
        if polygon.contains(point):
            return feature['properties']['neighborhood']
            found = True
    if found is False:
        return 'NA'

# Initialize list
tn = [''] * data.shape[0]
for i in range(len(tn)):
    tn[i] = get_neighb(data.latitude[i], data.longitude[i])

This works, but it is really slow. Any thoughts on how I could speed it up? It is currently running on 4 million rows.


Just a small nitpick, but you don't actually need the found variable. –

Answers

1

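If you want to avoid heavy machinery such as a PostGIS database, you can use the rtree package as (in the words of its documentation) a "cheapo spatial database". The idea is roughly as follows: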

#!/usr/bin/env python
from itertools import product
from random import uniform, seed
from rtree import index
from shapely.geometry import Point, Polygon
from shapely.affinity import translate

seed(666)

# generate a grid of sample polygons; in your case, the polygons are stored
# in geo_data['features']
P = Polygon([(0, 0), (0.5, 0), (0.5, 0.5), (0, 0.5), (0, 0)])
polygons = []
for dx, dy in product(range(0, 100), range(0, 100)):
    polygons.append(translate(P, dx, dy))

# construct the spatial index and insert bounding boxes of all polygons
idx = index.Index()
for pid, poly in enumerate(polygons):
    idx.insert(pid, poly.bounds)

delta = 0.5
for i in range(0, 1000):
    # generate random points
    x, y = uniform(0, 10), uniform(0, 10)
    pnt = Point(x, y)

    # create a region around the point of interest
    bounds = (x - delta, y - delta, x + delta, y + delta)

    # also possible, but much slower
    # bounds = pnt.buffer(delta).bounds

    # the index tells us which polygons are worth checking, i.e.,
    # those whose bounding box intersects the region constructed in the previous step
    for candidate in idx.intersection(bounds):
        poly = polygons[candidate]

        # test only these candidates
        if poly.contains(pnt):
            print(pnt, poly)
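
Applied to the data from the question, the same idea might look like the sketch below. It assumes geo_data is a parsed GeoJSON FeatureCollection whose features carry a 'neighborhood' property, as in the question. The two-square geo_data and the helper name build_index are made up for illustration. Because a point's bounding box is the point itself, the index is queried with (lon, lat, lon, lat) instead of a padded region, and each polygon is built once up front rather than once per lookup.

```python
from rtree import index
from shapely.geometry import Point, shape
from shapely.prepared import prep

# Hypothetical stand-in for the question's geo_data: two adjacent unit squares.
geo_data = {
    "features": [
        {"properties": {"neighborhood": "A"},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]}},
        {"properties": {"neighborhood": "B"},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[1, 0], [2, 0], [2, 1], [1, 1], [1, 0]]]}},
    ]
}


def build_index(features):
    """Build every polygon once and index its bounding box (hypothetical helper)."""
    idx = index.Index()
    names, prepared = [], []
    for pid, feature in enumerate(features):
        polygon = shape(feature["geometry"])
        idx.insert(pid, polygon.bounds)
        names.append(feature["properties"]["neighborhood"])
        # prepared geometries make repeated contains() calls cheaper
        prepared.append(prep(polygon))
    return idx, names, prepared


idx, names, prepared = build_index(geo_data["features"])


def get_neighb(lat, lon):
    """Return the name of the neighborhood containing (lat, lon), or 'NA'."""
    point = Point(lon, lat)
    # only polygons whose bounding box covers the point are tested exactly
    for pid in idx.intersection((lon, lat, lon, lat)):
        if prepared[pid].contains(point):
            return names[pid]
    return "NA"
```

With the index built once, the loop over the 4 million rows can keep calling get_neighb(data.latitude[i], data.longitude[i]) unchanged.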
2

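You have to find a strategy that avoids checking every row. The simplest approach is probably to dump all the shapes into a geo-aware database, such as PostGIS or Elasticsearch, and query that.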

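Another strategy might be to compute the centroids of all the neighborhoods, then use a KD-tree to narrow the search to the neighborhoods whose centroids are near the point.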
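
A minimal sketch of the centroid idea, assuming scipy and numpy are available (the answer does not name a KD-tree library, so cKDTree is a swapped-in choice). The 3x3 grid of squares, the name lookup and the parameter k are all hypothetical. Since the containing polygon is not guaranteed to be among the k nearest centroids, for example with long or irregular shapes, the sketch falls back to a full scan before giving up.

```python
import numpy as np
from scipy.spatial import cKDTree
from shapely.geometry import Point, Polygon

# Hypothetical neighborhoods: a 3x3 grid of unit squares named "r{row}c{col}".
names, polygons = [], []
for row in range(3):
    for col in range(3):
        names.append(f"r{row}c{col}")
        polygons.append(Polygon([(col, row), (col + 1, row),
                                 (col + 1, row + 1), (col, row + 1)]))

# KD-tree built over the polygon centroids, stored as (x, y) = (lon, lat).
centroids = np.array([(p.centroid.x, p.centroid.y) for p in polygons])
tree = cKDTree(centroids)


def lookup(lat, lon, k=4):
    """Test the k polygons with the nearest centroids, then fall back to a full scan."""
    point = Point(lon, lat)
    k = min(k, len(polygons))
    _, nearest = tree.query([lon, lat], k=k)
    # atleast_1d covers k == 1, where query returns a scalar index
    for pid in np.atleast_1d(nearest):
        if polygons[pid].contains(point):
            return names[pid]
    # the nearest centroids missed: check every polygon before returning 'NA'
    for pid, polygon in enumerate(polygons):
        if polygon.contains(point):
            return names[pid]
    return "NA"
```

Points that lie outside every neighborhood still pay for the full scan here, so this variant helps most when nearly all points fall inside some polygon.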


There's also SpatiaLite, a spatial extension for SQLite that is a bit simpler to get running than PostGIS. – Alexander