2014-07-07 75 views
0

我正在查询选择位于特定地理区域的记录,并且我正在进行一些连接和过滤。为什么不使用PostgreSQL索引并建议索引

这是我的查询:

SELECT "events".* FROM "events" INNER JOIN "albums" ON "albums"."event_id" = "events"."id" INNER JOIN "photos" ON "photos"."album_id" = "albums"."id" WHERE "events"."deleted_at" IS NULL AND "albums"."deleted_at" IS NULL AND "photos"."deleted_at" IS NULL AND (events.latitude BETWEEN -44.197088742316055 AND -23.22003941183816 AND events.longitude BETWEEN 133.226480859375 AND 165.570230859375) GROUP BY events.id HAVING count(albums.id) > 0 ORDER BY start_date DESC 

我有以下指标:

活动:

"events_pkey" PRIMARY KEY, btree (id) 
"index_events_on_deleted_at" btree (deleted_at) 
"index_events_on_latitude_and_longitude" btree (latitude, longitude) 

专辑:

"albums_pkey" PRIMARY KEY, btree (id) 
"index_albums_on_deleted_at" btree (deleted_at) 
"index_albums_on_event_id" btree (event_id) 

照片:

"photos_pkey" PRIMARY KEY, btree (id) 
"index_photos_on_album_id" btree (album_id) 
"index_photos_on_deleted_at" btree (deleted_at) 

做一个EXPLAIN ANALYZE结果在这一点,我没有看到我的指标,其中任何使用。我不知道如何强制它使用索引。任何人都可以帮我优化这个吗?

Sort (cost=4057.46..4057.84 rows=150 width=668) (actual time=556.114..556.187 rows=76 loops=1) 
    Sort Key: events.start_date 
    Sort Method: quicksort Memory: 78kB 
    -> HashAggregate (cost=4050.16..4052.04 rows=150 width=668) (actual time=555.667..555.783 rows=76 loops=1) 
     Filter: (count(albums.id) > 0) 
     -> Hash Join (cost=76.14..3946.54 rows=20724 width=668) (actual time=3.675..467.578 rows=48050 loops=1) 
       Hash Cond: (photos.album_id = albums.id) 
       -> Seq Scan on photos (cost=0.00..3441.87 rows=59013 width=4) (actual time=0.008..169.206 rows=60599 loops=1) 
        Filter: (deleted_at IS NULL) 
       -> Hash (cost=74.10..74.10 rows=163 width=668) (actual time=3.633..3.633 rows=318 loops=1) 
        Buckets: 1024 Batches: 1 Memory Usage: 176kB 
        -> Hash Join (cost=49.80..74.10 rows=163 width=668) (actual time=1.195..2.519 rows=318 loops=1) 
          Hash Cond: (albums.event_id = events.id) 
          -> Seq Scan on albums (cost=0.00..21.47 rows=321 width=8) (actual time=0.011..0.458 rows=321 loops=1) 
           Filter: (deleted_at IS NULL) 
          -> Hash (cost=47.92..47.92 rows=150 width=664) (actual time=1.151..1.151 rows=195 loops=1) 
           Buckets: 1024 Batches: 1 Memory Usage: 126kB 
           -> Seq Scan on events (cost=0.00..47.92 rows=150 width=664) (actual time=0.007..0.488 rows=195 loops=1) 
             Filter: ((deleted_at IS NULL) AND (latitude >= (-44.1970887423161)::double precision) AND (latitude <= (-23.2200394118382)::double precision) AND (longitude >= 133.226480859375::double precision) AND (longitude <= 165.570230859375::double precision)) 
Total runtime: 556.459 ms 

谢谢!!

编辑:谢谢你的链接。我已尝试禁用seqscan。现在我的计划是:

Sort (cost=5565.73..5566.10 rows=150 width=46) (actual time=451.208..451.290 rows=76 loops=1) 
    Sort Key: (date(events.start_date)) 
    Sort Method: quicksort Memory: 31kB 
    -> GroupAggregate (cost=0.00..5560.31 rows=150 width=46) (actual time=2.990..450.850 rows=76 loops=1) 
     Filter: (count(albums.id) > 0) 
     -> Nested Loop (cost=0.00..5454.44 rows=20724 width=46) (actual time=0.077..278.319 rows=48050 loops=1) 
       -> Merge Join (cost=0.00..205.35 rows=163 width=46) (actual time=0.051..2.856 rows=318 loops=1) 
        Merge Cond: (events.id = albums.event_id) 
        -> Index Scan using events_pkey on events (cost=0.00..118.72 rows=150 width=42) (actual time=0.024..0.792 rows=195 loops=1) 
          Filter: ((deleted_at IS NULL) AND (latitude >= (-44.1970887423161)::double precision) AND (latitude <= (-23.2200394118382)::double precision) 
AND (longitude >= 133.226480859375::double precision) AND (longitude <= 165.570230859375::double precision)) 
        -> Index Scan using index_albums_on_event_id on albums (cost=0.00..83.83 rows=321 width=8) (actual time=0.017..0.832 rows=321 loops=1) 
          Filter: (deleted_at IS NULL) 
       -> Index Scan using index_photos_on_album_id on photos (cost=0.00..30.27 rows=155 width=4) (actual time=0.010..0.409 rows=151 loops=318) 
        Index Cond: (album_id = albums.id) 
        Filter: (deleted_at IS NULL) 
Total runtime: 451.562 ms 

Still索引没有被完全使用,特别是在纬度和长条件下。我的索引设置是否正确?

编辑:在http://stackoverflow.com/questions/8228326/how-can-i-avoid-postgresql-sometimes-choosing-a-bad-query-plan-for-one-of-two-ne寻找答案之后,我认为它的表现就像这一点,因为我的查询语句返回的所有记录,然后我更新了条件,而新的查询计划是:

Sort (cost=786.18..786.22 rows=19 width=668) (actual time=3.754..3.755 rows=2 loops=1) 
    Sort Key: events.start_date 
    Sort Method: quicksort Memory: 25kB 
    -> HashAggregate (cost=785.54..785.77 rows=19 width=668) (actual time=3.700..3.703 rows=2 loops=1) 
     Filter: (count(albums.id) > 0) 
     -> Nested Loop (cost=48.39..765.51 rows=2670 width=668) (actual time=1.116..2.968 rows=543 loops=1) 
       -> Hash Join (cost=48.39..89.25 rows=21 width=668) (actual time=1.093..1.128 rows=3 loops=1) 
        Hash Cond: (events.id = albums.event_id) 
        -> Bitmap Heap Scan on events (cost=9.42..49.44 rows=19 width=664) (actual time=0.061..0.080 rows=9 loops=1) 
          Recheck Cond: ((latitude >= (-33.7474111086624)::double precision) AND (latitude <= (-33.581678187556)::double precision) AND (longitude >= 151.193933862305::double precision) AND (longitude <= 151.44661940918::double precision)) 
          Filter: (deleted_at IS NULL) 
          -> Bitmap Index Scan on index_events_on_latitude_and_longitude (cost=0.00..9.42 rows=28 width=0) (actual time=0.050..0.050 rows=9 loops=1) 
           Index Cond: ((latitude >= (-33.7474111086624)::double precision) AND (latitude <= (-33.581678187556)::double precision) AND (longitude >= 151.193933862305::double precision) AND (longitude <= 151.44661940918::double precision)) 
        -> Hash (cost=34.95..34.95 rows=321 width=8) (actual time=0.992..0.992 rows=321 loops=1) 
          Buckets: 1024 Batches: 1 Memory Usage: 13kB 
          -> Bitmap Heap Scan on albums (cost=14.74..34.95 rows=321 width=8) (actual time=0.069..0.570 rows=321 loops=1) 
           Recheck Cond: (deleted_at IS NULL) 
           -> Bitmap Index Scan on index_albums_on_deleted_at (cost=0.00..14.66 rows=321 width=0) (actual time=0.056..0.056 rows=321 loops=1) 
             Index Cond: (deleted_at IS NULL) 
       -> Index Scan using index_photos_on_album_id on photos (cost=0.00..30.27 rows=155 width=4) (actual time=0.014..0.273 rows=181 loops=3) 
        Index Cond: (album_id = albums.id) 
        Filter: (deleted_at IS NULL) 
Total runtime: 3.958 ms 

而时间是verrry少!! 有什么建议吗?

+0

也相关http://stackoverflow.com/questions/14554302/postgres-query-optimization-forcing-a-index-scan和http://www.postgresql.org/docs/current/static/indexes-examine .html – pozs

+0

嗨@pozs我试图禁用'seqscan'。更新了查询计划。 –

+0

我看到,顺便说一句你的查询计划尚未更新。 – pozs

回答

1

这主要是因为您没有在查询中设置LIMIT子句。没有LIMIT,你总是要求你的表中的所有数据,所以查看它们的索引也是不够的。

SQLFiddle 1, 2 vs. 3, 4

还要注意,FOREIGN KEY不添加索引(不过UNIQUE & PRIMARY KEY约束一样)。因此,您可能需要为albums.event_id & photos.album_id添加索引。

SQLFiddle 3 vs. 4

排序(如果存在)可以使用索引。在您的查询中,这意味着events.start_date上的索引。

+0

非常感谢@pozs。不幸的是,我现在无法在我的应用中使用“LIMIT”,因为我需要做很多前端更改。但它确实有帮助! :) –

+0

而且,我已经有这两个'FK'的索引。顺便说一句,让我给start_date添加索引吧! 帮了我很多。谢谢! –

+0

加入'LIMIT'没有帮助。我也为排序添加了索引。我也删除了排序..但时间停留在450-500毫秒。 –