2012-05-02 40 views
2

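As a follow-up to the answer to my question Is it normal that sqlite.fetchall() is so slow?: it seems that fetchall and fetchone can be very slow for sqlite. How can I speed up fetching the results after running a sqlite query?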

As I mentioned there, I have the following query:

time0 = time.time() 
self.cursor.execute("SELECT spectrum_id, feature_table_id "+ 
       "FROM spectrum AS s "+ 
       "INNER JOIN feature AS f "+ 
       "ON f.msrun_msrun_id = s.msrun_msrun_id "+ 
       "INNER JOIN (SELECT feature_feature_table_id, min(rt) AS rtMin, max(rt) AS rtMax, min(mz) AS mzMin, max(mz) as mzMax "+ 
          "FROM convexhull GROUP BY feature_feature_table_id) AS t "+ 
       "ON t.feature_feature_table_id = f.feature_table_id "+ 
       "WHERE s.msrun_msrun_id = ? "+ 
       "AND s.scan_start_time >= t.rtMin "+ 
       "AND s.scan_start_time <= t.rtMax "+ 
       "AND base_peak_mz >= t.mzMin "+ 
       "AND base_peak_mz <= t.mzMax", spectrumFeature_InputValues) 
print 'query took:',time.time()-time0,'seconds' 

time0 = time.time() 
spectrumAndFeature_ids = self.cursor.fetchall()  
print time.time()-time0,'seconds since to fetchall' 

Executing the SELECT statement takes about 50 seconds (acceptable). However, fetchall() takes 788 seconds and returns only 981 results.
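To see where the time goes, it can help to time `execute()` and `fetchall()` separately on a small reproducible case. The following is a minimal sketch using the standard-library `sqlite3` module and an in-memory database; the table `t`, its contents and the helper name `timed_query` are hypothetical and not taken from the question. Since SQLite produces result rows incrementally, part of the query's work may only happen during the fetch, so a slow `fetchall()` does not necessarily mean that copying rows into Python is the bottleneck.

```python
import sqlite3
import time


def timed_query(cursor, sql, params=()):
    """Run a query and return (rows, execute_seconds, fetch_seconds)."""
    start = time.perf_counter()
    cursor.execute(sql, params)
    execute_seconds = time.perf_counter() - start

    start = time.perf_counter()
    rows = cursor.fetchall()
    fetch_seconds = time.perf_counter() - start
    return rows, execute_seconds, fetch_seconds


# Hypothetical toy data, only here to make the sketch runnable.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, value REAL)")
cur.executemany(
    "INSERT INTO t (value) VALUES (?)",
    [(i * 0.5,) for i in range(1000)],
)

rows, execute_seconds, fetch_seconds = timed_query(
    cur, "SELECT id, value FROM t WHERE value >= ?", (250.0,)
)
print("execute took:", execute_seconds, "seconds")
print("fetchall took:", fetch_seconds, "seconds for", len(rows), "rows")
```

Comparing the sum of both timings with the time the same statement takes in the `sqlite3` command-line shell gives a rough idea of whether the Python module adds any overhead at all.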

The approach suggested in the answer to my question Is it normal that sqlite.fetchall() is so slow?, namely using fetchmany(), did not improve the speed of fetching the results.
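For reference, fetching in batches with `fetchmany()` can be sketched as below. This assumes the standard-library `sqlite3` module; the generator name `iter_rows`, the batch size and the toy table `t` are hypothetical. Batching mainly limits how many rows are held in memory at once and does not reduce the work the query itself has to do, which would be consistent with it not helping here.

```python
import sqlite3


def iter_rows(cursor, batch_size=500):
    """Yield rows from an already executed cursor, fetching them in batches."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            yield row


# Hypothetical toy data, only here to make the sketch runnable.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, value REAL)")
cur.executemany(
    "INSERT INTO t (value) VALUES (?)",
    [(float(i),) for i in range(981)],
)

cur.execute("SELECT id, value FROM t ORDER BY id")
fetched = list(iter_rows(cur, batch_size=100))
print(len(fetched), "rows fetched in batches")
```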

How can I speed up fetching the results after running a sqlite query?


This is exactly the SQL I tried to execute on the command line:

sqlite> SELECT spectrum_id, feature_table_id 
    ...> FROM spectrum AS s 
    ...> INNER JOIN feature AS f 
    ...> ON f.msrun_msrun_id = s.msrun_msrun_id 
    ...> INNER JOIN (SELECT feature_feature_table_id, min(rt) AS rtMin, max(rt) AS rtMax, min(mz) AS mzMin, max(mz) as mzMax 
    ...> FROM convexhull GROUP BY feature_feature_table_id) AS t 
    ...> ON t.feature_feature_table_id = f.feature_table_id 
    ...> WHERE s.msrun_msrun_id = 1 
    ...> AND s.scan_start_time >= t.rtMin 
    ...> AND s.scan_start_time <= t.rtMax 
    ...> AND base_peak_mz >= t.mzMin 
    ...> AND base_peak_mz <= t.mzMax; 

Update:

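I started running the query on the command line about 45 minutes ago and it is still busy, so it is very slow from the command line as well.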

+0

How long does the same query take when you execute it through the command-line client? – 2012-05-02 11:35:41

+0

Also, which sqlite3 Python module are you using? Which version? And which sqlite3 version does the module use? – 2012-05-02 11:44:12

+0

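I am using sqlite module version 2.6.3 and sqlite version 3.7.10. I tried to execute the command through SQLite Manager, but it does not seem to be able to cope with it. –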
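The two version numbers asked about above can be read from the standard-library `sqlite3` module roughly as follows. This is a small sketch; the variable names are hypothetical, and the module-level `version` attribute is deprecated or absent in newer Python releases, hence the guarded lookup.

```python
import sqlite3
import warnings

# Version of the SQLite C library that the Python module is linked against.
library_version = sqlite3.sqlite_version
library_version_info = sqlite3.sqlite_version_info

# Version of the Python module itself. Newer Python releases deprecate or
# drop this attribute, so fall back to "unknown" and silence the warning.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    module_version = getattr(sqlite3, "version", "unknown")

print("sqlite module:", module_version)
print("sqlite library:", library_version)
```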

Answers

1

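From reading this question, it sounds like you could benefit from using the APSW sqlite module. You may somehow be a victim of your sqlite module causing your query to be executed in a less performant way.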

I was curious, so I tried using apsw myself. It was not complicated. Why not give it a try?

To install it, I had to:

  1. Extract the latest version.
  2. Have the installation package fetch the latest sqlite amalgamation.

    python setup.py fetch --sqlite 
    
  3. Build and install.

    sudo python setup.py install 
    
  4. Use it in place of the other sqlite module.

    import apsw 
    <...> 
    conn = apsw.Connection('foo.db')
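To compare the two modules on the same statement, something like the following sketch could be used. It assumes apsw has been installed as in the steps above and falls back to the standard-library `sqlite3` module otherwise; the helper name `open_connection` and the toy `spectrum` table are hypothetical and only loosely modelled on the question's schema.

```python
import sqlite3

try:
    import apsw  # third-party module; assumed to be installed as described above
except ImportError:
    apsw = None


def open_connection(path):
    """Open `path` with apsw if it is available, otherwise with sqlite3."""
    if apsw is not None:
        return apsw.Connection(path)
    return sqlite3.connect(path)


# Hypothetical toy data, only here to make the sketch runnable.
conn = open_connection(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE spectrum (spectrum_id INTEGER PRIMARY KEY, msrun_msrun_id INTEGER)"
)
cur.executemany(
    "INSERT INTO spectrum (msrun_msrun_id) VALUES (?)",
    [(1,), (1,), (2,)],
)

# Iterating over the cursor returned by execute() works the same way in both modules.
rows = list(cur.execute(
    "SELECT spectrum_id FROM spectrum WHERE msrun_msrun_id = ? ORDER BY spectrum_id",
    (1,),
))
print(rows)
```

Keeping the rest of the code identical for both modules makes it easier to tell whether switching to apsw changes the timings at all.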