2016-08-25 35 views
1

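Reading and running a Postgres script with Python

I have Postgres tables and want to run a PostgreSQL script file against them from Python, then write the query results to a CSV file. The script file contains multiple queries separated by semicolons (;). A sample script is shown below.

Script file: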

--Duplication Check 
select p.*, c.name 
from scale_polygons_v3 c inner join cartographic_v3 p 
on (metaphone(c.name_displ, 20) LIKE metaphone(p.name, 20)) AND c.kind NOT IN (9,10) 
where ST_Contains(c.geom, p.geom); 

--Area Check 
select sp.areaid,sp.name_displ,p.road_id,p.name 
from scale_polygons_v3 sp, pak_roads_20162207 p 
where st_contains(sp.geom,p.geom) and sp.kind = 1 
and p.areaid != sp.areaid; 

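When I run the Python code, it executes successfully without any errors, but the problem is with writing the query results to a CSV file. Only the result of the last executed query is written to the CSV file. In other words, the first query's result is overwritten by the second, the second by the third, and so on until the last query.

Here is my Python code: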

import psycopg2 
import sys 
import csv 
import datetime, time 

def run_sql_file(filename, connection): 
    '''
    The function takes a filename and a connection as input
    and will run the SQL query on the given connection
    '''
    start = time.time() 

    file = open(filename, 'r') 
    sql = s = " ".join(file.readlines()) 
    #sql = sql1[3:] 
    print "Start executing: " + " at " + str(datetime.datetime.now().strftime("%Y-%m-%d %H:%M")) + "\n" 
    print "Query:\n", sql + "\n" 
    cursor = connection.cursor() 
    cursor.execute(sql) 
    records = cursor.fetchall() 
    with open('Report.csv', 'a') as f: 
     writer = csv.writer(f, delimiter=',') 
     for row in records: 
      writer.writerow(row) 
    connection.commit() 
    end = time.time() 
    row_count = sum(1 for row in records) 
    print "Done Executing:", filename 
    print "Number of rows returned:", row_count 
    print "Time elapsed to run the query:",str((end - start)*1000) + ' ms' 
    print "\t ===============================" 

def main():  
    connection = psycopg2.connect("host='localhost' dbname='central' user='postgres' password='tpltrakker'") 
    run_sql_file("script.sql", connection) 
    connection.close() 

if __name__ == "__main__": 
    main() 

What is wrong with my code?

+0

This may not help, but the code looks fine. You are opening the file in mode 'a', just as is done [here](http://stackoverflow.com/questions/2363731/append-new-row-to-old-csv-file-python) – Matthias

+0

When a script is executed in a single call, you only get the result of the last executed command (or an error, if one occurred). – Abelisto

+0

Do all the queries return the same number of columns, of the same types (and in the same order)? –

Answers
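The point raised in the comments can be sketched as follows: since a single `execute()` call on a multi-statement string only exposes the last result, the script can be split into statements that are executed one at a time, with each result appended to the CSV. This is a minimal Python 3 sketch, not a definitive fix. The names `split_statements` and `run_statements` are hypothetical, the splitting is deliberately naive (it assumes no semicolons inside string literals or comments), and every statement is assumed to return rows.

```python
import csv


def split_statements(script):
    """Naively split an SQL script into statements on ';'.

    Assumption: no semicolons inside string literals or comments.
    Lines that start with '--' are dropped.
    """
    lines = [ln for ln in script.splitlines()
             if not ln.strip().startswith("--")]
    parts = "\n".join(lines).split(";")
    return [p.strip() for p in parts if p.strip()]


def run_statements(cursor, script, csv_path):
    """Execute each statement separately and append its rows to csv_path.

    `cursor` is expected to be a DB-API cursor, such as the one returned
    by psycopg2's connection.cursor(). Returns the total row count.
    """
    total = 0
    with open(csv_path, "a", newline="") as f:
        writer = csv.writer(f)
        for statement in split_statements(script):
            cursor.execute(statement)
            rows = cursor.fetchall()
            writer.writerows(rows)
            total += len(rows)
    return total
```

If the queries return different columns, as the last comment suggests they might, a single CSV will mix row shapes, so one file per query may be easier to read.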

1

The simplest approach is to write the output of each query to a different file, using copy_expert:

query = ''' 
    select p.*, c.name 
    from 
     scale_polygons_v3 c 
     inner join 
     cartographic_v3 p on metaphone(c.name_displ, 20) LIKE metaphone(p.name, 20) and c.kind not in (9,10) 
    where ST_Contains(c.geom, p.geom) 
''' 
copy = "copy ({}) to stdout (format csv)".format(query) 
f = open('Report.csv', 'wb') 
cursor.copy_expert(copy, f, size=8192) 
f.close() 

query = ''' 
    select sp.areaid,sp.name_displ,p.road_id,p.name 
    from scale_polygons_v3 sp, pak_roads_20162207 p 
    where st_contains(sp.geom,p.geom) and sp.kind = 1 and p.areaid != sp.areaid
''' 
copy = "copy ({}) to stdout (format csv)".format(query) 
f = open('Report2.csv', 'wb') 
cursor.copy_expert(copy, f, size=8192) 
f.close() 

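If you want to append the second output to the same file, just keep the first file object open and pass it to the second copy_expert call.

Note that copy has to output to stdout for the result to be available to copy_expert.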
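The single-file variant described above might look roughly like the sketch below. It rests on assumptions: `build_copy_statement` and `export_queries` are hypothetical helper names, `cursor` is a psycopg2 cursor, and trailing semicolons are stripped because a semicolon inside `copy (...)` is a syntax error.

```python
def build_copy_statement(query):
    """Wrap a SELECT in COPY ... TO STDOUT, dropping any trailing ';'."""
    cleaned = query.strip().rstrip(";").strip()
    return "copy ({}) to stdout (format csv)".format(cleaned)


def export_queries(cursor, queries, path):
    """Write the CSV output of every query to one file, in order.

    `cursor` is assumed to be a psycopg2 cursor; copy_expert(sql, file)
    streams the COPY output into the open file object.
    """
    # One file object stays open for all queries, so results accumulate.
    with open(path, "wb") as f:
        for query in queries:
            cursor.copy_expert(build_copy_statement(query), f)
```

It would be called with something like `export_queries(connection.cursor(), [first_query, second_query], 'Report.csv')`, where the two query variables stand in for the SQL strings shown above.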

+0

This works fine for me, but I do not understand the code, and I need to make some further modifications –

+1

@ShahzadBacha Do you mean why 'copy' goes to 'stdout'? I updated the answer. Otherwise, which part do you not understand? –

+0

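Actually, this should be a separate question, but anyway: how can I modify the code to also count the total number of rows returned by a query, and to write only a limited number of rows, as with "LIMIT 10" at the end of the query? –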
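One possible way to approach this, as a sketch only: wrap the original query in a subquery, so that a row count and a `LIMIT` can be applied without editing the query text. The helper names below are hypothetical, and the query is assumed to be a single SELECT without a trailing semicolon.

```python
def count_statement(query):
    """Build a statement that returns the total row count of `query`."""
    return "select count(*) from ({}) as q".format(query)


def limited_copy_statement(query, limit):
    """Build a COPY statement that exports at most `limit` rows as CSV."""
    limited = "select * from ({}) as q limit {:d}".format(query, int(limit))
    return "copy ({}) to stdout (format csv)".format(limited)
```

With a psycopg2 cursor, the count could then be read with `cursor.execute(count_statement(query))` followed by `cursor.fetchone()[0]`, and the limited export written with `cursor.copy_expert(limited_copy_statement(query, 10), f)`. Note that this runs the query twice.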

1

If you are able to change the SQL script a little, here is a workaround:

#!/usr/bin/env python 

import psycopg2 

script = ''' 
    declare cur1 cursor for 
     select * from (values(1,2),(3,4)) as t(x,y); 

    declare cur2 cursor for 
     select 'a','b','c'; 
    ''' 
print script 

conn = psycopg2.connect('')

# Cursors exist and are available only inside a transaction
conn.autocommit = False

# Create the cursors from the script
conn.cursor().execute(script)

# Read the names of the cursors
cursors = conn.cursor()
cursors.execute('select name from pg_cursors;') 
cur_names = cursors.fetchall() 

# Read data from each available cursor 
for cname in cur_names: 
    print cname[0] 
    cur = conn.cursor() 
    cur.execute('fetch all from ' + cname[0]) 
    rows = cur.fetchall() 
    # Here you can save the data to the file 
    print rows 


conn.rollback() 

print 'done' 

Disclaimer: I am new to Python.
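The "save the data to the file" step in the loop above could be filled in along these lines. This is a Python 3 sketch under assumptions: `save_rows` is a hypothetical helper, and each cursor's rows go to a separate CSV file named after the cursor.

```python
import csv
import os


def save_rows(directory, cursor_name, rows):
    """Write `rows` to <directory>/<cursor_name>.csv and return the path."""
    path = os.path.join(directory, cursor_name + ".csv")
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return path
```

Inside the loop it would be called as `save_rows('.', cname[0], rows)`, producing files such as `cur1.csv` and `cur2.csv` for the cursors declared in the script.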