2017-08-04 98 views

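Looping over CSV contents in a jinja2 template

I want to build output from a jinja2 template, using a CSV file as input. I have searched but could not find much information to help me build a solution.

So far I have the following code: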

import csv
import jinja2

in_file = "csv_file.csv"
jinja_template = "test.j2"
env = jinja2.Environment(loader=jinja2.FileSystemLoader(searchpath="."))

# newline="" is what the csv module expects for files opened in text mode
with open(in_file, newline="") as fd:
    reader = csv.DictReader(fd)
    for vals in reader:
        # the template is rendered once per CSV row
        gen_template = env.get_template(jinja_template)
        output = gen_template.render(vals)
        print(output)

My CSV file looks like this:

country,city 
UK,London 
UK,Manchester 
UK,Liverpool 
US,Chicago 
US,Denver 
US,Atlanta 

And my Jinja2 template looks like this:

country: {{country}} has cities {{city}} 

This is the output I am trying to achieve:

country: UK has cities: London, Manchester, Liverpool 
country: US has cities: Chicago, Denver, Atlanta 

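I believe I need to run a loop in the j2 template so that the city names are collected next to their country.

When I run the code above, I actually get each country/city pair printed separately, like this: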

country: UK has cities London 
country: UK has cities Manchester 
country: UK has cities Liverpool 
country: US has cities Chicago 
country: US has cities Denver 
country: US has cities Atlanta 

I would appreciate it if someone could give me some pointers on how to do this.

Answers

Assuming your input CSV is sorted by country, itertools.groupby can help:

from io import StringIO
from jinja2 import Template
from itertools import groupby
from operator import itemgetter
from csv import DictReader


csv_data = '''country,city
UK,London
UK,Manchester
UK,Liverpool
US,Chicago
US,Denver
US,Atlanta
'''

tmpl = 'country: {{country}} has cities {{cities}}'
template = Template(tmpl)

with StringIO(csv_data) as file:
    rows = DictReader(file)
    # groupby only merges consecutive rows that share the same country
    for country, groups in groupby(rows, key=itemgetter('country')):
        cities = ', '.join(row['city'] for row in groups)
        output = template.render(country=country, cities=cities)
        print(output)

This prints:

country: UK has cities London, Manchester, Liverpool 
country: US has cities Chicago, Denver, Atlanta 
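
Because itertools.groupby only merges adjacent rows with the same key, an unsorted CSV would produce several separate groups for one country. The following is a minimal sketch of one way around that, sorting the rows by country before grouping. The helper name `group_cities` and the sample data `unsorted_csv` are made up for illustration and are not part of the original answer.

```python
from csv import DictReader
from io import StringIO
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data: rows for the same country are not adjacent.
unsorted_csv = """country,city
UK,London
US,Chicago
UK,Manchester
US,Denver
"""


def group_cities(csv_text):
    """Return a dict mapping each country to its list of cities."""
    by_country = itemgetter("country")
    # sorted() is stable, so cities keep their file order within a country.
    rows = sorted(DictReader(StringIO(csv_text)), key=by_country)
    return {
        country: [row["city"] for row in group]
        for country, group in groupby(rows, key=by_country)
    }


grouped = group_cities(unsorted_csv)
print(grouped)
```

Sorting reads the whole file into memory first, which should be fine for small inputs like the one in the question.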

If you prefer to do the join inside Jinja, this is an option:

tmpl = 'country: {{country}} has cities {{cities | join(", ")}}'
template = Template(tmpl)

with StringIO(csv_data) as file:
    rows = DictReader(file)
    for country, groups in groupby(rows, key=itemgetter('country')):
        # pass the cities as an iterable and let the template join them
        cities = (row['city'] for row in groups)
        output = template.render(country=country, cities=cities)
        print(output)

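If you need to add a header, you first have to collect all the data from your file (done here with an OrderedDict) and then render the template only once: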

from collections import OrderedDict

tmpl = '''Countries and cities:
{%- for country, cities in grouped.items() %}
country: {{country}} has cities {{cities | join(", ")}}
{%- endfor %}'''
template = Template(tmpl)

with StringIO(csv_data) as file:
    rows = DictReader(file)
    grouped = OrderedDict()
    for country, groups in groupby(rows, key=itemgetter('country')):
        grouped[country] = [item['city'] for item in groups]
    # render once, after all groups have been collected
    output = template.render(grouped=grouped)
    print(output)

This then yields:

Countries and cities:
country: UK has cities London, Manchester, Liverpool
country: US has cities Chicago, Denver, Atlanta
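
As an alternative that is closer to the question's idea of looping inside the template, Jinja2 also ships a built-in groupby filter, so the grouping itself can be done in the template. The sketch below assumes the CSV rows are passed in as a list under the name `records`; that name and the interleaved sample data are chosen here for illustration. The filter sorts by the grouping key itself, so the input does not need to be sorted beforehand.

```python
from csv import DictReader
from io import StringIO

from jinja2 import Template

# Illustrative data with the countries deliberately interleaved.
csv_data = """country,city
UK,London
US,Chicago
UK,Manchester
US,Denver
UK,Liverpool
US,Atlanta
"""

# groupby yields (grouper, list) pairs; map(attribute=...) also works on dict keys.
tmpl = '''Countries and cities:
{%- for country, group in records | groupby("country") %}
country: {{ country }} has cities {{ group | map(attribute="city") | join(", ") }}
{%- endfor %}'''

records = list(DictReader(StringIO(csv_data)))
output = Template(tmpl).render(records=records)
print(output)
```

Since the template is rendered only once, the header line appears a single time regardless of how many countries the file contains.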
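Looks good, I will read more about itertools. One question though: if I add another piece of text to the template file, even just a header, it is printed multiple times, once for each group of content in the CSV file. How can I prevent that? – zig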

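This is what it looks like if I add a header to the template file: 'This is the header of the file:' 'country: UK has cities: London, Manchester, Liverpool' 'This is the header of the file:' 'country: US has cities: Chicago, Denver, Atlanta' – zig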

OK, I gave it a try. Hope this is what you need... –