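Convert CSV to Newick tree
So I have a CSV file in which each row represents hierarchical data in the form: 'Phylum','Class','Order','Family','Genus','Species','Subspecies','unique_gi'
I would like to convert it to the classic Newick tree format without distances. Either a novel approach or a Python package would be great. Thanks!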
You can build a tree from the CSV with some simple Python and then write it out as a Newick tree. Not sure whether this is what you are after or not.
    import csv
    from collections import defaultdict
    from pprint import pprint

    def tree():
        # Autovivifying tree: any missing key becomes a new empty subtree.
        return defaultdict(tree)

    def tree_add(t, path):
        # Walk down the tree along `path`, creating nodes as needed.
        for node in path:
            t = t[node]

    def pprint_tree(tree_instance):
        # Convert nested defaultdicts to plain dicts so pprint is readable.
        def dicts(t):
            return {k: dicts(t[k]) for k in t}
        pprint(dicts(tree_instance))

    def csv_to_tree(lines):
        # Each CSV row is one root-to-leaf path; fields are single-quoted.
        t = tree()
        for row in csv.reader(lines, quotechar='\''):
            tree_add(t, row)
        return t

    def tree_to_newick(root):
        # Children are written as "(subtree)name" and joined with commas.
        items = []
        for k in root:
            s = ''
            if root[k]:
                sub_tree = tree_to_newick(root[k])
                if sub_tree != '':
                    s += '(' + sub_tree + ')'
            s += k
            items.append(s)
        return ','.join(items)

    def csv_to_weightless_newick(lines):
        t = csv_to_tree(lines)
        # pprint_tree(t)
        return tree_to_newick(t)

    if __name__ == '__main__':
        # see https://docs.python.org/2/library/csv.html to read CSV file
        lines = [
            "'Phylum','Class','Order','Family','Genus','Species','Subspecies','unique_gi'",
            "'Phylum','Class','Order','example'",
            "'Another','Test'",
        ]
        print(csv_to_weightless_newick(lines))
Example output (sibling order follows dict ordering, so it can differ between Python versions):
$ python ~/tmp/newick_tree.py
(((example,((((unique_gi)Subspecies)Species)Genus)Family)Order)Class)Phylum,(Test)Another
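The script above feeds a hard-coded list of lines to `csv.reader`. For a real file, a minimal sketch could look like the following, assuming Python 3. The names `lines_to_newick`, `_serialise` and `csv_file_to_newick` are made up for illustration, and skipping empty fields is only an assumption about how a missing rank should be handled.

```python
import csv


def lines_to_newick(lines, quotechar="'"):
    """Turn an iterable of CSV lines into a distance-free Newick fragment."""
    root = {}
    for row in csv.reader(lines, quotechar=quotechar):
        node = root
        for name in row:
            name = name.strip()
            if name:  # assumption: an empty field means "rank not given"
                node = node.setdefault(name, {})
    return _serialise(root)


def _serialise(children):
    """Recursively write child nodes as '(grandchildren)name', comma-joined."""
    parts = []
    for name, grandchildren in children.items():
        inner = _serialise(grandchildren)
        parts.append("(" + inner + ")" + name if inner else name)
    return ",".join(parts)


def csv_file_to_newick(path, quotechar="'"):
    """Read the CSV file at `path` and return its Newick fragment."""
    # newline="" is how the csv module documentation recommends opening files.
    with open(path, newline="") as handle:
        return lines_to_newick(handle, quotechar=quotechar)
```

Calling `print(csv_file_to_newick(sys.argv[1]))` from an `if __name__ == '__main__':` block (after `import sys`) would then let the script be run as `python newick_tree.py file.csv`.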
Also, this library looks cool and lets you visualize your trees: http://biopython.org/wiki/Phylo
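One caveat if the result is going to be handed to a parser: the Newick format describes a single rooted tree terminated by a semicolon, while the string printed above has two top-level nodes and no terminator, so stricter parsers may reject it. A rough sketch of one way to handle that, assuming the tree is held as nested dicts. `quote_label`, `subtree` and `strict_newick` are hypothetical names, and the quoting rule follows the usual description of the format rather than any particular library.

```python
import re

# Characters that, per the usual description of the Newick format, may not
# appear in an unquoted label: whitespace, parentheses, square brackets,
# single quotes, colon, semicolon and comma.
_NEEDS_QUOTES = re.compile(r"[\s()\[\]':;,]")


def quote_label(label):
    """Single-quote a label if needed, doubling any embedded single quotes."""
    if _NEEDS_QUOTES.search(label):
        return "'" + label.replace("'", "''") + "'"
    return label


def subtree(children):
    """Serialise a {name: children} mapping as a comma-separated fragment."""
    parts = []
    for name, grandchildren in children.items():
        inner = subtree(grandchildren)
        label = quote_label(name)
        parts.append("(" + inner + ")" + label if inner else label)
    return ",".join(parts)


def strict_newick(tree, root_label=""):
    """Wrap all top-level nodes under one root and append the final ';'."""
    return "(" + subtree(tree) + ")" + quote_label(root_label) + ";"


if __name__ == "__main__":
    example = {"Phylum": {"Class": {"Homo sapiens": {}}}, "Another": {"Test": {}}}
    print(strict_newick(example))
```

The unnamed root is a design choice: it keeps every original label intact while giving the two top-level lineages a common parent.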
Thanks! Awesome. – 2014-10-02 06:41:22
@MarkWatson, is 'python newick_tree.py file.csv' the right command line? – user3184877 2017-05-25 15:04:26
Cross-posted: https://www.biostars.org/p/114387 – Pierre 2014-10-01 19:52:48