biopython

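Popularity: 1

Answers: 2

I'm working with a large FASTA file that I want to split into multiple files by gene ID. I tried to use the script from the Biopython tutorial: def batch_iterator(iterator, batch_size): """Returns lists of length batch_size. This can be used on any iterator, for example to ba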
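The excerpt cuts off partway through `batch_iterator`. As a rough sketch of the batching idea only (not the tutorial's exact code), the same behavior can be written with `itertools.islice`. Note that batching groups records by count, not by gene ID; the `group_by_key` helper and its `key` argument are hypothetical additions for illustration, and tuples stand in for real records.

```python
from collections import defaultdict
from itertools import islice


def batch_iterator(iterator, batch_size):
    """Yield lists of up to batch_size items from any iterable."""
    iterator = iter(iterator)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def group_by_key(records, key):
    """Hypothetical helper: collect records into lists that share key(record)."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)
    return dict(groups)


# Stand-in data: (gene_id, sequence) tuples instead of real SeqRecord objects.
records = [("geneA", "ACGT"), ("geneB", "GGCC"), ("geneA", "TTAA")]
batches = list(batch_iterator(records, 2))
by_gene = group_by_key(records, key=lambda rec: rec[0])
```

Each list in `by_gene` could then be written to its own output file, one file per gene ID.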

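Popularity: 1

Answers: 1

How do I install and use code from GitHub for next-generation sequencing data analysis?

From the VIP README.md file: Installation. The steps to install VIP on a machine are as follows: > git clone https://github.com/keylabivdc/VIP > cd VIP > cd installer > chmod 755 * > sudo sh dependency_installer.sh > sudo sh db_installer.sh -r [PATH]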

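Popularity: 2

Answers: 1

Removing items from a dictionary generated by SeqIO.index

I'm using Python 2.6.6, and I'm trying to remove the reads in file2 that overlap with (i.e. are identical to) reads in file1. Here is the code I want to implement: ref_reads = SeqIO.index("file1.fastq", "fastq") spk_reads = SeqIO.index("file2.fastq", "fastq") for spk in spk_reads: if spk in ref_r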
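The object returned by `SeqIO.index` is a read-only, dictionary-like view of the file, so entries cannot be deleted from it. One workaround is to compute which keys to keep instead. A minimal sketch, assuming plain dicts stand in for the two indexes and that "identical" means sharing a read ID; the helper name `ids_to_keep` is hypothetical:

```python
def ids_to_keep(spk_reads, ref_reads):
    """Return IDs present in spk_reads but absent from ref_reads, in order."""
    return [read_id for read_id in spk_reads if read_id not in ref_reads]


# Plain dicts stand in for the SeqIO.index objects (assumption for illustration).
ref_reads = {"read1": "ACGT", "read2": "GGCC"}
spk_reads = {"read2": "GGCC", "read3": "TTAA", "read4": "CCGG"}
kept = ids_to_keep(spk_reads, ref_reads)
```

The surviving IDs could then be used to look up the records and write them to a new file.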

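Popularity: 0

Answers: 1

Deleting the first record from a FASTA file in Python

I have small FASTA files in the following format: >gene_1 + other data seq 1 >gene_1 + other data seq2 >gene_1 + other data seq3 I want to delete the first element of the file. This is part of a large Python script: once I have used that seq and extracted the interesting part of it, I want to delete it from the file. Eventually the file will be emptied, so I can delete it from the folder. Because I have been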
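Since the files are small, one simple approach is to read the text, drop everything before the second header line, and write the remainder back. A minimal plain-Python sketch (no Biopython); the function name `drop_first_record` is hypothetical and writing the result back to disk is left to the caller:

```python
def drop_first_record(fasta_text):
    """Return fasta_text without its first record (header plus sequence lines)."""
    lines = fasta_text.splitlines(keepends=True)
    header_count = 0
    for index, line in enumerate(lines):
        if line.startswith(">"):
            header_count += 1
            if header_count == 2:
                # Everything from the second header onward survives.
                return "".join(lines[index:])
    return ""  # zero or one record: nothing is left


example = ">gene_1\nACGT\nTTGA\n>gene_2\nGGCC\n>gene_3\nAATT\n"
remaining = drop_first_record(example)
```

An empty return value signals that the file held at most one record and can be removed.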

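Popularity: 1

Answers: 1

Biopython: MMTFParser cannot find the distance between atoms

I'm using Biopython to find the distance between the C atoms of two residues, and I keep getting an error. Here are my code and the error: ``` >>> from Bio.PDB.mmtf import MMTFParser >>> structure = MMTFParser.get_structure_from_url('4mne') /Library/Python/2.7/site-packages/Bio/PDB/St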

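Popularity: 0

Answers: 1

Biopython and retrieving full journal names

I'm using Biopython with Python 3.x to search the PubMed database. I get the search results correctly, but next I need to extract the journal names (full names, not just abbreviations) for all of the search results. Currently I use the following code: from Bio import Entrez from Bio import Medline Entrez.email = "[email protected]" handle = Entrez.esea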

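Popularity: 0

Answers: 1

Biopython's ESearch does not give me the full IdList

I'm trying to search for some articles using the following code: handle = Entrez.esearch(db="pubmed", term="lung+cancer") record = Entrez.read(handle) From record['Count'] I can see there are 293279 results, but when I look at record['IdList'] it gives me only 20 IDs. Why is that? How do I get all 293279 records?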
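ESearch returns only the first 20 IDs by default. The E-utilities `retmax` and `retstart` parameters control how many IDs come back and where the window starts, and `Entrez.esearch` forwards them as keyword arguments. The sketch below only computes the paging windows and makes no network calls; the helper name `esearch_pages` and the page size of 10000 are assumptions, and NCBI limits how many records a single query can return, so check the current limits before relying on it.

```python
def esearch_pages(total_count, page_size=10000):
    """Yield dicts of retstart/retmax values that together cover total_count results."""
    for start in range(0, total_count, page_size):
        yield {"retstart": start, "retmax": min(page_size, total_count - start)}


pages = list(esearch_pages(293279))
# Each dict could be passed on to the search call, for example:
# Entrez.esearch(db="pubmed", term="lung+cancer", **page)
```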

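Popularity: 0

Answers: 1

Python: converting a gzip file to a normal file

I have an .ent file compressed to .gz. I need to read it and feed it to a Biopython parser. The problem is that the parser needs a file path or a file object, but I have a gzip file instead. Right now I convert it like this: file_path = 'file.ent.gz' # path to current file file = gzip.open(file_path, 'rb') content = file.read() # its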
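`gzip.open` already returns a file object, and opening it in text mode (`"rt"`) yields decoded lines, which is what text-based parsers generally expect. A minimal sketch, assuming the parser accepts an open handle; the temporary file and its two-line content are made up for the demo:

```python
import gzip
import os
import tempfile

# Create a small gzip-compressed file to stand in for 'file.ent.gz'.
tmp_dir = tempfile.mkdtemp()
file_path = os.path.join(tmp_dir, "file.ent.gz")
with gzip.open(file_path, "wt") as out_handle:
    out_handle.write("HEADER    DEMO ENTRY\nEND\n")

# Text mode gives str lines, so there is no need to decompress to disk first.
with gzip.open(file_path, "rt") as handle:
    # A parser that takes a handle would be called here instead of readlines().
    lines = handle.readlines()
```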

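Popularity: 1

Answers: 1

Weblogo - selenocysteine letter

I want to generate a logo that includes selenocysteine, but when I select the reduced_protein_alphabet option I get an error about repeated letters: weblogo -f sc.txt -D fasta -o sc_logo -F pdf -a reduced_protein_alphabet -s large -n 100 -c chemistry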

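Popularity: 1

Answers: 1

Using Biopython to find and extract FASTA records that exactly match a DNA sequence

I'm trying to use Biopython to extract all DNA sequences from a FASTA file that contain a match to the following short DNA sequence: "GGCTCAACCCTGGA". Here is what I have so far: from Bio import SeqIO source = "rep_set_no_spaces.fasta" outfile = "rep_set_PNA_matches.fasta" seq1 = "GGCTCAACCCTGGA" #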
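The core of this task is a substring test on each record's full sequence. The sketch below uses a small hand-written FASTA reader instead of `SeqIO` so it runs without Biopython; `read_fasta` and `records_containing` are hypothetical names, and only the forward strand is checked (no reverse complement).

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def records_containing(lines, motif):
    """Return records whose joined sequence contains motif (case-insensitive)."""
    motif = motif.upper()
    return [(h, s) for h, s in read_fasta(lines) if motif in s.upper()]


# The motif spans a line break in seq1, so sequence lines must be joined first.
fasta_lines = [">seq1", "AAGGCTCAAC", "CCTGGATT", ">seq2", "ACGTACGT"]
matches = records_containing(fasta_lines, "GGCTCAACCCTGGA")
```

With Biopython, the same test would be applied to each record's sequence while iterating over the parsed file, and the matching records written out afterward.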