BioPython: How to convert the amino acid alphabet to

2019-06-12 17:55:14 Author: Internet

When discussing how to import sequence data with Bio.SeqIO.parse(), the Biopython Cookbook states:

There is an optional argument alphabet to specify the alphabet to be used. This is useful for file formats like FASTA where otherwise Bio.SeqIO will default to a generic alphabet.

How do I add this optional argument? I have the following code:

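from os.path import abspath
from Bio import SeqIO

# f_path holds the path to the FASTA file (defined elsewhere).
# Plain text mode replaces the deprecated "rU" mode, and the
# with-block closes the handle automatically.
with open(f_path, "r") as handle:
    records = list(SeqIO.parse(handle, "fasta"))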

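This imports a large FASTA file from the UniProt database. The problem is that the sequences end up with the generic SingleLetterAlphabet class. How do I convert between SingleLetterAlphabet and ExtendedIUPACProtein?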

The ultimate goal is to search these sequences for motifs such as GxxxG.

Solution:

Like this:

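# Import the required alphabet.
# Note: Bio.Alphabet is only available in Biopython releases before 1.78.
from Bio.Alphabet import IUPAC

# Pass the imported alphabet to `SeqIO.parse` via its `alphabet` argument:
records = list(SeqIO.parse(handle, 'fasta', alphabet=IUPAC.extended_protein))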
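For the stated goal of finding motifs such as GxxxG, one possible approach is a regular expression over the plain sequence string (for a parsed record, `str(record.seq)`). The sketch below is not part of the original answer: the helper name `find_motif` and the sample sequence are made up for illustration, and it assumes "x" means any single residue. It uses only the standard library, so it does not depend on the alphabet at all.

```python
import re

# Hypothetical helper, for illustration only.
# The lookahead (?=...) lets overlapping occurrences all be reported,
# e.g. the G that ends one GxxxG can also start the next one.
GXXXG = re.compile(r"(?=(G.{3}G))")


def find_motif(sequence, pattern=GXXXG):
    """Return (0-based start, matched text) for every motif occurrence."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]


# Made-up sample sequence; with Biopython records one would pass str(record.seq).
hits = find_motif("AGAAAGLLLG")
print(hits)
```

Applied to the parsed records, this could be called once per record to collect the start positions of each match.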

Tags: python, bioinformatics, biopython
Source: https://codeday.me/bug/20190612/1227418.html