Biopython: How can I exclude specific amino acids from a protein so as to plot a Ramachandran plot?
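I have written a Python script to plot the Ramachandran plot of the protein ubiquitin. I am using Biopython and reading the structure from a PDB file. My script is as follows: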
import Bio.PDB
import numpy as np
import matplotlib.pyplot as plt

# Empty (0, 2) array; one (phi, psi) row is appended per residue
phi_psi = np.empty((0, 2))
pdb1 = '/home/devanandt/Documents/VMD/1UBQ.pdb'
for model in Bio.PDB.PDBParser().get_structure('1UBQ', pdb1):
    for chain in model:
        polypeptides = Bio.PDB.PPBuilder().build_peptides(chain)
        for poly_index, poly in enumerate(polypeptides):
            print("Model %s Chain %s" % (str(model.id), str(chain.id)), end=" ")
            print("(part %i of %i)" % (poly_index + 1, len(polypeptides)), end=" ")
            print("length %i" % (len(poly)), end=" ")
            print("from %s%i" % (poly[0].resname, poly[0].id[1]), end=" ")
            print("to %s%i" % (poly[-1].resname, poly[-1].id[1]))
            # Use a separate name so the accumulated array is not overwritten
            phi_psi_ = poly.get_phi_psi_list()
            for res_index, residue in enumerate(poly):
                #res_name = "%s%i" % (residue.resname, residue.id[1])
                #print(res_name, phi_psi_[res_index])
                # astype(float) converts the object array to floats;
                # undefined angles (None) become NaN
                phi_psi = np.vstack([phi_psi,
                                     np.asarray(phi_psi_[res_index])]).astype(float)
phi, psi = np.transpose(phi_psi)
phi = np.degrees(phi)
psi = np.degrees(psi)
# Avoid NaN: use one shared mask so each phi stays paired with its own psi
valid = ~np.isnan(phi) & ~np.isnan(psi)
phi = phi[valid]
psi = psi[valid]
f, ax = plt.subplots(1)
plt.title('Ramachandran Plot for Ubiquitin')
# Raw strings keep the LaTeX backslashes; size and fontsize are aliases, so pass only one
plt.xlabel(r'$\phi^o$', fontsize=15)
plt.ylabel(r'$\psi^o$', fontsize=15)
h = ax.hexbin(phi, psi, extent=[-180, 180, -180, 180], cmap=plt.cm.Blues)
#h = ax.hexbin(phi, psi, gridsize=35, extent=[-180, 180, -180, 180], cmap=plt.cm.Blues)
f.colorbar(h)
plt.grid()
plt.show()
I want to modify this code so that it ignores the glycine (GLY) residues and then plots the Ramachandran plot. My output is as follows:
Solution:
You can remove the GLY residues after recording their indices:
for poly_index, poly in enumerate(polypeptides):
    gly_index = [i for i, res in enumerate(poly) if res.get_resname() == "GLY"]
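After the main loop and the phi/psi calculation, delete those points from the array. Note that gly_index holds positions within a single polypeptide, so this assumes the rows of phi_psi are that polypeptide's residues in the same order: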
new_phi_psi = np.delete(phi_psi, gly_index, 0)
phi, psi = np.transpose(new_phi_psi)
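An alternative to deleting rows by index is to skip the unwanted residues while the angles are being collected, which avoids any dependence on row order. The following is only a minimal sketch in plain Python; the helper name phi_psi_without and the sample values are hypothetical and are not taken from the original answer or from 1UBQ. It assumes the angles are (phi, psi) tuples in radians with None for undefined values, which is the shape of the list that get_phi_psi_list() is used for in the question.

```python
import math


def phi_psi_without(resnames, phi_psi, exclude=("GLY",)):
    """Return (phi, psi) pairs in degrees, skipping excluded residue names
    and residues whose phi or psi is undefined (None)."""
    kept = []
    for name, (phi, psi) in zip(resnames, phi_psi):
        if name in exclude:
            continue  # residue type the caller asked to leave out
        if phi is None or psi is None:
            continue  # angle undefined, e.g. at a chain end
        kept.append((math.degrees(phi), math.degrees(psi)))
    return kept


# Made-up illustrative values in radians (not real ubiquitin angles)
resnames = ["MET", "GLN", "GLY", "ILE"]
angles = [(None, 2.0), (-1.0, 2.5), (1.5, 0.3), (-2.0, None)]
kept = phi_psi_without(resnames, angles)
print(kept)
```

With Biopython, the two inputs could be built per polypeptide as [res.get_resname() for res in poly] and poly.get_phi_psi_list(). Because each pair is dropped as a whole, phi and psi stay matched and no separate NaN step is needed.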
Remove the step where you get rid of the NaNs. Now plot the points to get something like this:
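# First set of angles in blue
h=ax.hexbin(phi, psi, extent=[-180,180,-180,180],cmap=plt.cm.Blues)
# n_phi and n_psi are not defined in the snippets above; presumably a second set of angles overlaid in red
h=ax.hexbin(n_phi, n_psi, extent=[-180,180,-180,180],cmap=plt.cm.Reds)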
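The two hexbin calls suggest overlaying two sets of points in different colours, but the answer does not show how the second set is built. One plausible reading, stated here as an assumption, is that the glycine and non-glycine residues are plotted separately. A minimal sketch of such a split follows; the helper name split_by_residue and the sample values are hypothetical.

```python
def split_by_residue(resnames, points, target="GLY"):
    """Partition points into (matching, others) by residue name."""
    matching, others = [], []
    for name, point in zip(resnames, points):
        if name == target:
            matching.append(point)
        else:
            others.append(point)
    return matching, others


# Made-up (phi, psi) pairs in degrees, one per residue
resnames = ["GLN", "GLY", "ILE", "GLY"]
points = [(-60.0, -45.0), (80.0, 10.0), (-120.0, 130.0), (95.0, -170.0)]
gly_points, other_points = split_by_residue(resnames, points)
print(gly_points, other_points)
```

Each group can then be unpacked with zip(*group) into its own phi and psi sequences and passed to a separate ax.hexbin call with its own colormap.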
Tags: biopython, python, bioinformatics, protein-database
Source: https://codeday.me/bug/20190830/1768132.html