Plotting gmx_MMPBSA Results with Python and Matplotlib

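I have been using GROMACS for a long time, but only at a fairly basic level, without much knowledge of the more advanced techniques. I had never even tried MMPBSA analysis, which is almost obligatory for biological systems. Over this break I read up on MMPBSA and learned that the method is not limited to binding free energies of protein-ligand systems; it can be applied to any dimer. In addition, Jerkwin has developed gmx_mmpbsa, a calculation script that is simpler and easier to use than the two scripts/programs commonly used before, GMXMMPBSA and g_mmpbsa. gmx_mmpbsa places no version requirements on GROMACS or APBS, and it produces all results in a single run, which greatly lowers the cost of learning MMPBSA.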

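Jerkwin's blog provides detailed usage instructions and a comparison with the results of the other two methods, and the latest version of the gmx_mmpbsa script can be found in his GitHub repository.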

GROMACS and APBS must be installed before using gmx_mmpbsa; the script calls both programs automatically. On Ubuntu, APBS can be installed directly with sudo apt install apbs. When editing the variables inside the script, if both GROMACS and APBS are already on the PATH, they can be abbreviated to gmx='gmx' and apbs='apbs'.

If errors about undefined awk functions appear while the script runs, gawk also needs to be installed, which can be done with sudo apt install gawk.
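
Before launching a calculation, it can help to confirm that the three programs mentioned above are actually reachable from the shell. The following is a minimal sketch, not part of gmx_mmpbsa itself; the helper name find_missing_tools is hypothetical, and the executable names gmx, apbs and gawk are assumed to match the local installation.

```python
import shutil


def find_missing_tools(tools=("gmx", "apbs", "gawk")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("gmx, apbs and gawk are all available on PATH.")
```

If a program is reported as missing, either install it or put its full path into the corresponding variable inside the script instead of the short form.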

After the calculation finishes, a series of result files is generated. I wrote a Python script to plot them, which was also a chance to review Pandas and Matplotlib. The pie chart styling comes from Lemonbit's Zhihu column article.

# This script is designed to plot the MMPBSA calculation results
# from Jerkwin's gmx_mmpbsa script
# Author: Lewisbase
# Date: 2020.02.29

import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 
from matplotlib import font_manager as fm
from matplotlib import cm

def readindata(filename):
    with open(filename) as f:
        text = f.readlines()
    index = []
    data = np.zeros([len(text)-1,len(text[0].split())-1])
    for i in range(1,len(text)):
        index.append(text[i].split()[0])
        for j in range(1,len(text[i].split())):
            if text[i].split()[j] == '|':
                data[i-1][j-1] = np.nan
            else:
                data[i-1][j-1]=float(text[i].split()[j])
    columns = text[0].split()[1:]
    dataframe = pd.DataFrame(data=data,index=index,columns=columns)
    return dataframe

def plot_binding_bar(dataframe):
    '''Plot the bar figure from total MMPBSA data'''
    names = [('Binding Free Energy\nBinding = MM + PB + SA',
             ['Binding','MM','PB','SA']),
             ('Molecular Mechanics\nMM = COU + VDW',
             ['MM','COU','VDW']),
             ('Poisson-Boltzmann\nPB = PBcom - PBpro - PBlig',
             ['PB','PBcom','PBpro','PBlig']),
             ('Surface Area\nSA = SAcom - SApro - SAlig',
             ['SA','SAcom','SApro','SAlig'])]
    fig,axs = plt.subplots(2,2,figsize=(8,8),dpi=72)
    axs = np.ravel(axs)

    for ax,(title,name) in zip(axs,names):
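        # Average over all frames once and reuse the result
        means = dataframe[name].mean()
        # Pass the colors as a list; a string like 'rgby' is rejected by recent Matplotlib
        ax.bar(name,means,width=0.5,
               yerr=dataframe[name].std(),color=['r','g','b','y'])
        for i in range(len(means)):
            # Use .iloc for positional access on a label-indexed Series
            ax.text(name[i],means.iloc[i],
                    '%.3f'%means.iloc[i],
                    ha='center',va='center')
        # The 'b' keyword of grid() was removed in recent Matplotlib; pass it positionally
        ax.grid(True,axis='y')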
        ax.set_xlabel('Energy Decomposition Term')
        ax.set_ylabel('Free energy (kJ/mol)')
        ax.set_title(title)
    plt.suptitle('MMPBSA Results')
    plt.tight_layout()
    plt.subplots_adjust(top=0.9)
    plt.savefig('MMPBSA_Results.png')
    plt.show()



def plot_plot_pie(datas):
    '''Plot the composition curve and pie figure'''
    fig,axs = plt.subplots(2,2,figsize=(8,8),dpi=72)
    axs = np.ravel(axs)

    names = [('Composition of MMPBSA',[0,1,4]),
             ('Composition of MM',[1,2,3]),
             ('Composition of PBSA',[4,5,6])]
    labels = ['res_MMPBSA','resMM','resMM_COU','resMM_VDW',
             'resPBSA','resPBSA_PB','resPBSA_SA']
    colors = ['black','blue','red']
    linestyles = ['-','--',':']
    alphas = [1,0.4,0.4]
    for ax,(title,name) in zip(axs[:-1],names):
        for i in range(len(name)):
            ax.plot(range(datas[name[i]].shape[1]),datas[name[i]].mean(),
                    color=colors[i],alpha=alphas[i],label=labels[name[i]],
                    linestyle=linestyles[i],linewidth=2.5)
        ax.grid(True,axis='y')
        ax.set_xlabel('Residues No.')
        ax.set_ylabel('Free Energy Contribution (kJ/mol)')
        ax.legend(loc='best')
        ax.set_title(title)
    
    explode = np.zeros([datas[0].shape[1]])
    maxposition = np.where(datas[0].mean() == datas[0].mean().abs().max())
    maxposition = np.append(maxposition,np.where(datas[0].mean() == 
                            -1 * datas[0].mean().abs().max()))
    explode[maxposition] = 0.4
    colors = cm.rainbow(np.arange(datas[0].shape[1])/datas[0].shape[1])
    patches, texts, autotexts = axs[-1].pie(datas[0].mean()/datas[0].mean().sum()*100,
                explode=explode,labels=datas[0].columns,autopct='%1.1f%%',
                colors=colors,shadow=True,startangle=90,labeldistance=1.1,
                pctdistance=0.8)
    axs[-1].axis('equal') # Equal aspect ratio ensures that pie is drawn as a circle
    axs[-1].set_title('Composition of MMPBSA')
    # set font size
    proptease = fm.FontProperties()
    proptease.set_size('xx-small')
    # font sizes include: xx-small, x-small, small, medium, large, x-large, xx-large, or a number
    plt.setp(autotexts,fontproperties=proptease)
    plt.setp(texts,fontproperties=proptease)
    
    plt.suptitle('MMPBSA Energy Composition')
    plt.tight_layout()
    plt.subplots_adjust(top=0.9)
    plt.savefig('MMPBSA_Energy_Composition.png')
    plt.show()



if __name__ == '__main__':
    prefix = input('Input the prefix of the calculation results: \n')
    files = ['MMPBSA','res_MMPBSA','resMM','resMM_COU','resMM_VDW',
             'resPBSA','resPBSA_PB','resPBSA_SA']
    datas = []
    for file in files:
        filename = prefix + '~' + file + '.dat'
        datas.append(readindata(filename))
    plot_binding_bar(datas[0])
    plot_plot_pie(datas[1:])
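
As a possible alternative to the hand-written readindata function, the same kind of file can probably be parsed with pandas alone. The sketch below assumes the layout that readindata implies: a whitespace-separated header row, row labels in the first column, and '|' tokens used as visual separators between column groups. The function name read_dat and all numbers in the sample are made up for illustration. Unlike readindata, it drops the separator columns instead of keeping them as NaN.

```python
import io

import pandas as pd


def read_dat(source):
    """Read a whitespace-separated table whose first column holds the row labels."""
    # '|' tokens in the data rows are read as NaN
    frame = pd.read_csv(source, sep=r"\s+", index_col=0, na_values=["|"])
    # Drop the columns that contained only '|' separators
    return frame.dropna(axis=1, how="all")


# Made-up numbers in the assumed layout, for illustration only
sample = io.StringIO(
    "#Frame Binding MM PB SA | COU VDW | PBcom PBpro PBlig\n"
    "1 -10.0 -30.0 25.0 -5.0 | -12.0 -18.0 | 100.0 60.0 15.0\n"
    "2 -12.0 -32.0 26.0 -6.0 | -13.0 -19.0 | 102.0 61.0 15.0\n"
)
table = read_dat(sample)
print(table[["Binding", "MM", "PB", "SA"]].mean())
```

With a real result file, a path such as prefix + '~MMPBSA.dat' would be passed in place of the in-memory sample.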

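Running the script summarizes the MMPBSA results as bar charts of the free energy terms, plus line plots and a pie chart of the per-residue free energy contributions. When there are too many data points, the pie chart labels pile up on top of each other; I have not yet found a good way to handle this. The figures are shown below: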

[Figure: Bar]
[Figure: Plots and Pie]
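
One possible way to reduce the label crowding in the pie chart is to keep only the residues with the largest contributions and merge the rest into a single slice. This is only a sketch of that idea, not something the original script does: the function name group_small_slices, the keep parameter and the sample values are all hypothetical. It works on absolute values, so the slices show shares of magnitude and the sign of each contribution is lost; this also sidesteps the fact that Matplotlib's pie does not accept negative wedge sizes.

```python
import pandas as pd


def group_small_slices(values, keep=8, other_label="Others"):
    """Keep the `keep` largest-magnitude entries and sum the rest into one entry."""
    magnitudes = values.abs().sort_values(ascending=False)
    top = magnitudes.iloc[:keep]
    rest = magnitudes.iloc[keep:]
    if rest.empty:
        return top
    return pd.concat([top, pd.Series({other_label: rest.sum()})])


# Made-up per-residue contributions (kJ/mol), for illustration only
contributions = pd.Series(
    {"ALA1": -0.2, "ARG2": -8.0, "GLU3": 5.0, "GLY4": 0.1, "LYS5": -3.0}
)
grouped = group_small_slices(contributions, keep=3)
print(grouped)
```

The grouped Series could then be passed to the pie call in place of the full per-residue means, with grouped.index as the labels.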

References

gmx_mmpbsa usage instructions

About matplotlib: the pie charts you want are here
