Identification of resistance gene analogues (RGAs)

2021-02-21 生信苑

RGAs are genes associated with plant disease resistance. They mainly comprise genes that contain the following domains or motifs:

nucleotide binding site (NB-ARC)

leucine rich repeat (LRR)

transmembrane (TM)

serine/threonine and tyrosine kinase (STTK)

lysin motif (LysM)

coiled-coil (CC)

Toll/Interleukin-1 receptor (TIR)

Among these, PRGdb collects RGAs from plants (http://www.prgdb.org/prgdb/). Here we mainly introduce a pipeline, RGAugury, for predicting plant disease resistance genes at the whole-genome level.

1. Pipeline installation

Download link for the pipeline:

https://bitbucket.org/yaanlpc/rgaugury/src/master/

2. Software requirements

On my server I built a conda environment, which makes the later steps much more convenient.

BLAST+ package: download the file ending with the "x64-linux.tar.gz" extension.

Hmmer3: install HMMER before the pfam_scan package.

pfam_scan package: make sure pfam_scan.pl can be run from anywhere without a path prefix. The upstream README links to a guide for easier dependency installation.

phobius1.01 package: this is a 32-bit program, so a 64-bit Linux operating system must have the 32-bit runtime (libstdc++6:i386) installed to load it. The upstream README links to a help thread on this.

ncoils package: a copy is embedded in RGAugury with a minor source-code modification that adapts it to the pipeline, so please do not use the original one.

git: optional, for cloning the repository directly. Cloning with git is strongly recommended because it preserves the files' permissions.

jdk: JDK 1.8 is required when using InterProScan versions above v57.

interproscan: an HMM-based domain/motif identification package.

CViT: a Perl-based package for visualizing genomic linkage features. Make sure all required Perl modules are installed and that CViT runs without errors independently of RGAugury.
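As a quick sanity check on the requirements above, the sketch below reports which executables cannot yet be found on PATH. The helper name check_tools is hypothetical and not part of RGAugury, and the tool names in the example call are assumptions about how the programs are installed on your machine; adjust them as needed.

```shell
#!/bin/sh
# check_tools: print every named executable that cannot be found on PATH.
# Returns 0 when all are found, 1 otherwise.
# (Hypothetical helper for illustration, not part of RGAugury.)
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call; the binary names below are assumptions about your installation.
# check_tools blastp hmmscan pfam_scan.pl phobius.pl interproscan.sh cvit.pl java git
```

Running the check again after each installation step shows at a glance what is still missing.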

3. Downloading and building libraries

Before installing the GD Perl modules, you may first need to install the system libraries they depend on.

4. Downloading and installing modules

RGAugury dependencies

CViT needs the modules below:

Config::IniFiles

GD::SVG

GD::Arrow

GD::Text

pfam_scan.pl needs the modules below:

Moose: an essential module for the pfam_scan package; see the pfam_scan README for installation instructions, or use the command "cpan install Moose".

bioperl: install the BioPerl core via CPAN or from its official website.

Check all the software and programs installed above and make sure their ownership and file permissions are set correctly.

This step decides whether the installation succeeds, so follow the procedure above strictly.

5. Setting paths and environment variables

Below is an example of how I set up my environment variables from scratch on a clean Ubuntu 14.04/16.04 LTS; change the paths to match your own system.

export PATH=$PATH:/home/lipch/bin/phobius1.01 # to specify the path of phobius.pl script and binary.

export PATH=$PATH:/home/lipch/bin/hmmer3/bin # binary path

export PATH=$PATH:/home/lipch/bin/blast/bin # binary path of blast+ package

export PATH=$PATH:/home/lipch/RGAugury_pipeline # this package scripts path

export PATH=$PATH:/home/lipch/RGAugury_pipeline/coils #the path to scoils-ht, which is a modified version of coils to adapt to RGAugury pipeline.

export PATH=$PATH:/home/lipch/database/interproscan-x.xx-xx.0 #download latest one as your wish. Do not add the path of "bin" under interproscan directory.

export PATH=$PATH:/home/lipch/Downloads/PfamScan #to specify the path for script of pfam_scan.pl

export PATH=$PATH:/home/lipch/bin/cvit.1.2.1 #to specify the path of cvit.pl in CViT package, make sure cvit.pl can be found by 'which' command.

export COILSDIR=/home/lipch/RGAugury_pipeline/coils # alternatively, put this single command in a plain file under /etc/profile.d/ (file permission 755) and point it to a directory all users can access; otherwise export it in each user's profile and point it to a directory that user is authorized to access

export PERL5LIB=/home/lipch/Downloads/PfamScan:$PERL5LIB #perl module for pfam_scan.pl

export PFAMDB=/home/lipch/database/pfamdb # to specify the path of the Pfam-A/B HMM database
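Once the exports are in place, a small check can confirm that the directory variables point somewhere real. This is only a sketch: check_dir_var is a hypothetical helper name, and it tests nothing beyond whether the variable is set and the directory exists.

```shell
#!/bin/sh
# check_dir_var NAME: succeed only if the environment variable NAME is set
# and points to an existing directory.
# (Hypothetical helper for illustration, not part of RGAugury.)
check_dir_var() {
    name=$1
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "$name is not set"
        return 1
    fi
    if [ ! -d "$value" ]; then
        echo "$name points to a missing directory: $value"
        return 1
    fi
    echo "$name ok: $value"
}

# Example: the two directory variables exported above.
# check_dir_var COILSDIR
# check_dir_var PFAMDB
```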

interproscan.properties configuration

Because of the parallelization changes made in Tools.pm, the number of InterProScan workers must be set to 1 to avoid exhausting RAM. Note that the pipeline is optimized only for an ordinary multi-threaded workstation; if you want to take advantage of a grid, refer to the corresponding InterProScan manual.

number.of.embedded.workers=1
maxnumber.of.embedded.workers=1
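If you prefer not to edit the file by hand, the two settings can be rewritten with sed. The sketch below assumes both keys already exist in your interproscan.properties, each on its own line; set_embedded_workers is a hypothetical helper name, and the path in the example is a placeholder for wherever your installation keeps the file.

```shell
#!/bin/sh
# set_embedded_workers FILE: rewrite the two worker settings in an
# interproscan.properties file to 1, leaving every other line untouched.
# (Hypothetical helper for illustration, not part of InterProScan or RGAugury.)
set_embedded_workers() {
    props=$1
    sed -e 's/^number\.of\.embedded\.workers=.*/number.of.embedded.workers=1/' \
        -e 's/^maxnumber\.of\.embedded\.workers=.*/maxnumber.of.embedded.workers=1/' \
        "$props" > "$props.tmp" && mv "$props.tmp" "$props"
}

# Example (placeholder path; keep a backup copy first):
# cp /path/to/interproscan.properties /path/to/interproscan.properties.bak
# set_embedded_workers /path/to/interproscan.properties
```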

6. Installing the RGAugury pipeline

If git is installed, download the pipeline with the command below on a Linux system.

git clone https://bitbucket.org/yaanlpc/rgaugury.git

Before running the pipeline, make sure all Perl scripts have permission 755. In the RGAugury directory, run:

chmod 755 *.pl

In the coils directory, run:

chmod 755 scoils-ht

7. Downloading databases

pfam: follow the installation guide of the pfam_scan package ("Download Pfam data files" section) to prepare the binary files from 'Pfam-A.hmm'. Put all files under /home/user_ID_to_be_replaced_by_yours/database/pfam/, because this path is hard-coded in the scripts. Alternatively, make sure the pfam folder is consistent with the $pfam_index_folder setting in RGAugury.pl.

RGADB: embedded in this package. Keep its location unchanged.

panther: if the panther database will be used from either the command line or the web UI, install it according to the InterProScan instructions; the InterProScan configuration file might also need to be modified accordingly.
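For the pfam entry, "binary files" refers to the index that HMMER's hmmpress builds next to Pfam-A.hmm. The sketch below only checks that the HMM file and its four hmmpress index files are present and non-empty; pfam_db_ready is a hypothetical helper name, and the check does not cover any other data files that pfam_scan may require.

```shell
#!/bin/sh
# pfam_db_ready DIR: check that DIR holds Pfam-A.hmm plus the four index
# files that hmmpress writes next to it (.h3f, .h3i, .h3m, .h3p).
# (Hypothetical helper for illustration, not part of RGAugury.)
pfam_db_ready() {
    dir=$1
    status=0
    for f in Pfam-A.hmm Pfam-A.hmm.h3f Pfam-A.hmm.h3i Pfam-A.hmm.h3m Pfam-A.hmm.h3p; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f"
            status=1
        fi
    done
    return "$status"
}

# If the index files are missing, they can be built with:
#   hmmpress "$PFAMDB/Pfam-A.hmm"
# Example check:
# pfam_db_ready "$PFAMDB"
```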

8. Using the pipeline

The main script, RGAugury.pl, has six options, but only the input file is mandatory on the command line. Make sure each sequence title in the FASTA file consists only of a gene ID without spaces. Add the RGAugury directory to the PATH environment variable.

Scripts: Resistance Gene Analogs (RGAs) prediction pipeline

Programmed by Pingchuan Li @ AAFC - Dr. Frank You Lab

Usage :perl RGAugury.pl <options>

arguments:

-p protein fasta file
-n corresponding cDNA/CDS nucleotide for -p (optional)
-g genome file in fasta format (optional)
-gff a modified gff3-like file, see below format (optional)
-c cpu or threads number, default = 2
-pfx prefix for filename, useful for multiple speices input in same folder (optional)
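Because sequence titles must contain only a space-free gene ID, it can help to trim the FASTA headers before running the pipeline. The sketch below cuts each header at the first whitespace with awk; clean_fasta_ids is a hypothetical helper name, the file names are placeholders, and it assumes the gene ID directly follows the ">" character.

```shell
#!/bin/sh
# clean_fasta_ids IN OUT: cut each FASTA header at the first whitespace so
# that only the gene ID remains; sequence lines are copied unchanged.
# (Hypothetical helper for illustration, not part of RGAugury.)
clean_fasta_ids() {
    awk '/^>/ { print $1; next } { print }' "$1" > "$2"
}

# Example run using only options from the usage text above
# (file names and the prefix are placeholders):
# clean_fasta_ids proteins.raw.fa proteins.fa
# perl RGAugury.pl -p proteins.fa -c 8 -pfx sample1
```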

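9. Identifying RGAs in Brassicaceae plants

1. RGAs identified from the Brassica reference genomes

2. Clustering analysis of the identified RGA genes, which fall into 4 clades

3. Distribution of the genes carrying the relevant domains

The rest is essentially standard bioinformatics analysis; if you are interested, see the original papers in the references.

References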

Li P, Quan X, Jia G, Xiao J, Cloutier S, You FM. RGAugury: a pipeline for genome-wide prediction of resistance gene analogs (RGAs) in plants. BMC Genomics. 2016;17(1):852. Published 2016 Nov 2. doi:10.1186/s12864-016-3197-x

Osuna-Cruz CM, Paytuvi-Gallart A, Di Donato A, Sundesha V, Andolfo G, Aiese Cigliano R, Sanseverino W, Ercolano MR. PRGdb 3.0: a comprehensive platform for prediction and analysis of plant disease resistance genes. Nucleic Acids Research 2017. doi: 10.1093/nar/gkx1119

Dolatabadian, A., Bayer, P.E., Tirnaz, S., Hurgobin, B., Edwards, D. and Batley, J. (2020), Characterization of disease resistance genes in the Brassica napus pangenome reveals significant structural variation. Plant Biotechnol J, 18: 969-982. doi:10.1111/pbi.13262

Soodeh Tirnaz, Philipp Bayer, Fabian Inturrisi, Fangning Zhang, Hua Yang, Aria Dolatabadian, Ting X Neik, Anita Severn-Ellis, Dhwani Patel, Muhammad I Ibrahim, Aneeta Pradhan, David Edwards, Jacqueline Batley. Plant Physiology Aug 2020, pp.00835.2020; DOI: 10.1104/pp.20.00835
