File download
Download misa.pl and put misa.ini in the same folder as it. Then download the three Perl scripts get_est_trimmer.pl, p3_in.pl, and p3_out.pl, preferably into that same folder.
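A quick way to confirm the layout before running anything is sketched below. `check_misa_files` is a hypothetical helper name, and the list of file names is taken from the text above (the EST trimmer is left out because it is not used in this walkthrough).

```shell
# Hypothetical helper: report which of the expected files are absent
# from a directory (defaults to the current directory).
check_misa_files() {
    dir="${1:-.}"
    missing=0
    for f in misa.pl misa.ini p3_in.pl p3_out.pl; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    # Return 0 when everything is present, 1 otherwise.
    return "$missing"
}
```

Calling `check_misa_files .` in the download folder prints nothing when all four files are there.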
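Introduction
misa.ini: the configuration file.
p3_in.pl: takes the output of misa.pl (file.fasta.misa) as input and writes the primer design parameters (template, product size, target region, etc.) to a file with the suffix "p3in".
get_est_trimmer.pl: intended for EST sequences; it removes short sequences and ambiguous bases at both ends of the ESTs.
p3_out.pl: extracts and merges the files produced by primer3 into the final result file, filename.result.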
- Chromosome 1 of maize is used as the example here; the maize genome FASTA files can be downloaded from the Ensembl website
perl misa.pl Zea_mays.AGPv4.dna.chromosome.1.fa
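Description of the generated files
Zea_mays.AGPv4.dna.chromosome.1.fa.misa: a table listing the type and position of each microsatellite;
Zea_mays.AGPv4.dna.chromosome.1.fa.statistics: summary statistics of the microsatellite types and their frequencies.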
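Running p3_in.pl directly on this result would produce a very large file, because each SSR record would carry its whole source sequence (here, the entire chromosome) as the template, so a few extra steps are added below.
# Extract the chromosome ID and the start and end positions from the misa file, extend each side by 150 bp, and write a BED file.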
cat Zea_mays.AGPv4.dna.chromosome.1.fa.misa |awk 'NR>1 {print $1"\t"$6-150"\t"$7+150}' >Zea_mays.AGPv4.dna.chromosome.1_ssr.bed
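For an SSR that lies within 150 bp of the start of a sequence, `$6-150` becomes negative, which is not a valid BED coordinate. The variant below is a sketch that clamps the start at 0. `misa_to_bed` is a hypothetical helper name, and the column positions (start in column 6, end in column 7) simply follow the command above.

```shell
# Hypothetical helper: read a MISA table on stdin and write BED lines
# (ID, start - flank, end + flank), never letting the start drop below 0.
misa_to_bed() {
    flank="${1:-150}"
    awk -v flank="$flank" '
        BEGIN { OFS = "\t" }
        NR > 1 {                          # skip the header line
            start = $6 - flank
            if (start < 0) start = 0      # clamp at the sequence start
            print $1, start, $7 + flank
        }'
}
```

It can be used in place of the command above, for example `misa_to_bed 150 < Zea_mays.AGPv4.dna.chromosome.1.fa.misa > Zea_mays.AGPv4.dna.chromosome.1_ssr.bed`.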
# Use bedtools to extract the repeat sequences together with their flanks
bedtools getfasta -fi Zea_mays.AGPv4.dna.chromosome.1.fa -bed Zea_mays.AGPv4.dna.chromosome.1_ssr.bed -fo Zea_mays.AGPv4.dna.chromosome.1_ssr.fa
Run misa once more, this time on the extracted sequences
perl misa.pl Zea_mays.AGPv4.dna.chromosome.1_ssr.fa
Zea_mays.AGPv4.dna.chromosome.1.fa.misa
Zea_mays.AGPv4.dna.chromosome.1_ssr.fa.misa
Comparing these two results shows what we have done. Note that there is more than one way to reach the same result.
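One visible difference is that positions in the second file are relative to each extracted fragment rather than to the chromosome. The sketch below maps them back. It assumes bedtools named each fragment `chrom:start-end` (its default header style) and that chromosome names contain neither `:` nor `-`; `misa_to_chrom_coords` is a hypothetical helper name.

```shell
# Hypothetical helper: read the second-pass MISA table on stdin and print
# chromosome, SSR start, SSR end in chromosome coordinates.
misa_to_chrom_coords() {
    awk '
        BEGIN { OFS = "\t" }
        NR > 1 {                           # skip the header line
            split($1, p, /[:-]/)           # p[1] = chrom, p[2] = BED start
            print p[1], p[2] + $6, p[2] + $7
        }'
}
```

For a fragment named `1:350-659` with an SSR at local positions 150 to 161, this prints chromosome positions 500 to 511.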
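- Next, modify p3_in.pl so that the file it generates can be run directly by primer3. Use the example file in the primer3 directory as a reference and make the output of p3_in.pl match its format. For the version I am using, the change is: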
print OUT "PRIMER_SEQUENCE_ID=$id"."_$ssr_nr\nSEQUENCE=$seq\n";
change to
print OUT "SEQUENCE_ID=$id"."_$ssr_nr\nSEQUENCE_TEMPLATE=$seq\n";
Run p3_in.pl
perl p3_in.pl Zea_mays.AGPv4.dna.chromosome.1_ssr.fa.misa
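As an alternative to editing the script, the tag names can be rewritten in the generated .p3in file. The sketch below is not part of MISA; `upgrade_p3_tags` is a hypothetical helper name. Besides the two tags changed above, it also renames `TARGET` to `SEQUENCE_TARGET` (the primer3 2.x name), on the assumption that your copy of p3_in.pl writes a `TARGET=` line.

```shell
# Hypothetical helper: rename primer3 1.x input tags to their 2.x names.
# Reads a .p3in file on stdin; lines that already use 2.x names pass
# through unchanged, so running it twice is harmless.
upgrade_p3_tags() {
    sed -e 's/^PRIMER_SEQUENCE_ID=/SEQUENCE_ID=/' \
        -e 's/^SEQUENCE=/SEQUENCE_TEMPLATE=/' \
        -e 's/^TARGET=/SEQUENCE_TARGET=/'
}
```

For example, `upgrade_p3_tags < old.p3in > new.p3in`, where both file names are placeholders.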
# Then design the primers with primer3
~/software/primer3-2.4.0/src/primer3_core --default_version=1 --output=Zea_mays.AGPv4.dna.chromosome.1_ssr.fa.p3out Zea_mays.AGPv4.dna.chromosome.1_ssr.fa.p3in
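Before post-processing, it can help to check how many records actually received primers. This is a minimal sketch, assuming the output uses the primer3 2.x tags `SEQUENCE_ID` and `PRIMER_PAIR_NUM_RETURNED`; `p3_summary` is a hypothetical helper name.

```shell
# Hypothetical helper: summarise a primer3 output file read from stdin.
p3_summary() {
    awk -F= '
        $1 == "SEQUENCE_ID" { total++ }                          # one per record
        $1 == "PRIMER_PAIR_NUM_RETURNED" && $2 > 0 { designed++ }
        END { printf "%d records, %d with primers\n", total, designed }
    '
}
```

Usage: `p3_summary < Zea_mays.AGPv4.dna.chromosome.1_ssr.fa.p3out`.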
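- Use p3_out.pl to convert the designed primers into a human-readable format. This script also has to be modified first, in the same way as above, so that it matches the file primer3 generates.
# First change the numbering in the output header: primer3 numbers its primer pairs from 0 and may return several pairs, so more output columns are needed. Change the following lines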
print OUT "ID\tSSR nr.\tSSR type\tSSR\tsize\tstart\tend\t";
print OUT "FORWARD PRIMER1 (5'-3')\tTm(°C)\tsize\tREVERSE PRIMER1 (5'-3')\tTm(°C)\tsize\tPRODUCT1 size (bp)\tstart (bp)\tend (bp)\t";
print OUT "FORWARD PRIMER2 (5'-3')\tTm(°C)\tsize\tREVERSE PRIMER2 (5'-3')\tTm(°C)\tsize\tPRODUCT2 size (bp)\tstart (bp)\tend (bp)\t";
print OUT "FORWARD PRIMER3 (5'-3')\tTm(°C)\tsize\tREVERSE PRIMER3 (5'-3')\tTm(°C)\tsize\tPRODUCT3 size (bp)\tstart (bp)\tend (bp)\n";
change to
print OUT "ID\tSSR nr.\tSSR type\tSSR\tsize\tstart\tend\t";
print OUT "FORWARD PRIMER0 (5'-3')\tTm(°C)\tsize\tREVERSE PRIMER0 (5'-3')\tTm(°C)\tsize\tPRODUCT0 size (bp)\tstart (bp)\tend (bp)\t";
print OUT "FORWARD PRIMER1 (5'-3')\tTm(°C)\tsize\tREVERSE PRIMER1 (5'-3')\tTm(°C)\tsize\tPRODUCT1 size (bp)\tstart (bp)\tend (bp)\t";
print OUT "FORWARD PRIMER2 (5'-3')\tTm(°C)\tsize\tREVERSE PRIMER2 (5'-3')\tTm(°C)\tsize\tPRODUCT2 size (bp)\tstart (bp)\tend (bp)\t";
print OUT "FORWARD PRIMER3 (5'-3')\tTm(°C)\tsize\tREVERSE PRIMER3 (5'-3')\tTm(°C)\tsize\tPRODUCT3 size (bp)\tstart (bp)\tend (bp)\t";
print OUT "FORWARD PRIMER4 (5'-3')\tTm(°C)\tsize\tREVERSE PRIMER4 (5'-3')\tTm(°C)\tsize\tPRODUCT4 size (bp)\tstart (bp)\tend (bp)\n";
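- Then modify the code that extracts the values: add 0 to the tags in the first few lines that have no number, and append further repeats of the block at the end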
/PRIMER_LEFT_SEQUENCE=(.*)/ || do {$count_failed++;print OUT "$misa\n"; next}; my $info = "$1\t";
/PRIMER_LEFT_TM=(.*)/; $info .= "$1\t";
/PRIMER_LEFT=\d+,(\d+)/; $info .= "$1\t";
/PRIMER_RIGHT_SEQUENCE=(.*)/; $info .= "$1\t";
/PRIMER_RIGHT_TM=(.*)/; $info .= "$1\t";
/PRIMER_RIGHT=\d+,(\d+)/; $info .= "$1\t";
/PRIMER_PRODUCT_SIZE=(.*)/; $info .= "$1\t";
/PRIMER_LEFT=(\d+),\d+/; $info .= "$1\t";
/PRIMER_RIGHT=(\d+),\d+/; $info .= "$1\t";
/PRIMER_LEFT_1_SEQUENCE=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_1_TM=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_1=\d+,(\d+)/; $info .= "$1\t";
/PRIMER_RIGHT_1_SEQUENCE=(.*)/; $info .= "$1\t";
/PRIMER_RIGHT_1_TM=(.*)/; $info .= "$1\t";
/PRIMER_RIGHT_1=\d+,(\d+)/; $info .= "$1\t";
/PRIMER_PRODUCT_SIZE_1=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_1=(\d+),\d+/; $info .= "$1\t";
/PRIMER_RIGHT_1=(\d+),\d+/; $info .= "$1\t";
/PRIMER_LEFT_2_SEQUENCE=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_2_TM=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_2=\d+,(\d+)/; $info .= "$1\t";
/PRIMER_RIGHT_2_SEQUENCE=(.*)/; $info .= "$1\t";
/PRIMER_RIGHT_2_TM=(.*)/; $info .= "$1\t";
/PRIMER_RIGHT_2=\d+,(\d+)/; $info .= "$1\t";
/PRIMER_PRODUCT_SIZE_2=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_2=(\d+),\d+/; $info .= "$1\t";
/PRIMER_RIGHT_2=(\d+),\d+/; $info .= "$1";
change to
# primer3 2.x writes the product size as PRIMER_PAIR_<n>_PRODUCT_SIZE
/PRIMER_LEFT_0_SEQUENCE=(.*)/ || do {$count_failed++;print OUT "$misa\n"; next}; my $info = "$1\t";
/PRIMER_LEFT_0_TM=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_0=\d+,(\d+)/; $info .= "$1\t";
/PRIMER_RIGHT_0_SEQUENCE=(.*)/; $info .= "$1\t";
/PRIMER_RIGHT_0_TM=(.*)/; $info .= "$1\t";
/PRIMER_RIGHT_0=\d+,(\d+)/; $info .= "$1\t";
/PRIMER_PAIR_0_PRODUCT_SIZE=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_0=(\d+),\d+/; $info .= "$1\t";
/PRIMER_RIGHT_0=(\d+),\d+/; $info .= "$1\t";
/PRIMER_LEFT_1_SEQUENCE=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_1_TM=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_1=\d+,(\d+)/; $info .= "$1\t";
/PRIMER_RIGHT_1_SEQUENCE=(.*)/; $info .= "$1\t";
/PRIMER_RIGHT_1_TM=(.*)/; $info .= "$1\t";
/PRIMER_RIGHT_1=\d+,(\d+)/; $info .= "$1\t";
/PRIMER_PAIR_1_PRODUCT_SIZE=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_1=(\d+),\d+/; $info .= "$1\t";
/PRIMER_RIGHT_1=(\d+),\d+/; $info .= "$1\t";
/PRIMER_LEFT_2_SEQUENCE=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_2_TM=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_2=\d+,(\d+)/; $info .= "$1\t";
/PRIMER_RIGHT_2_SEQUENCE=(.*)/; $info .= "$1\t";
/PRIMER_RIGHT_2_TM=(.*)/; $info .= "$1\t";
/PRIMER_RIGHT_2=\d+,(\d+)/; $info .= "$1\t";
/PRIMER_PAIR_2_PRODUCT_SIZE=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_2=(\d+),\d+/; $info .= "$1\t";
/PRIMER_RIGHT_2=(\d+),\d+/; $info .= "$1\t";
/PRIMER_LEFT_3_SEQUENCE=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_3_TM=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_3=\d+,(\d+)/; $info .= "$1\t";
/PRIMER_RIGHT_3_SEQUENCE=(.*)/; $info .= "$1\t";
/PRIMER_RIGHT_3_TM=(.*)/; $info .= "$1\t";
/PRIMER_RIGHT_3=\d+,(\d+)/; $info .= "$1\t";
/PRIMER_PAIR_3_PRODUCT_SIZE=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_3=(\d+),\d+/; $info .= "$1\t";
/PRIMER_RIGHT_3=(\d+),\d+/; $info .= "$1\t";
/PRIMER_LEFT_4_SEQUENCE=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_4_TM=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_4=\d+,(\d+)/; $info .= "$1\t";
/PRIMER_RIGHT_4_SEQUENCE=(.*)/; $info .= "$1\t";
/PRIMER_RIGHT_4_TM=(.*)/; $info .= "$1\t";
/PRIMER_RIGHT_4=\d+,(\d+)/; $info .= "$1\t";
/PRIMER_PAIR_4_PRODUCT_SIZE=(.*)/; $info .= "$1\t";
/PRIMER_LEFT_4=(\d+),\d+/; $info .= "$1\t";
/PRIMER_RIGHT_4=(\d+),\d+/; $info .= "$1";
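The repeated pattern lines above can also be expressed as a loop over the pair index. The following is a sketch that works outside p3_out.pl and is not the script's own method: it reads primer3 2.x output and writes one tab-separated line per record, leaving fields empty when fewer pairs were returned. `p3_pairs_to_tsv` is a hypothetical helper name, and the column order mirrors the extraction block above.

```shell
# Hypothetical helper: read primer3 2.x output on stdin and print, per record,
# the sequence ID followed by nine fields for each primer pair 0..npairs-1.
p3_pairs_to_tsv() {
    awk -v npairs="${1:-5}" '
    BEGIN { OFS = "\t" }
    # A line holding only "=" closes the current record.
    $0 == "=" {
        line = rec["SEQUENCE_ID"]
        for (i = 0; i < npairs; i++) {
            split(rec["PRIMER_LEFT_" i], left, ",")     # value is "start,length"
            split(rec["PRIMER_RIGHT_" i], right, ",")
            line = line OFS rec["PRIMER_LEFT_" i "_SEQUENCE"] OFS rec["PRIMER_LEFT_" i "_TM"] OFS left[2]
            line = line OFS rec["PRIMER_RIGHT_" i "_SEQUENCE"] OFS rec["PRIMER_RIGHT_" i "_TM"] OFS right[2]
            line = line OFS rec["PRIMER_PAIR_" i "_PRODUCT_SIZE"] OFS left[1] OFS right[1]
        }
        print line
        split("", rec)      # reset the tag table for the next record
        next
    }
    # Store every TAG=value line under its tag name.
    {
        eq = index($0, "=")
        if (eq > 0) rec[substr($0, 1, eq - 1)] = substr($0, eq + 1)
    }
    '
}
```

Usage: `p3_pairs_to_tsv 5 < Zea_mays.AGPv4.dna.chromosome.1_ssr.fa.p3out`. Unlike the pattern block, a missing pair yields empty fields instead of reusing the previous match.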
- Finally, run the p3_out.pl script
perl p3_out.pl Zea_mays.AGPv4.dna.chromosome.1_ssr.fa.p3out Zea_mays.AGPv4.dna.chromosome.1_ssr.fa.misa
Final result