Citation: HOU Xue-xin, DU Peng-cheng, LI Zhen-jun. Study of spacer sequences in clustered regularly interspaced short palindromic repeats originated from bacteriophages[J]. Disease Surveillance (疾病监测), 2015, 30(5): 358-360. DOI: 10.3784/j.issn.1003-9961.2015.05.005

Study of spacer sequences in clustered regularly interspaced short palindromic repeats originated from bacteriophages

Abstract: Objective To describe how spacer sequences are distributed across clustered regularly interspaced short palindromic repeats (CRISPR) systems in prokaryotes with completely sequenced genomes, and to identify spacers that originated from bacteriophages. Methods CRISPR systems and their spacer sequences were collected for 2762 prokaryotic genomes in an existing CRISPR database, and 1444 published bacteriophage genomes were collected from GenBank. The spacer sequences were compared with the bacteriophage genomes for similarity using BLASTN, and count data were compared with the χ² test. Results Of the 2762 genomes, 1940 carried confirmed or possible CRISPR structures, containing 90 096 spacer sequences in total. Most genomes had 1-50 spacers (1414/1940, 72.9%); only 58 genomes had 250 or more spacers (58/1940, 3.0%). Of these 58, 13 were archaebacteria (13/150, 8.6%) and 45 were true bacteria (45/2612, 1.7%), a statistically significant difference (χ²=29.98, P<0.01). In the similarity comparison, 1055 spacer sequences from 245 bacterial genomes matched 363 bacteriophages, a match rate of only 0.12%. Conclusion The number of spacers in CRISPR systems varies widely among prokaryotic genomes, and archaebacterial genomes carry more spacers than genomes of true bacteria. The proportion of spacers traced to bacteriophages was low, which is likely related to the limited number of bacterial and bacteriophage genomes available so far; further research could raise the match rate substantially.
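The abstract reports the archaebacteria versus true bacteria comparison as a chi-square statistic but does not say which form of the test was used. The following is a minimal sketch, assuming a 2×2 table built from the reported counts (13 of 150 archaebacterial genomes and 45 of 2612 true-bacterial genomes in the high-spacer group) and assuming the Yates continuity correction. The function name `yates_chi_square` is illustrative and does not come from the paper.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square statistic with Yates continuity correction
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # |ad - bc| is reduced by n/2 (never below zero) before squaring.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * diff ** 2 / (row1 * row2 * col1 * col2)


# Counts taken from the abstract; the table layout is an assumption.
archaea_high, archaea_total = 13, 150
bacteria_high, bacteria_total = 45, 2612

chi2 = yates_chi_square(
    archaea_high, archaea_total - archaea_high,
    bacteria_high, bacteria_total - bacteria_high,
)
print(f"chi-square (Yates corrected) = {chi2:.2f}")  # prints 29.98
```

Under these assumptions the corrected statistic comes out at 29.98, which agrees with the value given in the abstract. With one degree of freedom, that is far beyond the 6.63 critical value for P = 0.01.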

     
