CHEN Hai-xia, JIA Jun-nan, LI Wei-min, GAO Ji-min. Characteristics of single nucleotide polymorphism of Mycobacterium tuberculosis[J]. Disease Surveillance, 2017, 32(4): 332-336. DOI: 10.3784/j.issn.1003-9961.2017.04.018

Characteristics of single nucleotide polymorphism of Mycobacterium tuberculosis

Abstract: Objective To characterize single nucleotide polymorphisms (SNPs) in whole-genome sequences of Mycobacterium tuberculosis (MTB) and thereby provide a reference for the prevention, control, and treatment of tuberculosis (TB). Methods Whole-genome sequences of 2,372 MTB strains collected worldwide were downloaded from the National Center for Biotechnology Information (NCBI) and the European Nucleotide Archive (ENA), and the raw data were quality-filtered to remove redundancy. Reads were mapped to the H37Rv reference genome with BWA v0.7.12; SNPs were called with SAMtools v1.3, Picard v1.112, and VarScan, and non-unique SNP sites were filtered out using mappability values. A maximum likelihood phylogenetic tree was constructed with RAxML v8.2.8, the genetic differentiation coefficient (Fst) of each SNP site was calculated with Genepop v4.5.1, and each SNP site was annotated with SnpEff v4.3c. Results Initial SNP calling yielded 107,654 SNP sites, and the phylogenetic tree built from these sites clearly divided 2,347 MTB strains into 7 lineages and 69 sublineages. Optimization yielded 285 lineage-defining SNP sites, and the phylogenetic tree based on these 285 sites divided the 2,347 strains into 7 lineages and 67 sublineages. Conclusion Genome sequence analysis identified a set of phylogeny-based SNP sites. The 285 SNP sites can be used for phylogenetic and evolutionary analyses and can also serve as genotyping markers in molecular epidemiological studies of tuberculosis.
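
To make the per-site Fst step more concrete, the following is a minimal sketch of Wright's fixation index, Fst = (H_T - H_S) / H_T, for a single biallelic SNP scored across strains grouped by lineage. It is only an illustration under simplifying assumptions: it is not Genepop's estimator, and the function name `site_fst`, the 0/1 allele coding, and the lineage labels are hypothetical choices made for this example rather than details from the study.

```python
from collections import defaultdict


def site_fst(alleles, lineages):
    """Wright's Fst = (H_T - H_S) / H_T for one biallelic SNP site.

    alleles  : list of 0 (reference) or 1 (alternate), one entry per strain
    lineages : list of lineage labels, same length as alleles
    Illustrative only; this is not the estimator implemented in Genepop.
    """
    # lineage -> [number of strains, number of alternate alleles]
    counts = defaultdict(lambda: [0, 0])
    for allele, lineage in zip(alleles, lineages):
        counts[lineage][0] += 1
        counts[lineage][1] += allele

    n_total = sum(n for n, _ in counts.values())
    p_total = sum(alt for _, alt in counts.values()) / n_total

    # Expected heterozygosity in the pooled sample
    h_t = 2 * p_total * (1 - p_total)
    if h_t == 0:
        return 0.0  # monomorphic site: no differentiation to measure

    # Size-weighted mean of within-lineage expected heterozygosity
    h_s = sum(
        (n / n_total) * 2 * (alt / n) * (1 - alt / n)
        for n, alt in counts.values()
    )
    return (h_t - h_s) / h_t


# Hypothetical site fixed in one lineage and absent from the others
alleles = [1, 1, 1, 0, 0, 0, 0, 0, 0]
lineages = ["L2", "L2", "L2", "L4", "L4", "L4", "L1", "L1", "L1"]
print(site_fst(alleles, lineages))
```

Under this definition, a site that is fixed in one lineage and absent from all others has no within-lineage diversity and scores the maximum value, which is the intuition behind using Fst to pick out lineage-defining SNP sites.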

     
