BACKGROUND: Treponema pallidum ssp. pallidum (TPA), the causative agent of syphilis, is a highly clonal bacterium showing minimal genetic variability in the genome sequence of individual strains. Nevertheless, genetically characterized syphilis strains can be clearly divided into two groups, Nichols-like strains and SS14-like strains. TPA Nichols and SS14 strains were completely sequenced in 1998 and 2008, respectively. Since publication of their complete genome sequences, a number of sequencing errors in each genome have been reported. Therefore, we have resequenced TPA Nichols and SS14 strains using next-generation sequencing techniques.
METHODOLOGY/PRINCIPAL FINDINGS: The genomes of TPA strains Nichols and SS14 were resequenced using the 454 and Illumina sequencing methods that have a combined average coverage higher than 90x. In the TPA strain Nichols genome, 134 errors were identified (25 substitutions and 109 indels), and 102 of them affected protein sequences. In the TPA SS14 genome, a total of 191 errors were identified (85 substitutions and 106 indels) and 136 of them affected protein sequences. A set of new intrastrain heterogenic regions in the TPA SS14 genome were identified including the tprD gene, where both tprD and tprD2 alleles were found. The resequenced genomes of both TPA Nichols and SS14 strains clustered more closely with related strains (i.e. strains belonging to same syphilis treponeme subcluster). At the same time, groups of Nichols-like and SS14-like strains were found to be more distantly related.
CONCLUSION/SIGNIFICANCE: We identified errors in 11.5% of all annotated genes and, after correction, we found a significant impact on the predicted proteomes of both Nichols and SS14 strains. Corrections of these errors resulted in protein elongations, truncations, fusions and indels in more than 11% of all annotated proteins. Moreover, it became more evident that syphilis is caused by treponemes belonging to two separate genetic subclusters.