

1 Unité de Biochimie Bactérienne, UR477, INRA, 78350 Jouy-en-Josas, France
2 Unité Mathématique Informatique et Génome, UR1077, INRA, 78350 Jouy-en-Josas, France
3 Unité de Génétique Microbienne, UR895, INRA, 78350 Jouy-en-Josas, France
Correspondence
Rozenn Gardan
rozenn.gardan{at}jouy.inra.fr
Identification of short genes that encode peptides of fewer than 60 aa is challenging, both experimentally and in silico. As a consequence, the universe of these short coding sequences (CDSs) remains largely unknown, although some are acknowledged to play important roles in cell–cell communication, particularly in Gram-positive bacteria. This paper reports a thorough search for short CDSs across streptococcal genomes. Our bioinformatic approach relied on a combination of advanced intrinsic and extrinsic methods. In the first step, intrinsic sequence information (nucleotide composition and presence of RBSs) served to identify new short putative CDSs (spCDSs) and to eliminate the differences between annotation policies. In the second step, pseudogene fragments and false predictions were filtered out. The last step consisted of screening the remaining spCDSs for lines of extrinsic evidence involving sequence and gene-context comparisons. A total of 789 spCDSs across 20 complete genomes (19 Streptococcus and one Enterococcus) received the support of at least one line of extrinsic evidence, which corresponds to an average of 20 short CDSs per million base pairs. Most of these have no known function, and a significant fraction (31 %) are not even annotated as hypothetical genes in GenBank records. As an illustration of the value of this list, we describe a new family of CDSs, encoding very short hydrophobic peptides (20–23 aa) situated just upstream of some of the positive transcriptional regulators of the Rgg family. The expression of seven other short CDSs from Streptococcus thermophilus CNRZ1066 that encode peptides ranging in length from 41 to 56 aa was confirmed by real-time quantitative RT-PCR and revealed a variety of expression patterns. Finally, one peptide from this list, encoded by a gene that is not annotated in GenBank, was identified in a cell-envelope-enriched fraction of S. thermophilus CNRZ1066.
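The intrinsic step described above (a candidate start codon, an in-frame stop within 60 codons, and an RBS upstream) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `find_short_orfs`, the Shine-Dalgarno-like motif, the upstream window, the minimum length and the forward-strand-only scan are all assumptions chosen for illustration, and the nucleotide-composition scoring used in the paper is not modelled here.

```python
import re

# Bacterial start and stop codons (NCBI translation table 11).
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}

# ASSUMPTION: a simple Shine-Dalgarno-like motif stands in for RBS detection.
RBS_PATTERN = re.compile(r"AGGAG|GGAGG")


def find_short_orfs(seq, min_aa=10, max_aa=60, rbs_window=(4, 20)):
    """Return (start, end, n_aa) for short forward-strand ORFs with an upstream RBS.

    start/end are 0-based, end-exclusive, and include the stop codon.
    rbs_window gives the (nearest, farthest) distance in nt upstream of the
    start codon in which the motif is searched; both values are assumptions.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 5):
        if seq[start:start + 3] not in START_CODONS:
            continue
        # Walk codon by codon until the first in-frame stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_aa = (pos - start) // 3  # codons before the stop
                if min_aa <= n_aa <= max_aa:
                    lo = max(0, start - rbs_window[1])
                    hi = max(0, start - rbs_window[0])
                    if RBS_PATTERN.search(seq[lo:hi]):
                        hits.append((start, pos + 3, n_aa))
                break
    return hits


# Toy sequence: RBS, 7-nt spacer, then a 13-codon ORF followed by a stop.
toy = "TTTT" + "AGGAGG" + "AAAAAAA" + "ATG" + "GCT" * 12 + "TAA" + "CCC"
print(find_short_orfs(toy))
```

Candidates produced by a scan of this kind would still need the later steps named in the abstract, namely removal of pseudogene fragments and false predictions and screening for extrinsic evidence, before being treated as spCDSs.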
These authors contributed equally to this work.
Two supplementary tables listing spCDSs of the genomes studied that passed the filtration of pseudogenes and false predictions, and predicted and annotated short CDSs supported by extrinsic evidence in one Enterococcus faecalis and 18 streptococcal genomes, and a supplementary figure showing the chromosomal context of CDSs encoding the SHPs associated with rgg genes, are available with the online version of this paper.