SM: Conceptualization, Data curation, Writing - original draft, Visualization, Methodology
PC: Writing - review & editing, Validation, Software, Methodology
TL: Funding acquisition, Software, Methodology, Validation, Writing - review & editing
Transfer RNAs (tRNAs) are ubiquitous adapter molecules that link specific codons in messenger RNA (mRNA) with their corresponding amino acids during protein synthesis. The tRNA genes of Drosophila have been investigated for over half a century but have lacked systematic identification and nomenclature. Here, we review and integrate data within FlyBase and the Genomic tRNA Database (GtRNAdb) to identify the full complement of tRNA genes in the
(A) Isoacceptor counts of functional cytosolic tRNA genes in Drosophila; pseudogenes and mitochondrial tRNA genes are excluded. Colors distinguish distinct anticodons within each isoacceptor family. Counts for initiator (iMet) and elongator tRNA:Met are shown separately. (B) Summary statistics for all Drosophila tRNA genes. (C) Example of tRNA gene nomenclature syntax. (D) Distribution of tRNA genes across major chromosomal scaffolds; pseudogene numbers are in parentheses. ‘M’ = mitochondrial genome. (E) Example of a tRNA gene cluster from the 2R:6,144,000..6,195,000 genomic region, corresponding to cytological region 42A shown in Figure 3 of Kubli 1982. (Notably, the only significant change to tRNA annotations in this region is the addition of tRNA:Arg-ACG-1-4 in the current version.) Green highlight: tRNA:Arg-ACG-1 isodecoders; orange highlight: tRNA:Lys-CTT-1; yellow highlight: tRNA:Asn-GTT-1; magenta highlight: tRNA:Ile-AAT-1; blue rectangles: protein-coding genes; red rectangle: lncRNA gene.
tRNAs are universal to all cellular life and provide the essential molecular link between mRNA codons and their corresponding amino acids during translation (reviewed by Suzuki 2021). The main functional regions in a tRNA are the anticodon triplet, which base pairs with mRNA codons, and the 3′ end, to which the cognate amino acid is attached. Codon degeneracy for the 21 amino acids (20 standard amino acids plus selenocysteine) means that up to six tRNAs with distinct anticodons (‘isoacceptors’) are required, depending on the amino acid. tRNA diversity is further increased through the existence of tRNAs that share the same anticodon but differ in the sequence of their body structure (Goodenbour and Pan 2006). Such ‘isodecoders’ may differ from each other by just one or several nucleotides. Moreover, each specific isodecoder sequence can be present in multiple copies within a genome. This combination of diversity and redundancy results in eukaryotic nuclear genomes having hundreds of genes encoding tRNAs that function in cytosolic translation (cytosolic tRNAs). An additional set of tRNAs functioning in mitochondria is encoded by the mitochondrial genome of eukaryotes: in vertebrates and many other metazoa, there are often 22 mitochondrial tRNA genes, with tRNA:Leu and tRNA:Ser each represented by two different isoacceptors.
The tRNAs and tRNA genes of
We revisited cytosolic tRNA gene annotations in the current version of the Drosophila genome (release 6), integrating gene predictions from the Genomic tRNA database (GtRNAdb; Chan and Lowe 2016; Chan
Prior to our analysis, more than 50% of Drosophila cytosolic tRNA genes were unnamed in FlyBase, while the named genes used an ambiguous and esoteric nomenclature that incorporated the single-letter amino acid code and cytogenetic map information but lacked anticodon information. We therefore implemented the logical and systematic nomenclature used by the GtRNAdb within FlyBase (Figure 1C; Extended Data Table 1). This syntax comprises the 3-letter code of the cognate amino acid (isotype), the anticodon triplet, a number identifying each unique transcript (isodecoder) sequence, and a second number specifying each copy (locus) of that sequence within the genome. This is preceded by the standard ‘
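As an illustration of the syntax described above, a full gene symbol such as tRNA:Arg-ACG-1-4 (the gene named in the Figure 1 legend) could be split into its four fields with a short Python sketch. This is not part of FlyBase or GtRNAdb tooling: the function name `parse_trna_symbol`, the `TrnaSymbol` record, and the regular expression are assumptions made here for illustration. The pattern accepts only gene-level symbols written with a three-letter DNA-alphabet anticodon, so transcript-level symbols (e.g. tRNA:Arg-ACG-1) or any special-case names would need separate handling.

```python
import re
from typing import NamedTuple


class TrnaSymbol(NamedTuple):
    """Hypothetical record holding the four fields of a tRNA gene symbol."""
    isotype: str      # amino acid code, e.g. "Arg" (also "iMet" for initiator Met)
    anticodon: str    # anticodon triplet in the DNA alphabet, e.g. "ACG"
    isodecoder: int   # number identifying a unique transcript sequence
    locus: int        # number identifying a genomic copy of that sequence


# Assumed pattern: prefix, isotype, anticodon, isodecoder number, locus number.
_SYMBOL_RE = re.compile(r"^tRNA:([A-Za-z]+)-([ACGT]{3})-(\d+)-(\d+)$")


def parse_trna_symbol(symbol: str) -> TrnaSymbol:
    """Split a gene-level symbol into isotype, anticodon, isodecoder and locus."""
    match = _SYMBOL_RE.match(symbol)
    if match is None:
        raise ValueError(f"not a recognised tRNA gene symbol: {symbol!r}")
    isotype, anticodon, isodecoder, locus = match.groups()
    return TrnaSymbol(isotype, anticodon, int(isodecoder), int(locus))


parsed = parse_trna_symbol("tRNA:Arg-ACG-1-4")
print(parsed.isotype, parsed.anticodon, parsed.isodecoder, parsed.locus)
```

Keeping the isodecoder and locus numbers as integers makes it straightforward to sort genes within an isoacceptor family or to group loci that share a transcript sequence.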
Mitochondrial tRNA genes are not currently included within the GtRNAdb. We therefore compared existing FlyBase annotations against Drosophila entries in the mitotRNAdb (Jühling
Extended Data Tables 1 and 2 provide detailed information on all the cytosolic and mitochondrial tRNA genes, respectively. Among the functional, cytosolic tRNAs decoding the 20 standard amino acids, each isoacceptor family is encoded by between five (tRNA:His) and 26 (tRNA:Arg) genes, with the number of distinct anticodons in each isoacceptor family ranging from one (e.g. tRNA:Asn-GTT) to five (tRNA:Leu-CAG, tRNA:Leu-AAG, tRNA:Leu-CAA, tRNA:Leu-TAA, tRNA:Leu-TAG). Up to four distinct transcript sequences exist for a given anticodon (as is the case for tRNA:Arg-TCG, tRNA:Cys-GCA, tRNA:Gln-CTG and tRNA:Leu-TAA), and a given transcript sequence may be present in up to 13 exact gene copies (tRNA:Gly-GCC-1 and tRNA:Lys-CTT-1). Overall, there are 44 different isoacceptors and 84 distinct tRNA transcripts encoded by the Drosophila nuclear genome. Cytosolic tRNA genes are present on all major chromosome arms (Figure 1D) and are frequently found within clusters that harbor members of either the same or different isoacceptor families (Figure 1E; Kubli 1982; Phillips and Ardell 2021). Of the cytosolic tRNA genes, 46% are located within introns of protein-coding genes, with the remainder being intergenic. A minority (5%) of cytosolic tRNA genes contain an intron, namely two tRNA:Ile-TAT, four tRNA:Leu-CAA and ten tRNA:Tyr-GTA genes (Figure 1B; Bergman and Ardell 2014). These characteristics are largely comparable with the cytosolic tRNA gene complement of other metazoa (
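The kinds of counts reported in this paragraph (genes per isoacceptor family, distinct anticodons per family, and gene copies per transcript sequence) follow directly from the gene symbols. The Python sketch below shows one way such a tally might be made. The function `summarize_symbols` and the five-symbol input list are hypothetical and purely illustrative: the list is a toy example, not the actual dataset, and its locus numbers are assumed, so the printed values say nothing about the real Drosophila complement.

```python
from collections import Counter, defaultdict


def summarize_symbols(symbols):
    """Tally genes, anticodons and transcript copies from gene-level symbols."""
    genes_per_family = Counter()
    anticodons_per_family = defaultdict(set)
    copies_per_transcript = Counter()

    for symbol in symbols:
        # Drop the prefix, then split "Arg-ACG-1-4" into its four fields.
        body = symbol.split(":", 1)[1]
        isotype, anticodon, isodecoder, _locus = body.split("-")
        genes_per_family[isotype] += 1
        anticodons_per_family[isotype].add(anticodon)
        copies_per_transcript[f"{isotype}-{anticodon}-{isodecoder}"] += 1

    return {
        "genes_per_family": dict(genes_per_family),
        "anticodons_per_family": {
            family: len(codons) for family, codons in anticodons_per_family.items()
        },
        "copies_per_transcript": dict(copies_per_transcript),
        # Total distinct anticodons summed over families (isoacceptors).
        "isoacceptors": sum(len(c) for c in anticodons_per_family.values()),
        # Total distinct transcript sequences (isodecoders).
        "isodecoders": len(copies_per_transcript),
    }


# Toy input for illustration only; not the real gene list.
toy_symbols = [
    "tRNA:Arg-ACG-1-1",
    "tRNA:Arg-ACG-1-4",
    "tRNA:Arg-TCG-1-1",
    "tRNA:Arg-TCG-2-1",
    "tRNA:Asn-GTT-1-1",
]
summary = summarize_symbols(toy_symbols)
for key, value in summary.items():
    print(key, value)
```

Applied to the symbol column of a complete gene table, the same tally would be expected to yield the family-level and transcript-level totals, provided pseudogenes and mitochondrial genes are filtered out first.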
Notably, this project enabled several additional improvements to the representation of tRNAs within FlyBase. All functional (Gene Ontology) annotations were reviewed and revised as necessary. Reciprocal links between tRNA gene/transcript reports in FlyBase and corresponding pages at the GtRNAdb and RNAcentral (RNAcentral Consortium 2021) have been established, and 2D structural images for Drosophila tRNAs (Sweeney
In conclusion, we have generated definitive sets of cytosolic and mitochondrial tRNA genes present in the Drosophila genome and implemented a systematic and informative nomenclature for them. The improved datasets are available from several databases, including FlyBase, GtRNAdb, RNAcentral, and the Alliance of Genome Resources (Alliance of Genome Resources Consortium 2022). Our work will facilitate further exploration of tRNA biology within Drosophila as well as new comparative studies with other species.
Data on cytosolic tRNA genes were accessed and downloaded from FlyBase (
Data on mitochondrial tRNA genes were accessed and downloaded from FlyBase (
Description: Cytosolic tRNA genes. Resource Type: Dataset. DOI:
Description: Mitochondrial tRNA genes. Resource Type: Dataset. DOI:
SJM is funded by a grant from the National Human Genome Research Institute, National Institutes of Health (U41HG000739) to Norbert Perrimon (PI) and Nicholas Brown (co-PI). PPC and TML are funded by a grant from the National Human Genome Research Institute, National Institutes of Health (R01HG006753) to TML.