Transposable elements (TEs) make up the major part of plant genomes. They have been identified and annotated in the genome of the Robusta coffee tree (C. canephora), one of the two diploid ancestors of the cultivated allotetraploid Arabica (C. arabica), as well as in the C. arabica genome itself. TEs represent 49% and 59% of the C. canephora and C. arabica genomes, respectively. LTR retrotransposons (LTR-RTNs) are a particular class of TEs that together represent 42% and 52% of the C. canephora and C. arabica genomes. In C. canephora, LTR-RTNs are not randomly distributed along chromosomes: they delineate TE-rich and TE-poor regions similar to the heterochromatin and euchromatin regions observed in plant chromosomes. Analysis of our LTR-RTN databases led to the identification of a new group of non-autonomous elements, devoid of the enzymatic machinery involved in their mobility (Chaparro et al., 2015). Members of this group, called TR-GAG, are relatively short (< 4 kbp) and carry a single open reading frame coding for a capsid protein (GAG). In total, five different TR-GAG families have been identified in coffee trees. Similar elements were found in 23 available monocot and dicot plant genomes, indicating that they are ubiquitous in the plant kingdom. The LTR-RTN databases were also used in comparative sequence analyses between C. arabica and its two diploid ancestors (C. canephora and C. eugenioides). Despite little overall variation, 14 LTR-RTN families with slight copy number variations were identified (6 from the Copia group and 8 from the Gypsy group) and used as probes for FISH.
In addition to the cultivated species, the genomes of 10 wild species from West Africa, East Africa and Madagascar were partially sequenced (> 100,000 sequences each), and the diversity and copy numbers of their LTR-RTNs were studied. The analysis highlighted one LTR-RTN family, called SIRE (Copia), that is absent from the species from East Africa and Madagascar but shows large diversity in species from West Africa, such as C. canephora and C. arabica. Two SIRE elements were then used as genetic markers with the REMAP technique to analyze the diversity of a set of 96 DNA samples from wild and cultivated C. arabica, C. eugenioides and C. canephora. Preliminary data suggest that these LTR-RTNs are particularly informative for revealing the genetic diversity of C. arabica genotypes.
In the near future, the analysis of the C. arabica genome and its comparison with that of its diploid ancestor C. canephora will make it possible to highlight variation in TE insertion profiles. These profiles will be exploited to understand the origin and evolution of the C. arabica genome.
Project Number : 1102-006
Year : 2011
Type of funding : AAP CAPES
Project type : AAP
Start date : 01 Oct 2012
End date : 31 Dec 2014
Flagship project : No
Project leader : Romain Guyot
Project leader's institution : IRD
Project leader's RU : DIADE
Budget allocated : 50000 €
Total budget allocated (including co-financing) : 110000 €
Funding : Labex