Author version


Genetic Maps and Genome Structure




Sturtevant and Morgan’s insight that the percentage of meiotic recombinants between two loci can be used as a measure of the distance between them allowed the construction of genetic maps. The first genetic maps were made using phenotypic characteristics, but traits controlled by a single locus are scarce, and modern genetic maps are based on DNA markers. In peanut, which has very low DNA polymorphism, the generation of informative genetic markers has been very difficult, and this has been a fundamental limitation to peanut genetics.
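The relationship between recombinant percentage and map distance can be illustrated with a minimal sketch. The progeny counts below are hypothetical, and the Haldane mapping function is a standard textbook correction for undetected double crossovers (it assumes no crossover interference); it is not taken from this chapter.

```python
import math


def recombination_fraction(recombinants: int, total_progeny: int) -> float:
    """Fraction of scored progeny that are recombinant between two loci."""
    if total_progeny <= 0:
        raise ValueError("total_progeny must be positive")
    if not 0 <= recombinants <= total_progeny:
        raise ValueError("recombinants must lie between 0 and total_progeny")
    return recombinants / total_progeny


def direct_distance_cm(r: float) -> float:
    """Direct measure: 1% recombinants corresponds to 1 centimorgan."""
    return 100.0 * r


def haldane_distance_cm(r: float) -> float:
    """Haldane mapping function (assumes no interference); valid for r < 0.5."""
    if not 0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


# Hypothetical example: 12 recombinants among 100 scored progeny.
r = recombination_fraction(12, 100)
print(f"direct:  {direct_distance_cm(r):.1f} cM")
print(f"Haldane: {haldane_distance_cm(r):.1f} cM")
```

For closely linked loci the two measures nearly coincide; they diverge as the recombination fraction approaches 0.5, which is one reason total map lengths depend on the mapping function used.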

The development of molecular markers for peanut has followed the technical trends of the times. The first studies were based on isozymes and proteins (Krishna and Mitra 1988; Grieshammer and Wynne 1990; Lu and Pickersgill 1993), followed by restriction fragment length polymorphisms (RFLPs; Kochert et al. 1991, 1996; Paik-Ro et al. 1992), random amplified polymorphic DNA (RAPDs; Halward et al. 1991, 1992; Hilu and Stalker 1995; Subramanian et al. 2000), amplified fragment length polymorphisms (AFLPs; He and Prakash 1997, 2001; Gimenes et al. 2002; Herselman 2003; Ferguson et al. 2004; Milla et al. 2005; Tallury et al. 2005) and, more recently, microsatellite markers (Hopkins et al. 1999; Palmieri et al. 2002; He et al. 2003, 2005; Moretzsohn et al. 2004, 2005, 2009; Palmieri et al. 2005; Bravo et al. 2006; Budiman et al. 2006; Mace et al. 2006; Gimenes et al. 2007; Proite et al. 2007; Wang et al. 2007; Cuc et al. 2008; Guo et al. 2008; Naito et al. 2008; Liang et al. 2009; Yuan et al. 2010; Koilkonda et al. 2011) and markers based on miniature inverted-repeat transposable elements (MITEs; Shirasawa et al. in press and unpublished data). Generally, these markers have shown a trend towards becoming more informative, and now microsatellites, being codominant and easy to score in the tetraploid genome, are considered the molecular marker of choice, with MITE markers also showing much potential.

Markers based on single nucleotide polymorphisms (SNPs) have proved very difficult to apply in peanut. This is because true SNPs are very rare and difficult to detect against the background of false A-B polymorphism. If we consider that even very diverse peanut cultivars diverged only a few thousand years ago, whilst the A and B genomes diverged a few Mya, then we can expect true A-A and B-B SNPs to be in the region of 1,000 times less frequent than false A-B SNPs. In addition to this difficulty of discovering SNPs, the considerable problem of scoring them in a tetraploid genome complicates matters further.
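The factor of roughly 1,000 follows from a simple ratio of divergence times, as the sketch below shows. It assumes a constant mutation rate, so that accumulated differences scale with time since divergence. The figure of 3.5 Mya for the A-B divergence is the one given in this chapter's conclusions; 3,500 years is only an illustrative value for "a few thousand years".

```python
def expected_snp_ratio(years_since_ab_divergence: float,
                       years_since_cultivar_divergence: float) -> float:
    """Ratio of false A-B differences to true within-genome SNPs.

    Under a constant mutation rate, the number of accumulated differences
    is proportional to divergence time, so the ratio of the two rates is
    simply the ratio of the two divergence times.
    """
    if years_since_cultivar_divergence <= 0:
        raise ValueError("cultivar divergence time must be positive")
    return years_since_ab_divergence / years_since_cultivar_divergence


# 3.5 Mya for A-B divergence; 3,500 years is an assumed illustrative value.
ratio = expected_snp_ratio(3.5e6, 3.5e3)
print(f"false A-B differences per true SNP: about {ratio:,.0f}")
```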

The obstacles to mapping in peanut meant that the first maps were generated using crosses involving wild species. Some of the maps also used the simpler diploid genetics of wild diploid species. Halward and collaborators (1991) produced such a diploid map, based on RFLPs and an F2 population derived from two diploid A genome species, A. stenosperma and A. cardenasii. Another RFLP-based map was published for a BC1 tetraploid population derived from a synthetic allotetraploid [A. batizocoi x (A. cardenasii x A. diogoi)]4x crossed with peanut (Burow et al. 2001). In this latter map, 370 RFLP loci were mapped onto 23 linkage groups, spanning a total of 2,207 cM. This tetraploid map was particularly informative about genome structure because it allowed the assignment of marker alleles to the A or B genome by reference to the known genomes of the diploid parents of the synthetic allotetraploid. In this way, homologous linkage groups could be aligned. This showed that marker order was highly conserved between the A and B genomes.

The first map based on microsatellites was derived from a cross between A. duranensis and A. stenosperma (Moretzsohn et al. 2005). This map consisted of 11 linkage groups covering 1,230 cM. Subsequently, a microsatellite map of the B genome was produced from a cross of A. ipaënsis and the closely related A. magna; this map had 10 linkage groups, with 149 loci spanning a very similar total map distance of 1,294 cM. The comparison of 51 shared markers between these two maps revealed high levels of synteny, with all but one of the B linkage groups showing a single main correspondence to an A linkage group. This seems largely consistent with the observations for the previously mentioned tetraploid map. There are two main differences. In the tetraploid study, one large B linkage group shows no marker correspondences to the A genome, whilst comparisons of the diploid maps showed no "orphan" linkage groups. Furthermore, in the diploid comparison, two B linkage groups correspond to one A linkage group, a situation not observed in the tetraploid map (Moretzsohn et al. 2009).

Further markers were subsequently included in the diploid A genome map. The most recently published version has 369 markers in 10 linkage groups (Leal-Bertioli et al. 2009). The total genetic distance covered by this map was more than 3,000 cM. Considering comparisons with subsequently published maps (Foncéka et al. 2009), we can conclude that this distance is overestimated several fold, probably due to the mixing of different dominant and codominant marker types. Nevertheless, this overestimate does not significantly reduce the information content of the map, because marker order seems correct and virtually all markers were sequence-characterized. Furthermore, many of the markers were particularly informative: 102 were genome comparative markers developed from intron regions of low copy genes (Leg anchor markers; Fredslund et al. 2006), and 35 were resistance gene homologs.

The sequence-characterized markers, and the high proportion of low or single copy gene markers, allowed the map to be aligned to the fully sequenced genomes of Lotus japonicus and Medicago truncatula (Sato et al. 2008; www.medicago.org). These alignments were represented as “genome plots” (Fig. 3; Bertioli et al. 2009). Inspection of these plots shows surprising degrees of synteny considering the time of species divergence (estimated 55 million years). Although there are some regions of double affinities between Arachis and these model legumes, most synteny blocks have a single main affinity and not multiple affinities interleaved. This is an important observation. Genome evolution, for instance through chromosomal translocations and inversions, progressively breaks down syntenic relationships between species over evolutionary time. In addition, whole genome duplications occur periodically during plant evolution, followed by progressive diploidization. In this chapter until now we have referred to diploid and tetraploid as if they were absolute states. In fact, genome duplication in plant evolution is sufficiently frequent that almost no plant is fully diploid; rather, plants are in varying states of diploidization following the most recent polyploidy event (Adams and Wendel 2005; Cui et al. 2006). From the single pairwise affinities in the Arachis genome plots two main conclusions can be drawn. Firstly, the last universal legume whole genome duplication predated the divergence of Arachis from the Galegoids and Phaseoloids sufficiently that the common ancestral genome that existed some 55 Mya was substantially diploidized. Secondly, the so-called diploid Arachis genomes are therefore truly substantially diploid; their internal duplication is likely to be in the same range as for Lotus and Medicago (6.8% and 9.7%, respectively; Cannon et al. 2006).
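The notion of a "single main affinity" can be made concrete with a small sketch. It tallies, for the markers in one linkage-group segment, which chromosome of a model legume their best homologs fall on, and reports the dominant chromosome and its share. The function name and the toy hit list are hypothetical illustrations, not data or methods from Bertioli et al. (2009).

```python
from collections import Counter


def main_affinity(hit_chromosomes: list[str]) -> tuple[str, float]:
    """Return the most frequently hit chromosome and its share of all hits.

    hit_chromosomes holds, for each mapped marker in one linkage-group
    segment, the model-legume chromosome carrying its best homolog.
    """
    if not hit_chromosomes:
        raise ValueError("at least one hit is required")
    counts = Counter(hit_chromosomes)
    chromosome, n = counts.most_common(1)[0]
    return chromosome, n / len(hit_chromosomes)


# Toy, made-up data: 8 markers hit chr1 and 2 hit chr5.
toy_hits = ["chr1"] * 8 + ["chr5"] * 2
chromosome, share = main_affinity(toy_hits)
print(f"main affinity: {chromosome} ({share:.0%} of hits)")
```

A share close to 1 would correspond to a single main affinity; two chromosomes with comparable shares would suggest a double affinity of the kind expected from an undiploidized duplication.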

Most economically important legumes and the two most important model legumes, Medicago and Lotus, belong to the Galegoid or Phaseoloid clades. Arachis is an outgroup, and so comparisons are particularly informative for making evolutionary inferences. For this reason, the Arachis vs Lotus / Medicago plots were drawn with chromosomal orders equivalent to those of a previously published comparison between the Lotus and Medicago genomes (Bertioli et al. 2009; Cannon et al. 2006; Fig. 3). All possible Arachis-Lotus-Medicago species-by-species analyses could then be observed in a comparable format. In this way, ten distinct conserved synteny blocks and also nonconserved regions could be observed in all genomes. This clearly implies that certain legume genomic regions are consistently more stable during evolution than others. It is notable that these regions are large scale, and apparently in some cases consist of entire chromosomal arms.

An explanation for these observations was found by analyzing transposon distributions in Lotus and Medicago. Retrotransposons are very unevenly distributed in both the model legumes and it was observed that the retrotransposon-rich regions tend to correspond to variable regions, intercalating with the synteny blocks, which are relatively retrotransposon poor. This tendency is particularly evident for Medicago, but somewhat less so for Lotus. Furthermore, while the variable regions generally have lower densities of single copy genes than the more conserved regions, some harbor high densities of the fast evolving disease resistance genes (Bertioli et al. 2009; Fig. 3). For Arachis it was notable that LGs 2 and 4, which harbor the most prominent clusters of resistance gene homologues (RGHs) and quantitative trait loci (QTLs), showed shattered synteny with both Lotus and Medicago. In a different study, resistance to root-knot nematode was mapped to LGA 9 (Nagy et al. 2010). The upper region of LGA 9 is a synteny block, but its lower region appears to be a variable region. The region that confers nematode resistance is derived from the wild diploid A. cardenasii, and is particularly genetically interesting because it displays strongly suppressed recombination with the A genome of A. hypogaea and appears to cover about one-third to a half of a chromosome.

Through large scale screening of simple sequence repeat (SSR) markers, a sufficient number of polymorphic markers were identified for the generation of the first genetic linkage maps based on cultivated x cultivated crosses (Varshney et al. 2009; Hong et al. 2008, 2010). These maps are very useful for breeding because they incorporate QTLs for agronomically important traits, such as disease resistance and drought-related traits. For the creation of the highest density map of peanut to date, marker screening was done by in silico analysis of the parents. The map has 1,114 markers and is 2,166 cM in length. Interestingly, it has 21 linkage groups, two of which are of much lower density than the others (Shirasawa et al. unpublished data).
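The marker densities of the maps described in this section can be compared with a rough calculation. The sketch below divides total map length by the number of loci, using only figures quoted in the text. This is a crude average: it ignores the number of linkage groups and uneven marker distribution, and the labels are informal descriptions rather than official map names.

```python
# (number of loci, total map length in cM), as quoted in this section.
maps = {
    "tetraploid RFLP map (Burow et al. 2001)": (370, 2207.0),
    "B genome diploid microsatellite map": (149, 1294.0),
    "cultivated map (Shirasawa et al. unpublished)": (1114, 2166.0),
}


def mean_spacing_cm(n_loci: int, length_cm: float) -> float:
    """Crude average distance between adjacent loci, in cM per locus."""
    if n_loci <= 0:
        raise ValueError("n_loci must be positive")
    return length_cm / n_loci


spacing = {name: mean_spacing_cm(n, length) for name, (n, length) in maps.items()}
for name, value in spacing.items():
    print(f"{name}: {value:.2f} cM per locus")
```

On these figures the cultivated map has a mean spacing of roughly 2 cM per locus, compared with roughly 6 and 9 cM for the two earlier maps.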


Sequencing the Peanut Genome

In a new development, as an initial phase in the International Peanut Genome Initiative (http://www.peanutbioscience.com) the very large capacity of Illumina sequencing is being used for the generation of high density genetic maps. The data for the generation of these maps is obtained essentially by using low coverage sequencing as a method of high density genotyping. The approach is being used in diploid and tetraploid peanut mapping populations and a peanut diversity panel (Froenicke et al. 2011, 2012). The genetic maps generated are expected to be especially useful in the ordering of contigs and scaffolds in the Peanut Genome Project.

The estimated size of the cultivated peanut’s genome is about 2.8 Gbp, almost as large as the human genome. Sequencing such a genome is a considerable task. However, especially with new generation sequencing, size is not the biggest problem. The main obstacle is the repeat structures present within large genomes. Very significant portions of large genomes consist of almost identical copies of DNA repeated multiple times. During sequence assembly, the placement of sequence reads derived from these repeats into the wrong position can prevent assembly being completed or, worse, cause the assembly to be completed incorrectly. The peanut genome is no exception and harbors numerous repeat structures. As discussed above, there seems to have been very significant evolutionarily recent activity of transposons. Each retrotransposition event creates a new repeat structure, and a potential problem for assembly. The most problematic transposition events are very recent ones, where mutation has not had sufficient time to reduce sequence identity. The size of transposons, and even of many solo LTRs, usually substantially exceeds the size of individual sequence reads. Therefore, paired sequence reads at different scales will be particularly important for spanning transposons and other repeat structures of varying scales and enabling assembly.
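The reason paired reads are needed at several scales can be sketched geometrically. A read pair bridges a repeat only if the sequenced fragment is long enough for both reads to lie wholly in the unique sequence on either side. The library sizes, read length and repeat lengths below are hypothetical round numbers chosen for illustration, not values from the Peanut Genome Project.

```python
def pair_can_span(insert_size: int, read_length: int, repeat_length: int) -> bool:
    """True if both reads of a pair can sit entirely in unique flanks.

    The fragment (insert) has one read at each end; the stretch between
    the two reads must be at least as long as the repeat being bridged.
    """
    return insert_size - 2 * read_length >= repeat_length


read_length = 100                        # assumed short-read length, bp
libraries = [500, 5_000, 40_000]         # assumed insert sizes, bp
repeats = {"solo LTR": 1_500, "full LTR retrotransposon": 7_000}  # assumed sizes, bp

for name, repeat_length in repeats.items():
    usable = [size for size in libraries
              if pair_can_span(size, read_length, repeat_length)]
    print(f"{name} ({repeat_length} bp): spanned by inserts {usable}")
```

Under these assumed sizes the shortest library bridges neither repeat, which is the point made in the text: repeats of different lengths call for paired reads at different scales.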

Although the overall repetitive profile of peanut seems compatible with whole genome shotgun sequencing, the allotetraploid genome with relatively recently diverged A and B components will be especially problematic. Assembly of such a genome may encounter two frequent problems: first, breaks in contigs because of misassemblies at the ends of contigs (A reads at the ends of B contigs or vice versa) and, second, the generation of mixed A and B (chimeric) contigs. These problems are likely to be worse with shorter sequence reads because, for instance, identical 100 bp A and B homeologous regions will be much more common than identical 500 bp regions. Strategies will be necessary to overcome these difficulties, especially if the project is to take advantage of Illumina sequencing, which produces massive amounts of data but short sequence reads. Two possible options are the sequencing of the diploid progenitors to provide templates for a tetraploid assembly, and a multiplexed BAC-by-BAC strategy.
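The effect of read length on A-B ambiguity can be illustrated with a back-of-envelope model. If mismatches between homeologous sequences were independent and evenly spread, the chance that a window of a given length is identical in both genomes would be the per-base identity raised to the window length. The 94% identity is the figure given in this chapter's conclusions for high complexity DNA; the independence assumption is a deliberate simplification, since real divergence is clustered and conserved regions are therefore more common than this model predicts.

```python
def prob_identical_window(identity: float, length_bp: int) -> float:
    """Probability that a window of length_bp has no A-B mismatch.

    Assumes each base matches independently with probability `identity`.
    """
    if not 0.0 <= identity <= 1.0:
        raise ValueError("identity must be between 0 and 1")
    if length_bp < 0:
        raise ValueError("length_bp must be non-negative")
    return identity ** length_bp


p100 = prob_identical_window(0.94, 100)
p500 = prob_identical_window(0.94, 500)
print(f"identical 100 bp window: {p100:.2e}")
print(f"identical 500 bp window: {p500:.2e}")
print(f"ratio: {p100 / p500:.1e}")
```

Even under this simplified model the contrast is many orders of magnitude, which is why longer reads are far less likely to be placed in the wrong subgenome.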


Conclusions

Although, at the time of writing this chapter, there is relatively little ordered or anchored genomic DNA sequence available for peanut, many general features of the genome are apparent. The knowledge of these features is useful in the design of sequencing strategies, and should be useful to guide assembly methods, and to generate and test hypotheses when an assembled genome is available.

Summary points are presented below:

1) Peanut is a recent allotetraploid derived from the diploids A. duranensis and A. ipaënsis or very closely related species, which contributed the A and B genomes, respectively.

2) Diploid Arachis genomes are highly diploidized; their internal genome duplication is likely to be in the range of 10% or less.

3) The A and B genomes diverged about 3.5 Mya and show very high genetic synteny.

4) The genomes of the ancestral diploids appear not to have undergone major structural rearrangements after polyploidization.

5) In high-complexity DNA, sequence identity between the A and B genomes is in the range of 94%, but their repetitive DNA has substantially diverged.

6) In gene-space the divergence of repetitive DNA may be substantially due to the activity of relatively few species of LTR retrotransposons since the time of A-B genome divergence.

7) Gene-space is likely to be ordered into about ten conserved blocks that have relatively high synteny even with other economically important legumes that diverged about 55 Mya.

8) The conserved blocks are likely to be intercalated with variable regions with relatively high retrotransposon frequencies and, in some cases, clusters of resistance gene homologs. Linkage group A4 and the lower region of LG A2 may consist largely of such variable regions.

9) Peanut has a very narrow genetic base. Activity of transposons, including MITEs, since polyploidization has probably contributed to the phenotypic variability of peanut, although other mechanisms such as differential silencing of A and B homeologs are also likely to be important.

Figures

Figure 1: In situ hybridization on metaphase spreads of Arachis spp. with DAPI counterstaining. a) GISH on the amphidiploid Arachis duranensis × A. ipaënsis with genomic DNA probes from both parents: green signals with A. duranensis on half of the chromosomes (A genome) and red signals with A. ipaënsis on the other half (B genome), with signals overlapping on part of some chromosomes. b) BAC-FISH with ADH79O23 (F12_Sl2_6OVER) probe, with red signals over half of the chromosomes (A genome) and some dots on B genome chromosomes; labeling is more concentrated but differs in intensity depending on the chromosome. Hybridization signals were absent at centromere and telomere regions; c) BAC-FISH with ADH51I17 probe, with diffuse red signals in the pericentromeric regions of A genome chromosomes only (F12_Sl4_5OVER); d) BAC-FISH with ADH179B13 probe, with spotted green signals on A and B genome chromosomes but stronger on A chromosomes. Red signals correspond to the 5S rDNA sites (F17_Sl2_4OVER1). Scale bar: 5 µm.

Figure 2: Representation of BAC clone AD180A21 consisting of two contigs, one of 84,046 bp positioned at the left-hand side, and one of 5,920 bp positioned at the right. Top: Repetitive Index graph; Middle: annotation scheme; Bottom: dot plot. Repetitive Index is a score for repeat content based on BLASTN against 41,856 A. duranensis BAC-end sequences. The score is calculated using the formula Repetitive Index = log10(N+1), where N is the number of BLASTN homologies with an E-value of 1e-20 or less. The highest peak represented here is 1.9, which is equivalent to 88 BLASTN homologies; the lowest peak is 0.3, which is equivalent to a single BLASTN homology. The annotation scheme represents long terminal repeats (LTRs) in blue and internal regions of transposons in white. Transposons in positive orientation are represented on the upper strand, and those in negative orientation on the lower. The dot plot is of the BAC sequence (horizontal) against whole representative sequences of the transposons FIDEL, Feral, Pipoka, Pipa and Gordo (vertical).

This BAC clone consists almost entirely of LTR retrotransposons, their solo elements and remnants, and does not contain any non-transposon gene. The sequence contains two complete FIDELs, one complete Pipa, and a complete Gordo interrupted by one of the FIDEL elements (names in bold type), a lower copy LTR transposon (Element-FIB1), plus transposon fragments. All highly repetitive sequences in the BAC are derived from five retrotransposons: FIDEL, Feral, Pipoka, Pipa and Gordo.
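The Repetitive Index formula given in the Figure 2 legend, log10(N+1), can be checked with a short sketch. The function names are illustrative; only the formula and the two example values (1 and 88 BLASTN homologies) come from the legend.

```python
import math


def repetitive_index(n_hits: int) -> float:
    """Repetitive Index = log10(N + 1), N = number of BLASTN homologies."""
    if n_hits < 0:
        raise ValueError("n_hits must be non-negative")
    return math.log10(n_hits + 1)


def hits_from_index(index: float) -> float:
    """Inverse of the formula: N = 10**index - 1."""
    return 10.0 ** index - 1.0


for n in (0, 1, 88):
    print(f"N = {n:>2}: Repetitive Index = {repetitive_index(n):.2f}")
```

Adding 1 before taking the logarithm keeps the score defined, and equal to zero, for windows with no homologies. A single homology gives about 0.30 and 88 homologies give about 1.95, in line with the peaks of 0.3 and 1.9 quoted in the legend.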



Figure 3: Genome plots of Arachis vs. Medicago and Arachis vs. Lotus, integrated with each other and with graphs of synteny with Arachis and of retrotransposon and resistance gene homolog distributions for Medicago and Lotus (original figure is from Bertioli et al. 2009). Chromosome orders and numbering of synteny blocks are the same as in the Medicago vs Lotus plot in Cannon et al. 2006, allowing direct comparisons. Equivalent conserved regions (synteny blocks) and variable regions are present in all possible combinations of species comparisons Arachis-Lotus-Medicago. This shows that some genomic regions (synteny blocks) are consistently more stable during evolution than others.

(a) Genome Plot of Arachis vs. Medicago.

(b) Density of BLAST-detected resistance gene homologs of the TNL (red line) and CNL (green line) subclasses plotted along the Medicago genome. High densities of resistance gene homologs and retrotransposons coincide.

(c) Black line: density of BLAST-detected retrotransposons plotted along the Medicago genome. Cyan-blue line: scaled synteny score of Medicago with Arachis. Synteny blocks occur in regions of low retrotransposon density.

(d) Black line: percentage genome coverage of retrotransposons plotted along the Lotus genome. Cyan-blue line: scaled synteny score of Lotus with Arachis. Synteny blocks tend to occur in regions of low retrotransposon coverage.

(e) Density of resistance gene homolog encoding sequences, TNL (red) and CNL (green), plotted along the Lotus genome. Clusters of resistance gene homologs and retrotransposons coincide.

(f) Genome Plot of Arachis vs. Lotus. Markers mapped to intervals are plotted as horizontal lines.

References

Adams KL, Wendel JF. 2005. Polyploidy and genome evolution in plants. Curr Opin Plant Biol 8:135-141

Araujo A., Nielen S., Vidigal B., Moretzsohn M., Leal-Bertioli S., Ratnaparkhe M., Kim C., Bailey J., Paterson A., Guimarães P., Schwarzacher T., Heslop-Harrison P., Bertioli D. 2012. An analysis of the repetitive component of the peanut genome in the evolutionary context of the Arachis A-B genome divergence. In: Plant Anim Genome XX Conf, San Diego, CA, USA, P

Bechara M., Moretzsohn M., Palmieri D., Monteiro J., Bacci M., Martins J., Valls J., Lopes C., Gimenes M. 2010. Phylogenetic relationships in genus Arachis based on ITS and 5.8S rDNA sequences. BMC Plant Biol 10 (1):255.

Bertioli D., Moretzsohn M., Madsen L., Sandal N., Leal-Bertioli S., Guimaraes P., Hougaard B., Fredslund J., Schauser L., Nielsen A., Sato S., Tabata S., Cannon S., Stougaard J. 2009. An analysis of synteny of Arachis with Lotus and Medicago sheds new light on the structure, stability and evolution of legume genomes. BMC Genomics 10 (1):45.

Bomblies K., Weigel D. 2007. Hybrid necrosis: autoimmunity as a potential gene-flow barrier in plant species. Nat Rev Genet 8 (5):382-393

Bonavia D. 1982. Precerámico peruano, Los Gavilanes, oásis en la história del hombre. Corporación Financiera de Desarrollo S.A. COFIDE e Instituto Arqueológico Alemán. Lima, Peru.

Brasileiro A.C., Morgante C.V., Leal-Bertioli S.C., Santos C.M., Araújo A.C., Pappas G., Bonfim O., Silva F.R., Silva A.K., Martins A.C., Bertioli D.J., Guimarães P.M. 2012. Transcriptome survey on wild peanut relatives for discovery of drought-responsive genes. In: Plant Anim Genome XX Conf, San Diego, CA, USA, P?.

Bravo J.P., Hoshino A.A., Angelici C., Lopes C.R., Gimenes M.A. 2006. Transferability and use of microsatellite markers for the genetic analysis of the germplasm of some Arachis section species of the genus Arachis. Genet Mol Biol 29 (3):516-524

Budiman M, Jones J, Citek R, Warek U, Bedell J, Knapp S. 2006. Methylation-filtered and shotgun genomic sequences for diploid and tetraploid peanut taxa: http://www.ncbi.nlm.nih.gov/ (Accessed Oct 2011)

Burow M.D., Simpson C.E., Starr J.L., Paterson A.H. 2001. Transmission genetics of chromatin from a synthetic amphidiploid to cultivated peanut (Arachis hypogaea L.): broadening the gene pool of a monophyletic polyploid species. Genetics 159 (2):823-837

Cannon S.B., Sterck L., Rombauts S., Sato S., Cheung F., Gouzy J., Wang X., Mudge J., Vasdewani J., Schiex T., Spannagl M., Monaghan E., Nicholson C., Humphray S.J., Schoof H., Mayer K.F.X., Rogers J., Quetier F., Oldroyd G.E., Debelle F., Cook D.R., Retzel EF, Roe BA, Town CD, Tabata S, Peer Y, Young ND. 2006. Legume genome evolution viewed through the Medicago truncatula and Lotus japonicus genomes. Proc Natl Acad Sci USA 103:14959 - 14964

Cermeno MC, Orellana J, Santos JL, Lacadena JR. 1984. Nucleolar activity and competition (Amphiplasty) in the genus Aegilops. Heredity 53 (3):603-611

Charrier B, Foucher F, Kondorosi E, D’Aubenton-Carafa Y, Thermes C, Kondorosi A, Ratet P. 1999. Bigfoot: a new family of MITE elements characterized from the Medicago genus. Plant J 18 (4):431-441

Cronk Q, Ojeda I, Pennington R. 2006. Legume comparative genomics: progress in phylogenetics and phylogenomics. Curr Opin Plant Biol 9 (2):99-103

Cuc LM, Mace ES, Crouch JH, Quang VD, Long TD, Varshney RK. 2008. Isolation and characterization of novel microsatellite markers and their application for diversity assessment in cultivated groundnut (Arachis hypogaea L.). BMC Plant Biol 8:55

Cui L, Wall PK, Leebens-Mack JH, Lindsay BG, Soltis DE, Doyle JJ, Soltis PS, Carlson JE, Arumuganathan K, Barakat A, Albert VA, Ma H, dePamphilis CW. 2006. Widespread genome duplications throughout the history of flowering plants. Genome Res 16 (6):738-749. doi:10.1101/gr.4825606

