Abstract

Because of the size and complexity of the Arachis hypogaea genome, ESTs have been sequenced for peanut, as for many other plants, as an alternative to whole-genome sequencing. The US peanut research community held the historic 2004 Atlanta Genomics Workshop and named the EST project a top priority. As of August 2011, the peanut research community had deposited 252,832 ESTs in the public NCBI EST database, and this resource has provided the community with valuable tools and building blocks for several genome-scale experiments prior to the whole-genome sequencing project.

These EST resources have been used for marker development, gene cloning, microarray-based gene expression analysis, and genetic mapping. Indeed, peanut EST sequence resources have been shown to have a wide range of applications and served their essential function when they were needed. The EST project then contributed to the second landmark event, the 2010 Peanut Genome Project Inaugural Meeting, also held in Atlanta, where it was decided to sequence the entire peanut genome. After the peanut genome has been completely sequenced, ESTs or the transcriptome will continue to play an important role in filling knowledge gaps, identifying particular genes, and exploring gene function.

EST sequence-based gene cloning

An important goal of EST sequencing is to clone complete gene sequences of agronomic value. Fatty acid synthesis is one of the most important traits of peanut as an oilseed crop. The first enzyme complex in plant fatty acid synthesis is acetyl-CoA carboxylase (ACCase), which is composed of four subunits: BCCP, BC, α-CT, and β-CT. Using peanut seed EST sequence information combined with homology-based cloning, the complete ORFs of these four subunits were cloned from cultivated peanut.

The multifunctional ACCase, a single polypeptide containing three functional domains, was also identified in peanut. Plant fatty acid biosynthesis is catalyzed by type II fatty acid synthase (FAS) in plastids and mitochondria. The type II FAS complex contains several enzymes and an important protein, acyl carrier protein (ACP). Genes encoding ACP, malonyl-CoA:ACP transacylase, β-ketoacyl-ACP synthase, β-ketoacyl-ACP reductase, β-hydroxyacyl-ACP dehydratase, and enoyl-ACP reductase were isolated using a similar strategy. One to five gene members encoding each enzyme were identified.

Five different types of ACP genes showing little sequence similarity were cloned. Oleosin is an important component of plant oil bodies. The expression level and accumulation of a specific oleosin can influence oil body morphology and seed oil content. A total of 284 ESTs showed high sequence homology with oleosins from other plant species. These ESTs fall into 6 contigs representing 6 subfamilies of peanut oleosins. The complete ORFs of these oleosin genes were also cloned (X. Wang, unpublished data).

Gene expression study using EST resources

Microarray analysis is a powerful tool for global gene expression profiling. The first peanut microarray study was performed by Luo et al. to investigate genes differentially expressed in peanut in response to Aspergillus parasiticus infection and drought stress. This microarray was a spotted array of cDNA clones built from ESTs from two cDNA libraries. Later, an oligonucleotide microarray incorporating more EST sequence information was designed and used for gene expression profiling in peanut.

An oligonucleotide microarray containing 15,744 unique probes was created from 49,205 peanut ESTs. A total of 36,766 probes were designed using Agilent Technologies’ server-based array platform, an in situ synthesized microarray platform. A full description of the array is available in the NCBI GEO (Gene Expression Omnibus) database under accession GPL6661. Recently, Guo et al. reported the use of a long-oligonucleotide microarray in gene expression profiling experiments to identify candidate genes for resistance to Aspergillus infection, based on their upregulation in response to fungal infection. The description of this array platform can be found in the NCBI GEO database (accession GPL13178).

Development of EST-SSR markers

Guo’s laboratory and collaborators developed a large number of EST-SSR (simple sequence repeat) markers from peanut EST sequences. About 24,000 ESTs were analyzed for SSR discovery. A total of 881 EST-SSRs were identified, and 251 of them could be successfully amplified from peanut. Most of these SSRs were polymorphic in wild peanuts; however, only a small number were polymorphic in cultivated peanuts.

In addition, 740 SSRs were discovered from 20,160 new EST sequences derived from cultivated peanut pods, which have not yet been deposited in the database. Amplification and polymorphism of some of these SSRs were tested in both cultivated and wild peanuts. Using ESTs from an immature peanut seed cDNA library, Song et al. identified 841 EST-SSRs. A subset of these SSRs was used to analyze polymorphism between cultivated and wild peanuts.

From 63,207 ESTs derived from 5 cDNA libraries constructed from the bacterial wilt-resistant peanut line “06-4104”, 2,643 EST-SSR loci were identified. Qing et al. reported the largest collection of SSRs, a total of 4,576 SSR markers from three sources (published SSR markers, newly developed SSR markers from ESTs, and bacterial artificial chromosome (BAC) end sequences), for genetic linkage map construction and QTL analysis of resistance to TSWV.
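
The SSR discovery step summarized in this section amounts to scanning each EST for short motifs repeated in tandem. The following is a minimal sketch of that idea, not the pipeline used in any of the cited studies: the function name `find_ssrs` and the minimum repeat thresholds are assumptions chosen for illustration, and only perfect (uninterrupted) repeats of 2 to 6 bp motifs are reported.

```python
import re

# Minimum number of tandem copies required per motif length.
# These thresholds are illustrative assumptions, not the criteria
# applied in the studies cited in the text.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{n}) captures one motif; \1{k,} demands at least k further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "AGAG" or "AAA"); the shorter motif covers them.
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    # Synthetic example sequence: (AG)8 followed by (CTT)5.
    est = "GGC" + "AG" * 8 + "TTCA" + "CTT" * 5 + "G"
    for start, motif, count in find_ssrs(est):
        print(start, motif, count)
```

On the synthetic sequence above, the sketch reports an (AG)8 repeat at position 3 and a (CTT)5 repeat at position 23 (0-based). In practice, the flanking sequence around each hit would then be used for primer design, and amplification and polymorphism would be tested experimentally, as described above.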

Transferability and comparative genomics

Comparative genetics has revealed that gene content and order are highly conserved among closely related species. Sequence data from several crop plants indicate homology between the genomes of closely related genera and species. EST-SSRs transfer effectively between genera and species in many crops. EST-SSR markers have a higher transfer rate than SSR markers derived from genomic sequences because transcribed regions are conserved between related species.

Mace et al. obtained 51 SSR markers from the Leguminosae family using an in silico approach and tested them on 27 diverse Arachis accessions, revealing 18 polymorphisms. Varshney et al. constructed the first SSR-based map of cultivated peanut using a subset of SSR markers from the AA diploid genome map, and legume anchor markers were developed and used to compare maps from Arachis, Lotus, and Medicago.

Moretzsohn et al. constructed a linkage map of the B genome using microsatellite markers developed for other Arachis species, which showed high transferability (81.7%). This B-genome map was compared with the A-genome map using 51 common markers. Fonseka et al. reported that synteny analysis between the A and B genomes revealed good overall collinearity of the homeologous linkage groups (LGs).