Mol. Cells 2014; 37(5): 372-382
Published online May 14, 2014
https://doi.org/10.14348/molcells.2014.2296
© The Korean Society for Molecular and Cellular Biology
*Correspondence: kimkj@korea.ac.kr
In this study, the chloroplast (cp) genome sequences from three early diverged leptosporangiate ferns were completed and analyzed in order to understand the evolution of the genome of the fern lineages. The complete cp genome sequence of
Keywords chloroplast genomes,
Comparative chloroplast (cp) genomic studies provide an invaluable source of information for understanding plant evolution and phylogeny. Consequently, the cp genome is the most widely studied of the three genomes found in plant cells. Approximately 400 cp genome sequences for land plants are available from a public database, but the majority of them belong to seed plants (
Structural changes in the cp genome, such as gene rearrangements (Chumley et al., 2006; Tangphatsornruang et al., 2010; Wu et al., 2007), gene/intron losses or duplications (Guisinger et al., 2011; Hiratsuka et al., 1989; Jansen et al., 2007), and small inversions (Kim and Lee, 2004; Yi and Kim, 2012) are well known at the genus, family, or ordinal levels of seed plants. Therefore, the genome evolution and phylogenetic relationships of seed plants are relatively well understood. However, the cp genome studies in ferns are limited to just a few lineages.
One of the distinct features of cp genomes is their high adenine and thymine (AT) content (Sablok et al., 2011; Smith, 2009). However, a relatively wide range of AT content variation has been reported for a number of different plant lineages (Smith, 2009). The GC content differences in the cp genomes usually correlate well with codon usage bias. The effective number of codons (ENC) is a simple measure of synonymous codon usage bias and is independent of coding region length and amino acid composition (Wright, 1990). Therefore, comparative ENC values may show a broad spectrum of base usage patterns among major lineages of plant groups.
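The ENC statistic mentioned above can be illustrated with a minimal sketch of Wright's (1990) formula, Nc = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, where each F is the mean codon homozygosity of the amino acids in a degeneracy class. The function name `effective_number_of_codons` and the handling of sparse data (amino acids with fewer than two codons are skipped, and an error is raised if a whole class is empty) are assumptions made for illustration; this is not the software used in the study.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> amino acid ('*' marks stop codons).
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group synonymous codons by amino acid, skipping stop codons.
FAMILIES = {}
for codon, aa in CODE.items():
    if aa != "*":
        FAMILIES.setdefault(aa, []).append(codon)


def effective_number_of_codons(cds):
    """Return Wright's Nc for an in-frame coding sequence (capped at 61)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Homozygosity values collected per degeneracy class (2, 3, 4, 6).
    f_by_class = {2: [], 3: [], 4: [], 6: []}
    for codons in FAMILIES.values():
        k = len(codons)
        n = sum(counts[c] for c in codons)
        if k == 1 or n < 2:
            continue  # Met/Trp carry no information; n < 2 is undefined
        sum_p2 = sum((counts[c] / n) ** 2 for c in codons)
        f = (n * sum_p2 - 1) / (n - 1)
        if f > 0:
            f_by_class[k].append(f)

    if any(not values for values in f_by_class.values()):
        raise ValueError("sequence too short: a degeneracy class is empty")
    mean_f = {k: sum(v) / len(v) for k, v in f_by_class.items()}

    nc = 2 + 9 / mean_f[2] + 1 / mean_f[3] + 5 / mean_f[4] + 3 / mean_f[6]
    return min(nc, 61.0)
```

Under this sketch, a sequence that uses a single codon per amino acid gives the minimum value of 20, while equal use of all synonymous codons gives the maximum of 61.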
Ferns are an important plant group for understanding plant evolution because of their long evolutionary history and complicated phylogenetic relationships (Pryer et al., 2004). The extant ferns are composed of one monophyletic class and 11 monophyletic orders (Pryer et al., 2009). Since the physical map of the
To provide data for the missing lineages, we report three complete cp genome sequences from the early diverged leptosporangiate ferns in this paper. Two are from newly reported groups (Osmundales and Gleicheniales) and one is from a previously reported group (Schizaeales). Using these data, we address the following two questions about cp genome evolution of early diverged leptosporangiate ferns: (i) which of the cp genome structures are more similar to that of basal Osmundales, and (ii) whether or not Osmundales really have an intermediate-type cp genome that is between eusporangiate and leptosporangiate ferns.
Osmundales consists of a monophyletic family, three genera, and ca. 20 species (Smith et al., 2006), but it includes more than 150 fossil species (Tidwell and Ash, 1994). Many researchers consider the Osmundales to be closely related to eusporangiate ferns (Pryer et al., 2001; 2004; Schneider et al., 2004; Schuettpelz and Pryer, 2007; Wolf et al., 1995). Osmundales have also been considered intermediate taxa between eusporangiate and leptosporangiate ferns based on their external appearance and their anatomical and meristem characteristics (Cross, 1931a; 1931b; Freeberg and Gifford Jr, 1984; Gifford Jr, 1983). Using fossil records, Osmundaceae could be traced back to the Late Permian period, but the genus
Gleicheniales consists of three families, 10 genera, and ca. 140 species, and most of the species are members of Gleicheniaceae (Smith et al., 2006). Gleicheniaceae is considered as an old lineage originating from the Permian (Pryer et al., 2004; Taylor et al., 2009). We report the cp genome sequence of
Schizaeales consists of three families, four genera, and ca. 155 species (Smith et al., 2006). The oldest Schizaeaceae fossil originated from the Jurassic period (Taylor et al., 2009), and Schizaeales diverged from the core leptosporangiate ferns in the Permian (Pryer et al., 2004). The genus
In this study, the complete cp genome sequences of
Thirty-five cp genome sequences were used for the phylogenetic analysis (Table 1). We sampled all of the published complete cp genome sequences from monilophytes (14), lycophytes (4), and bryophytes (5), and eight selected species from spermatophytes. Two charophytes were included as outgroups. In addition, two unpublished monilophyte sequences were included in the taxon sampling (H.-T. Kim and K.-J. Kim, unpublished data). Eighty-nine genes, including 84 protein-coding genes and five ribosomal RNA genes, were aligned using the MUSCLE program (Edgar, 2004), and phylogenetic trees were constructed using four different tree-building methods. First, the maximum parsimony (MP) tree was generated with PAUP (Swofford, 2003) using equal character weighting, random taxon addition, and TBR branch swapping. Gaps were treated as missing data. Second, the neighbor-joining (NJ) tree was generated with Geneious 6.1.7 using the HKY genetic distance model. Third, for the maximum likelihood (ML) tree, we selected the optimal model with Modeltest 3.7 (Posada and Crandall, 1998). The ML tree was inferred under the GTR + I + G model using RAxML (Stamatakis, 2006; Stamatakis et al., 2008), run on the CIPRES Science Gateway (Miller et al., 2010). Support for all internal branches in the MP, NJ, and ML analyses was evaluated with 1,000 bootstrap replicates. Fourth, the Bayesian inference (BI) tree was reconstructed with MrBayes under the following conditions: nst = 6, rates = invgamma, Ngen = 500,000 and samplef = 100, also run on the CIPRES Science Gateway (Miller et al., 2010).
The cp genome modifications, such as gene/intron gains or losses, inversion events, and anticodon changes, were treated as binary characters. A total of 30 variable evolutionary events were recorded from the fern lineages. Next, the character states were plotted on the ML tree topology in order to deduce the evolutionary direction of these characters. The evolutionary directions were inferred under the ACCTRAN criterion in the parsimony analysis using PAUP (Swofford, 2003).
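To make the idea of scoring binary structural characters on a fixed topology concrete, the following is a minimal sketch of Fitch parsimony for a single 0/1 character. It only counts the minimum number of state changes; it does not implement the ACCTRAN rule for placing those changes on branches, and it is not the PAUP procedure used in the study. The tree, the taxon labels, and the function name `fitch_changes` are hypothetical.

```python
def fitch_changes(tree, states):
    """Fitch parsimony for one character on a rooted binary tree.

    tree   -- a leaf label (str) or a 2-tuple of subtrees
    states -- dict mapping each leaf label to its state (0 or 1)
    Returns (possible_states_at_this_node, minimum_number_of_changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0

    left, right = tree
    left_set, left_cost = fitch_changes(left, states)
    right_set, right_cost = fitch_changes(right, states)

    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical example: presence (1) or absence (0) of an inversion.
example_tree = (("taxonA", "taxonB"), ("taxonC", "taxonD"))
inversion = {"taxonA": 1, "taxonB": 1, "taxonC": 0, "taxonD": 0}
root_states, n_changes = fitch_changes(example_tree, inversion)
```

Where the root set contains both states, the reconstruction is ambiguous, and a rule such as ACCTRAN (which favors changes closer to the root) is needed to choose among equally parsimonious placements.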
The complete cp genome sequences of 194 species of land plants were used to analyze the GC contents of coding sequences in the cp genome (Supplementary Table 1). All cp genome sequences were obtained from NCBI Organelle Genome Resources. The GC contents of the entire coding sequence (GCall), first codon position (GC1), second codon position (GC2), and third codon position (GC3), and the effective number of codons (ENC) (Wright, 1990) were calculated using ACUA 1.0 (Vetrivel et al., 2007). We also analyzed dispersed repeats using REPuter (Kurtz et al., 2001). Then, each repeat sequence was sorted by similarity. These repeat sequences were reanalyzed using a DNA pattern search (
Five species of the genus
The aligned sequences of 89 cp genes from 35 taxa consisted of 94,790 bp. Among them, 31,312 sites (33.0%) were constant, 10,740 sites (11.3%) were parsimony-uninformative, and 52,738 sites (55.7%) were parsimony-informative. Figure 1 shows the ML tree topology with ML and MP bootstrap support values and Bayesian posterior probabilities. The MP, ML, NJ, and BI analyses showed largely concordant tree topologies, except at the two nodes leading to lycophytes and Equisetales. First, the lycophytes were sister to the euphyllophytes (spermatophytes + monilophytes) in the ML and BI trees (Fig. 1A). However, the lycophytes were sister to the spermatophytes in the MP and NJ trees (Fig. 1B). In addition, the ML and MP bootstrap values favor the lycophytes + spermatophytes clade. The ML values of the two topologies are not significantly different for this large data set (LM = −1,285,886
The physical maps of the cp genomes from three early diverged leptosporangiate ferns are shown in Fig. 2, and the three newly completed sequences were deposited in the NCBI database under accession Nos. KF225592–KF225594. The
A total of 130 genes were identified in the
We compared the GC contents of the cp genome coding sequences of bryophytes (8 spp.), Lycopodiopsida (4 spp.), Polypodiopsida (14 spp.), gymnosperms (26 spp.), and angiosperms (142 spp.; Supplementary Table 1). The GC content ranged from 29.5 to 39.2% in bryophytes, from 36.8 to 54.4% in Lycopodiopsida, from 33.6 to 42.4% in Polypodiopsida, from 35.1 to 40.0% in gymnosperms, and from 34.4 to 41.3% in angiosperms. We analyzed the GC content for each codon position and the ENCs using a box plot for each taxonomic group. In the GC position plot, almost all data were distributed near the regression line, and the slope of GC3 was twice as high as the slopes of GC1 and GC2 (Fig. 3A). GC3 showed wider variation than GC1 and GC2 (Fig. 3B). The median value of GC3 showed little variation among seed plants. However, GC3 varied substantially within Polypodiopsida. The GC3 value seemed to increase from eusporangiate ferns to leptosporangiate ferns. The ENCs showed a distribution pattern similar to that of the GC3 values. The ENCs of seed plants were concentrated between 45 and 50, but the ENCs of Polypodiopsida ranged from 41 to 54 (Fig. 3B).
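As an illustration of what GCall, GC1, GC2, and GC3 measure, the following minimal sketch computes them for a single in-frame coding sequence. The function name `gc_by_codon_position` and the toy input are hypothetical, and any trailing partial codon is simply ignored; the study itself used dedicated codon-usage software on concatenated coding sequences.

```python
def gc_by_codon_position(cds):
    """Return GC percentages overall and at codon positions 1, 2, and 3."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
    if usable == 0:
        raise ValueError("need at least one complete codon")

    def gc_percent(bases):
        return 100.0 * sum(b in "GC" for b in bases) / len(bases)

    result = {"GCall": gc_percent(cds[:usable])}
    for pos in range(3):
        # Every third base starting at this offset is one codon position.
        result[f"GC{pos + 1}"] = gc_percent(cds[pos:usable:3])
    return result


# Toy example with three codons: ATG GCC TAA
toy = gc_by_codon_position("ATGGCCTAA")
```

For the toy sequence, one of three bases is G or C at the first and second positions and two of three at the third, which is the kind of position-specific difference summarized in the box plots.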
The
The
RpoC1 intron loss was described in the
Monilophytes consist of four orders of eusporangiate ferns and a clade of leptosporangiate ferns (Smith et al., 2006). The phylogenetic relationships among the four eusporangiate ferns were uncertain even though most of the recent data indicate they are paraphyletic assemblages. Specifically, the phylogenetic position of Equisetales is largely different among the data sets, and this relationship remains to be resolved (Karol et al., 2010; Pryer et al., 2001). Our phylogenetic tree developed based on the most comprehensive cp genome data so far placed the Equisetales at the most basal position among the members of monilophytes, even though they did not show a large ML value difference (Fig. 1). The branch length leading to the
Several cp genome structural modifications, including gene/intron loss and inversion, have been reported for various ferns (Gao et al., 2009; 2013; Hasebe and Iwatsuki, 1990; Wolf et al., 2003; 2010). However, most of these studies focused on the core leptosporangiate ferns. The complete cp DNA sequences from three early diverged leptosporangiate ferns provide new information on the evolution of the cp genome and the phylogenetic relationships of ferns. Figure 6 shows the genome evolutionary history on the phylogenetic tree. The coding gene losses occurred mainly in eusporangiate ferns. In contrast, the tRNA gene losses and anticodon substitutions usually occurred in non-Osmundalean leptosporangiate ferns. Large inversions between the IR and LSC are characteristic of the early diverged leptosporangiate ferns.
The inversion between
The patterns of GC content by codon position and the ENCs of ferns differ from those of seed plants (Fig. 3). Early diverged ferns show lower GC contents and ENCs than the more recently originated groups. They also show a wide range of variation in GC and ENC values, whereas seed plants are similar to each other. The difference among groups may be due to sampling error, because many cp genome sequences have been reported for seed plants but only fifteen for ferns. Nevertheless, the values of GC3 and ENC are notably different between ferns and seed plants. Furthermore, the GC3 and ENC values are markedly different between the early diverged leptosporangiate ferns and the core leptosporangiate ferns. More cp genome sequences from ferns are needed to address this question properly.
Osmundales share several characteristics with eusporangiate ferns. Nevertheless, the order is normally recognized as a member of the leptosporangiate ferns based on other morphological characteristics. However, the cp genome of
Molecular characteristics are frequently used to indicate specific taxonomic groups. The large inversion between
The complete cp DNA sequences from three major lineages of basal leptosporangiate ferns provide substantial information not only on the evolution of the cp genomes but also on the phylogenetics of fern lineages. Our phylogenetic analysis, which was based on the largest number of complete cp genomes of monilophytes so far, showed the paraphyly of the eusporangiate ferns. The Equisetales was the sister group to all other members of monilophytes. The results were consistent for the majority of the other analyses. In contrast to the paraphyly of eusporangiate ferns, the leptosporangiate ferns form a monophyletic group. Within the eusporangiate ferns, the cp genome structures, gene/intron contents, and RNA editing sites of
The list of complete chloroplast genome sequences and
Target | Taxa | Group | GenBank |
---|---|---|---|
Phylogenetic analysis | | Spermatophytes | NC000932 |
 | | Spermatophytes | NC006290 |
 | | Spermatophytes | NC006050 |
 | | Spermatophytes | NC005086 |
 | | Spermatophytes | NC020319 |
 | | Spermatophytes | NC016986 |
 | | Spermatophytes | NC010654 |
 | | Spermatophytes | NC004677 |
 | | Polypodiales (core leptosporangiate ferns) | NC004766 |
 | | Polypodiales (core leptosporangiate ferns) | NC014592 |
 | | Polypodiales (core leptosporangiate ferns) | NC014348 |
 | | Cyatheales (core leptosporangiate ferns) | NC012818 |
 | | Salviniales (core leptosporangiate ferns) | KC536646 |
 | | Schizaeales (early diverged leptosporangiate ferns) | KF225593* |
 | | Gleicheniales (early diverged leptosporangiate ferns) | KF225594* |
 | | Osmundales (early diverged leptosporangiate ferns) | KF225592* |
 | | Marattiales (eusporangiate ferns) | NC008829 |
 | | Ophioglossales (eusporangiate ferns) | NC020147 |
 | | Ophioglossales (eusporangiate ferns) | NC017006 |
 | | Psilotales (eusporangiate ferns) | NC003386 |
 | | Psilotales (eusporangiate ferns) | KC117179 |
 | | Equisetales (eusporangiate ferns) | NC014699 |
 | | Equisetales (eusporangiate ferns) | JN968380 |
 | | Equisetales (eusporangiate ferns) | NC020146 |
 | | Lycophytes | NC006861 |
 | | Lycophytes | NC014675 |
 | | Lycophytes | NC013086 |
 | | Lycophytes | AB197035 |
 | | Bryophytes | NC004543 |
 | | Bryophytes | NC012052 |
 | | Bryophytes | NC005087 |
 | | Bryophytes | NC001319 |
 | | Bryophytes | NC010359 |
 | | Charophytes | NC004115 |
 | | Charophytes | NC008097 |
 | | Schizaeales (early diverged leptosporangiate ferns) | KF225595* |
 | | Schizaeales (early diverged leptosporangiate ferns) | KF225596* |
 | | Hymenophyllales (early diverged leptosporangiate ferns) | KF225597* |
 | | Schizaeales (early diverged leptosporangiate ferns) | Not sequenced |
 | | Schizaeales (early diverged leptosporangiate ferns) | Not sequenced |
 | | Schizaeales (early diverged leptosporangiate ferns) | Not sequenced |
Asterisks on the GenBank accession numbers indicate sequences newly reported in this paper.
The length of quadripartite chloroplast genome of three early diverged leptosporangiate ferns
Taxa | LSC(bp) | IR(bp) | SSC(bp) | Total(bp) |
---|---|---|---|---|
 | 85432 | 25038 | 21634 | 157142 |
 | 99857 | 14584 | 21982 | 151007 |
 | 100294 | 10109 | 22300 | 142812 |
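The totals in this table follow from the quadripartite structure, in which the inverted repeat (IR) is present in two copies: total = LSC + 2 × IR + SSC. A minimal sketch that checks the three rows is given below; the function name `quadripartite_total` is hypothetical, and the rows are listed in table order because the taxon names are not available here.

```python
def quadripartite_total(lsc, ir, ssc):
    """Total cp genome length: one LSC, two IR copies, one SSC (all in bp)."""
    return lsc + 2 * ir + ssc


# (LSC, IR, SSC, reported total) for the three rows of the table, in order.
rows = [
    (85432, 25038, 21634, 157142),
    (99857, 14584, 21982, 151007),
    (100294, 10109, 22300, 142812),
]

# True for every row whose reported total matches the computed one.
checks = [quadripartite_total(lsc, ir, ssc) == total
          for lsc, ir, ssc, total in rows]
```

All three reported totals agree with the computed values, which also shows that the IR column gives the length of a single repeat copy.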
Potential and detected RNA editing sites in chloroplast genome of ferns
Group | Taxon | No. of alternative start codons | No. of genes with internal stop codons | Maximum no. of internal stop codons in gene | No. of RNA editing sites (a) |
---|---|---|---|---|---|
 | | 25 | 18 | 4 | 349 |
 | | 26 | 22 | 4 | - |
 | | 29 | 25 | 8 | - |
 | | 22 | 30 | 10 | - |
 | | 28 | 21 | 3 | - |
 | | 21 | 17 | 2 | - |
 | | 19 | 33 | 15 | - |
 | | 5 | 0 | 0 | - |
 | | 1 | 0 | 0 | - |
 | | 0 | 0 | 0 | - |
 | | 7 | 1 | 1 | |
 | | 7 | 3 | 1 | - |
 | | 2 | 0 | 0 | - |
 | | 1 | 0 | 0 | - |
(a) The numbers of RNA editing sites were reported by Wolf et al. (2004).
Published online May 31, 2014 https://doi.org/10.14348/molcells.2014.2296
Hyoung Tae Kim, Myong Gi Chung1, and Ki-Joong Kim*
Division of Life Sciences, School of Life Sciences, Korea University, Seoul 136-701, Korea, 1Department of Biology and the Research Institute of Natural Science, Gyeongsang National University, Jinju 660-701, Korea