Analysis of Genes with Alternatively Spliced Transcripts in the Leaf, Root, Panicle and Seed of Rice Using a Long Oligomer Microarray and RNA-Seq
Songhwa Chae, Joung Sug Kim, Kyong Mi Jun, Sang-Bok Lee, Myung Soon Kim, Baek Hie Nahm, and Yeon-Ki Kim
Abstract
Pre-mRNA splicing further increases protein diversity acquired through evolution. The underlying driving forces for this phenomenon are unknown, especially in terms of gene expression. A rice alternatively spliced transcript detection microarray (ASDM) and RNA sequencing (RNA-Seq) were applied to differentiate the transcriptome of 4 representative organs of Oryza sativa L. cv. Ilmi: leaves, roots, 1-cm-stage panicles and young seeds at 21 days after pollination. Comparison of data obtained by microarray and RNA-Seq showed a bell-shaped distribution and a co-lineation for highly expressed genes. Transcripts were classified according to the degree of organ enrichment using a coefficient value (CV, the ratio of the standard deviation to the mean values): highly variable (CVI), variable (CVII), and constitutive (CVIII) groups. A higher index of the portion of loci with alternatively spliced transcripts in a group (IAST) value was observed for the constitutive group. Genes of the highly variable group showed the characteristics of the examined organs, and alternatively spliced transcripts tended to exhibit the same organ specificity or less organ preferences, with avoidance of ‘organ distinctness’. In addition, within a locus, a tendency of higher expression was found for transcripts with a longer coding sequence (CDS), and a spliced intron was the most commonly found type of alternative splicing for an extended CDS. Thus, pre-mRNA splicing might have evolved to retain maximum functionality in terms of organ preference and multiplicity.
INTRODUCTION
Eukaryotic organisms have evolved ingenious mechanisms to increase protein diversity. At the genome level, diversity is amplified through various mechanisms of gene duplication, such as whole-genome (WG) duplication and segment and tandem duplications, during evolution (Freeling, 2009). At the protein level, alternative RNA splicing provides another layer of diversity, whereby transcripts of many multi-exonic genes are processed with the inclusion or exclusion of particular exons in the final mRNA molecule (Berget et al., 1977; Chow et al., 1977). It has been reported that ~95% of multi-exonic genes are alternatively spliced in humans; in contrast, the number is ~33% in rice (Pan et al., 2008; Zhang et al., 2010). Pre-mRNA splicing is regulated by a system of trans-acting RNA-binding proteins that bind to cis-acting sites on the primary transcript, followed by recruitment of spliceosomal proteins (Ser/Arg-rich proteins, SRs) to the target site. By recruiting small nuclear ribonucleoproteins (snRNPs) that form the spliceosome, SRs are well known for their role in alternative splicing (Glisovic et al., 2008; Lunde et al., 2007). Although many RNA isoforms can be generated from a single locus, the majority of loci generate only two or three isoforms in abundance.
Splicing isoforms consist of several groups corresponding to patterns of exon truncation or extension, intron retention or inclusion/exclusion of entire exons in a mature transcript (Cambell et al., 2006; Kriventseva et al., 2003). Such splice modes have been classified by several research groups. For example, Lewis et al. (2003) proposed classifying splice modes based on splice site usage and effects on the coding sequence. These authors reported that splice sites can be introduced or lost according to the mode of splice type, such as an intra-exon splice, an imperfect exon skip, alternate sites, exon inclusion, and a perfect exon skip; changes in coding region were also considered. Campbell et al. (2006) classified alternative splice forms into 9 types, including alternate acceptor/donor sites and retained or skipped exons in rice. Several web tools are also available for detecting alternative splicing (Huang et al., 2003; Koscielny et al., 2009; Zhu et al., 2014). Expression Atlas (http://www.ebi.ac.uk/gxa) is a value-added database that provides information about gene, protein and splice variant expression in different cell types, organs, and developmental stages (Petryszak et al., 2014). Indeed, splicing events are often differentially regulated across tissues and during development, suggesting that individual isoforms may have specific spatial or temporal roles. Such analyses of alternative splice forms might provide clues about which genes are under alternative splicing pressure during development or caused by various biotic or abiotic stresses.
Rice (Oryza sativa) is an important crop that supplies food more than 3 billion people. Based on the entire genome sequence and map-based sequence, the genome size of rice is approximately 389 Mb. In addition, the recent International Rice Genome Sequencing Project (IRGSP) 1.0 version annotated approximately 38,000 protein-coding sequences (Price et al., 2002; Project, International Rice Genome Sequencing 2005; Salunkhe et al., 2011), among which 2,478 non-redundant sets of rice transcription factors (TFs) (Gao et al., 2006; Perez-Rodriguez et al., 2010; Priya and Jain, 2013) involving major families such as AP2-EREBP (165), bHLH (160), NAC (144), and MADS (90) were found. As the genome size of rice is relatively small, it has been used as a model monocot plant to provide information at the genome scale for numerous biology fields, including organogenesis, development, and stress responses. The attributes associated with the small genome of rice have allowed researchers in these fields to perform proteomic analysis (Choudhary et al., 2009; Degenkolbe et al., 2013; Rabello et al., 2008), transcriptomic analysis using microarrays (Degenkolbe et al., 2009; 2013; Rabbani et al., 2003; Rensink and Buell, 2005), and genetic analysis using next-generation sequencing (NGS) (Huang et al., 2012). Moreover, to identify genes related to agronomic characters, various cultivars have recently been sequenced using NGS followed by mapping to reference sequences (Arai-Kichise et al., 2014; Huang et al., 2010; Yamamoto et al., 2010).
Over the last two decades, DNA microarray technology has been extensively applied in gene expression studies in various areas of biology, such as transcriptome changes during shoot, root, and callus development (Che et al., 2006; Jiao et al., 2009; Ma et al., 2005; Su et al., 2007; Wang et al., 2010). Microarrays have also been used to determine gene expression profiles of whole plants or various organs during stress or physiological changes (Kreps et al., 2002; Lenka et al., 2011; Minh-Thu et al., 2013; Rabbani et al., 2003; Sarkar et al., 2014; Seki et al., 2002; Shankar et al., 2014; Shinozaki and Yamaguchi-Shinozaki, 2000). In these analyses, several versions of rice chips have been used, such as the GeneChip (25 bp) from Affymetrix (www.affymetrix.com) and customized chips (60 bp) from Agilent (www.agilent.com) or NimbleGen (sequencing.roche.com). A spotting technology-based NSF45K array (50–70 bp) has also been employed to distinguish alternatively spliced transcripts (Jung et al., 2008). Recently, NGS without prior annotation information was utilized to analyze organs under various physiological conditions (Kyndt et al., 2012; Wakasa et al., 2014; Yang et al., 2015a; 2015b; Zhai et al., 2013). However, due to the lack of issues associated with probe redundancy and annotation, RNA sequencing (RNA-Seq) simplifies data interpretation and is rapidly replacing microarray technology.
In this study, we designed feature probes for an alternatively spliced transcript detection microarray (ASDM) for rice. The 40,139 transcripts/36,176 loci in ASDM can be detected in the current Rice Annotation Project (RAP) database containing 44,552 transcripts/37,868 loci. ASDM was applied to differentiate transcriptomes among the leaf, root, 1-cm panicle and young seed. In a parallel analysis, the results were compared with those obtained by RNA-Seq. Both technologies revealed the significance of alternative splicing in terms of organ enrichment. The several benefits of alternatively splicing with organ bias are discussed.
MATERIALS AND METHODS
Plant materials and RNA isolation
All four organs in this study were prepared from Oryza sativa L. cv. Ilmi. Leaves and roots were harvested separately at 14 days after sowing in soil. Flowers before pollination were collected in the range of 0.5–1.5-cm panicles (P1cm). Seeds at 20–30 days after pollination (S21DAP) were collected at the internal ripening stage of the grains. Total RNA was isolated from tissues using Hybrid-R™ (GeneAll, Korea) extraction following the manufacturer’s manual. For experimental reproducibility, all samples were prepared twice from distinct plants and immediately frozen in liquid nitrogen.
Microarray design
An ASDM for rice was designed based on the genome information of IRGSP_1_0_representative_2014_06_25 (http://rapdb.lab.nig.ac.jp). Four 60-nt-long feature probes as representative features (normally ending with ‘-01’) were designed for each gene, starting 60 bp upstream of the end of the stop codon and shifting 30 bp, such that 4 probes covered 60 bp of the coding sequence (CDS) and 90 bp of the 3′ untranslated region (UTR). For loci with alternatively spliced transcripts, additional probes were designed to distinguish alternatively spliced transcripts. Unique exons (UniqExon) or regions (UniqExonU or UniqExonD) were identified by comparing genome coordinates between alternatively splice transcripts. Regions created by joining adjacent exons were designated as ‘UniqExonJ’. Genes from chloroplasts (123) and mitochondria (74) and selection markers such as gfp, gus, hyg, bar, and kan were included. In total, 175,192 probes were designed, with an average probe size of 60 nt and a Tm between 75 and 85°C. The genome version contains 37,868 loci and 44,552 transcripts, and ASDM covers 36,176 loci and 40,139 transcripts. The microarray was manufactured at Agilent (Agilent Technologies, United States, http://www.agilent.com).
Microarray analysis
Probes were labeled using Agilent Low RNA Input Linear Amplification Kit PLUS following
the manufacturer’s protocol (http://www.agilent.com). Total RNA (500 ng) was used to generate labeled cRNA by incorporating cyanine CTP.
The microarray was scanned using an Agilent DNA microarray scanner (G2505C), and raw
intensity data were extracted with the feature extraction software. The data were
processed with the limma package in R (http://www.bioconductor.org/), and background adjustment and normalization were performed with the background
Correct and normalize Between Arrays functions, respectively, in the program. plot
Densities functions were used to analyze the distribution of probe intensities for
8 microarray data. Frequency densities before and after normalization were plotted
against log2-based intensities (
Gene Ontology (GO) analysis
Biological term enrichment among CVI, CVII, CVIII-i, and CVIII-ii genes was assessed using GoMiner (Ashburner et al., 2000; Zeeberg et al., 2003), as described previously (Minh-Thu et al., 2013). To identify a tentative ortholog of a rice gene in the Arabidopsis genome, BLASTP analysis was performed for the two species; genes with a score of 70 or higher were considered tentative counterparts. The microarray contains 27,448 genes, and 21,556 of these match Arabidopsis genes with scores equal to or greater than 70; these genes were used as the total gene set in GoMiner. Among the 21,556 rice genes in this study, significant changes in biological processes, molecular functions and cellular components were found for 1,342, 406 and 308, respectively. As the relationships of GO terms are hierarchical, the top terms (66) in each clade for the 1,342 biological process terms were collected and subjected to hierarchical clustering using limma language in the Bioconductor project (http://master.bioconductor.org/). FDR values were obtained from 100 randomizations. GO terms for which the FDR was less than 0.05 in at least at one group were collected. For presenting GO terms in hierarchical clusters, unique GO terms were first selected. The FDR values ranging from less than 0.05 were scaled 0.5 to 5 such that the more enriched terms have values close to 5 and darker color codes in the heatmap function (http://master.bioconductor.org/). Other non-significant FDR values were transformed to 0 using Perl scripting language. To reduce redundancy, the top terms in each clad were selected, and hierarchical clustering was performed with the Bioconductor R package (http://master.bioconductor.org/).
RNA sequencing
mRNA was purified using the TruSeq RNA sample preparation kit (http://www.illumina.com/) following the company’s protocol. Following purification, the mRNA was fragmented, and first-strand cDNA was generated using reverse transcriptase and random primers. Second-strand cDNA synthesis was then performed using DNA polymerase I and RNase H. These cDNA fragments were subjected to an end-repair process, the addition of a single A base, and adaptor ligation. The products were purified and enriched by polymerase chain reaction (PCR) to create a cDNA library in a bridged amplification reaction that occurred on the surface of the flow cell. A flow cell containing millions of unique clusters was loaded into a HiSeq 2000 for automated cycles of extension and imaging. Each sequencing cycle was carried out in the presence of all four nucleotides, generating a series of images with each representing a single base extension at a specific cluster. Libraries were constructed for the leaf, root, P1cm, and S21DAP. Repetitions were performed for all samples, except roots. In the initial sequencing, 2.7–0 Gb was sequenced at each end. These results generated 16–43 × 106 paired-end reads.
Data processing
Programs such as TopHat, Cufflinks and HTSeq were used to process sequence reads. Differential exon usage was examined with the DEXseq package in Bioconductor. Briefly, the rice genome sequence (IRGSP-1.0_2014-06-25) in the RAP database (http://rapdb.lab.nig.ac.jp) was indexed using Bowtie 2 with an FM Index based on the Burrows-Wheeler transform (Langmead et al., 2009). Reads in FASTQ files were aligned using TopHat (v2.0.4) with default parameters (Kim et al., 2013). TopHat aligned the RNA-Seq reads to genomes using the short read aligner Bowtie 2 and then analyzed the mapping results to identify splice junctions between exons. In the program, samtools (version 1.2) was used for indexing and sorting of bam and sam files. For transcriptome assembly and isoform quantitation from the RNA-Seq reads, Cufflinks was used and tested for comparison (Trapnell et al., 2012). Prior to statistical analysis of exons using the DEXseq package, per-gene read counts were performed with htseq-count in HTSeq-0.6.1 using the annotated gtf file (Anders et al., 2015). To extract exons among transcripts, non-overlapping exonic regions were defined with dexseq_prepare_annotation.py, and then the per-exon read counts were extracted with the script dexseq_count.py in the DEXSeq package (Anders et al., 2012). For each exon of each gene, the data contain the number of reads in each sample that overlap with the exon. If an exon’s boundary is not the same in all transcripts, the exon is separated into two or more parts, referred to as “counting bin”. A read that overlaps with several counting bins of the same gene is counted for each of these. The program returned one file for each biological replicate with the exon counts. Finally, differential exon usage was assessed with the DEXseq package in Bioconductor (http://bioconductor.org). In the package, generalized linear models (GLMs) for model read counts and parameters are calculated from a negative binomial (NB) distribution of these count data (McCullagh and Nelder, 1989).
Reverse transcription PCR (RT-PCR) and real-time PCR
For RT-PCR, 5 μg of total RNA was reverse transcribed using oligo (dT) and a RevertAid
H Minus First Strand cDNA Synthesis Kit (Thermo Scientific™, USA). PCR was performed
with one-hundredth of the first-strand cDNA mixture and gene-specific primers. The
PCR conditions used were as follows: 10 min at 94°C and 30 cycles of 30 s at 94°C,
30 s at 55°C, and 30 s at 72°C. The 20 μl contained 10 μl Solg™ 2X Real-Time PCR Smart
mix [Solgent, Korea], 2 μl cDNA, 1 μl primers, and 7 μl water, and real-time PCR was
performed using the CFX96 touch real-time PCR detection system (Bio-Rad, USA) with
three technical replicates. Expression was assessed by evaluating threshold cycle
(CT) values. The relative expression level of tested genes was normalized to the tubulin gene and calculated using the 2−ΔΔCT method (Livak et al., 2001). The sequences of all primers used in this study can be found in
RESULTS
Target probes were designed based on unique sequences among alternatively spliced transcripts
It has been reported that the 3′ regions of genes in plants are more specific than the 5′ regions (Eveland et al., 2008). To reflect these features, four 60-nt-long feature probes were designed for each gene, starting 60 bp upstream of the end of the stop codon and shifting 30 bp, such that the four probes covered 60 bp of the CDS and 90 bp of the 3′ UTR (arrows collectively indicate R for locus Os08g0482700 in Fig. 1A). To distinguish alternatively spliced transcripts, additional probes were designed based on unique sequence variations among alternatively spliced transcripts. For example, two transcripts, Os08t0482700-01 (encoding cupredoxin domain-containing protein) and Os08t0482700-02 (encoding conserved hypothetical protein), are alternatively spliced from mRNA produced by the Os08g0482700 locus. In addition, probes were designed to represent the unique features of each transcript. The Os08t0482700-01 transcript is composed of two exons and a 546-bp CDS contained within these exons. In contrast, the Os08t0482700-02 transcript is composed of one exon and has a 228-bp CDS located within this exon. The Os08t0482700-01 transcript has unique junction sites that join the spliced transcripts, and a 60-bp feature probe (UniqExonJ) was designed based on two 30-bp-long regions in the CDS for each exon. Thus, the probe represents a part of the unique exon (Os08t0482700-01_UE) of Os08t0482700-01 and is used to represent Os08t 0482700-01 alone. When representative probes cannot distinguish between alternatively spliced transcripts, ‘-01_UE’ can be used for expression of the ‘-01’ transcript. The Os08 t0482700-02 transcript was compared to Os08t0482700-01; a portion of the 5′ UTR, the CDS and a portion of the 3′ UTR are unique, and these regions were designated as UniqExonU, UniqExon, and UniqExonD, respectively. Three 60-bp target probes were designed according to these regions to represent Os08t0482700-02.
A total of 37,868 genes are representatively annotated in the IRGSP and RAP databases (http://rapdb.lab.nig.ac.jp), and 5,254 loci encode 11,938 alternatively spliced transcripts (Table 1). Among these alternatively spliced transcripts, 7,502 transcripts were designed with the unique exon features indicated above. Ultimately, 175,192 feature probes representing 40,139 of the 44,552 transcripts were designed and designated ASDM.
Confirmation of organ preferential genes with alternative splicing transcripts
We applied the rice ASDM to detect alternatively spliced genes among the leaf, root,
panicle at 0.5–1.5 cm (P1cm) and seed at 20–30 days after pollination (S21DAP). The
175,193 feature probes for 41,953 transcripts were normalized as described in the
Methods section (
To confirm the detection of alternatively spliced transcripts by ASDM, several genes were tested. As shown in Fig. 1B (ASDM), among the four organs, the strongest intensity for Os08t0482700-01 were found in the root. The Os08t04827 00-01_UE transcript represented by the unique junction Os08t0482700-01 confirmed this result. The intensities of Os08t0482700-01 and Os08t0482700-01_UE showed a similar pattern among the organs, being highly expressed in the root, followed by the leaf, S21DAP and P1cm. Although Os08t0482700-02 exhibited lower expression compared to Os08t0482700-01, similar to Os08t0482700-01 and Os08t0 482700-01_UE, it also was highly expressed in the root. Semi-quantitative PCR confirmed these patterns of expression (Fig. 1C). Primers were designed to represent the unique regions of each transcript, and as expected, these primers amplified segments presenting a tendency similar to the microarray data shown in Fig. 1B.
We also examined Os04g0612500, Os10g0471100 and Os07g0673400 in the leaf, P1cm and S21DAP samples, respectively (
The microarray data showed that transcript Os10t04711 00-01 (encoding a protein with high similarity to proteins involved in cuticular wax biosynthesis)
is specifically expressed in the P1cm stage (
The microarray probe intensities showed the highest expression of Os07g0673400 in S21DAP, and the result for Os07t0673400-01 (encoding Rossmann-like alpha/beta/alpha sandwich fold domain) confirmed this result
(
Genome-wide comparison of rice ASDM and RNA-Seq suggest co-linearity between the two technologies
In a parallel analysis, RNA-Seq was performed using the Illumina HiSeq 2000 platform,
as indicated in the Methods section. Approximately 17–43 million reads were obtained
from 30–80 Gb raw reads (
We tested genome-wide co-linearity between the microarray intensities and RNA-Seq
counts. The log2-based average intensities of the leaf microarray data and the log2-based
counts for the organs were compared (Fig. 2). Pearson’s correlation from two methods ranged 0.30–0.36 for the leaf, root, P1cm
and S21DAP, suggesting correlation between the data. We also downloaded publicly available
RNA-Seq data from TENOR (http://tenor.dna.affrc.go.jp), and data for the leaf and root at 10 days after germination were compared to those
in ASDM. Pearson correlation coefficients ranged 0.25–0.27. The p-values of the F-statistics
were also less than 2.2e-16 and significant (
A list of exons of the transcripts was generated from a GTF file in IRGSP-1.0 subversion
2016-03-09 with a python script, as described in the Methods section (Anders et al., 2012). The overlapping exon regions are marked as “counting bins” by the program, and
the read counts are distinguished by the statistical model in the program. The bins
of transcripts such as Os08t0482700-01 and Os08t0482700-02 for the root (Fig. 1B, RNA-Seq), Os04t0612500-1 and Os04t0612500-2 for the leaf Os10t0471100-1 and Os10t0471100-2 for P1cm and Os07t0673400-1 and Os07t0673400-2 for S21DAP (
Application of rice ASDM and RNA-Seq technologies to distinguish between genes that are enriched in organs and those that are constitutively expressed across organs
Among 45,128 transcripts, the number with normalized counts below 1 ranged from 12,000
to 15,000, and the median intensity of ASDM for these transcripts was 100.5. Based
on the microarray data, the transcripts (32,718) with intensity greater than 300 (3*
the median intensity of the transcripts, where RNA-Seq counts were greater than 1)
in at least one organ were chosen for further analysis. The average values are presented
in
Based on these CV values in ASDM, genes were tentatively categorized into 3 groups,
as shown in Fig. 3A: CVI (higher than 1.1), CVII (higher than 0.7), and CVIII (below 0.7). We classified
these groups as an organ-preferential group (CVI, 5,410), an organ-enriched group
(CVII, 6,547), and a nonspecific group (CVIII, 20,761). CVIII was sub-divided as -i
(highest), -ii (medium) and -iii (lowest) according to the degree of expression according
to ASDM. As the CV values of RNA-Seq were comparable to those of ASDM, the CV values
from RNA-Seq are also categorized as those from ASDM in Fig. 3B and
When the intensities and counts were set ≥ 300 and ≥ 1 for ASDM and RNA-Seq, respectively, the effective number of expressed genes was approximately 37,000 for both technologies. Genes commonly detected using these technologies were searched according to CV group (Fig. 3C): approximately 40–50% of transcripts were commonly detected for each of the CVI and CVII groups, whereas approximately 60% of transcripts were detected for CVIII. For each technology, the number of transcripts commonly detected for most CV groups was greater than the number of excluded transcripts.
Hierarchical clustering of genes with a single transcript and alternatively spliced transcripts
As the genes of the CVI group for ASDM showed organ enrichment, those with a single
transcript (4,172) and alternatively spliced transcripts (1,169) in this group were
delineated by hierarchical clustering (Fig. 4). Highly expressed genes (618) in the leaf with a single transcript were enriched
for a metabolic process represented by photosynthesis (
Highly expressed genes with alternatively spliced transcripts in the leaf clustered
into 4-5 groups (
To examine biological term enrichment, GO analyses were performed for genes of the
CVI, CVII, CVIII-i, and CVIII-ii groups from the ASDM dataset (Ashburner et al., 2000; Zeeberg et al., 2003). This comparison was performed for loci of whole genes and those with single (16,999)
or alternatively spliced (5,844) transcripts (
The portion of genes with alternatively spliced transcripts is increased among genes that are constitutively expressed across organs
In the rice genome, there are 5,254 loci with more than two alternative transcripts
(http://rapdb.lab.nig.ac.jp); among these, 3,539 loci were included in the current microarray. We determined
the influence of organ specificity on the number of alternatively spliced transcripts
or the influence of the degree of expression on the genome. As shown in Table 2, the ratio of loci with alternatively spliced transcripts was 0.14 (5,254/37,869)
in the current genome annotation and 0.15 (5,244/36,141) in ASDM; these ratios were
designated as the index of loci with alternatively spliced transcripts (IAST) in the
group. IAST of expressed genes was 0.19, slightly higher than those of the genome
and ASDM. These values were 0.14, 0.22 and 0.24 for CVI, CVII, and CVIII, respectively.
The CVI group was subdivided into the leaf, root, P1cm, and S21DAP, with values of
0.26, 0.08, 0.13, and 0.09, respectively. For the CVII group, these values were 0.18–0.28
and generally higher than those in the CVI group. When CVIII was divided into -i (highest),
-ii (medium) and -iii (lowest) according to the degree of expression, IAST values
of these groups ranged from 0.31–0.34. The IAST values appeared to increase from CVI
to CVIII, suggesting that the ratio of genes with alternative splicing increased as
the genes were expressed throughout each organ. We tested this hypothesis by performing
a literature review using the web-based database CREP (Wang et al., 2010) and projected the IAST values of the 100 genes that were most constitutively expressed
in CREP. Interestingly, IAST was 0.43 (
We also assessed IAST for CV groups based on RNA-Seq analysis (Table 3). The IAST values for CVI, CVII, and CVIII were 0.09, 0.12 and 0.21, respectively, and comparable to those based by ASDM. The values among the CVI and CVII organs were relatively higher in the leaf and lower in the root, P1cm, and S21DAP samples, but the values were somewhat lower than those based on ASDM. The IAST values among the CVIII subgroups were higher than those among other groups, as shown for ASDM, and were also less than those for 100 constitutive genes (Consti100 in Table 2). These data confirm the observation in the ASDM analysis, whereby IAST of genes constitutively expressed throughout organs was higher than that of organ-enriched genes.
A transcript confers organ preference in a locus, and the alternatively spliced transcript(s) confers the same preference or reduced organ specificity
We tested organ preference among alternatively spliced transcripts of the CVI group
enriched in each organ. The number of these transcript variants were 527, 179, 120
and 124 from 367, 137, 91 and 96 loci of the leaf, root, P1cm, S21DAP, respectively
(Table 4). We tested CV groups and organ preference between a major transcript and its alternatively
spliced transcripts. For example, two transcripts, Os04t0612500-01 and Os04t0612500-02, are alternatively spliced from pre-mRNA of Os04g0612500 (
A longer CDS transcript among alternatively spliced transcripts at a locus tends to be more expressed
We tested the extent of expression among spliced transcripts within a locus by comparing
transcripts with the highest and lowest expression values. To simplify expression
variability, we compared the log2-based ratio between transcripts with the highest
and lowest expression according to the ASDM dataset. The log ratio of the highest
and lowest values is shown for all tested organs in
Campbell et al. (2006) classified types of alternative splicing into 9 groups based on the results of Program
to Assemble Spliced Alignments (PASA): alternate acceptor, alternate donor, alternate
terminal exon, retained exon and skipped exon, initiation within an intron, termination
within an intron, and spliced intron and retained intron. We categorized the type
of alternative splicing to assess splice types and variation in expression between
transcripts with the highest and lowest expression. PASA was applied first to whole
transcripts, detecting 5,028 alternative splices for 2,438 of 4,143 loci in the genome
and 3,598 alternative splices for 2,415 loci in the microarray (
It has been reported that expression varies among alternatively spliced transcripts. Nonsense-mediated mRNA decay (NMD) has been postulated as the target of RNA degradation (Lewis et al., 2003). When an alternative splice is introduced, the stop codon is classified, and the corresponding mRNA isoform is labeled an NMD candidate, resulting in lower expression compared to the isoform with a longer CDS. If the spliced transcript escapes NMD, the result is a transcript with a shorter CDS. To examine how CDS length, due to alternative splicing, affects the degree of expression, transcripts with alternative splicing are sorted by the degree of expression and the CDS lengths compared according to organ and CV group. As shown in Table 5, among alternatively spliced transcripts, longer CDS transcripts appeared to be more expressed than shorter ones for all organs and CV groups. A paired t-test showed a p-value of 0.029, suggesting that these differences are significant.
DISCUSSION
ASDM and RNA-Seq provide similar GO profiles for the transcripts, as based on CV values
CV values for constitutive expression vary according to the number of organ samples.
In this report, CV values of 0.7 and 0.8 were tentatively used for ASDM and RNA-Seq,
respectively (Fig. 3). Wang et al. (2010) and our previous study (Chae et al., 2016) reported values of 0.1 and 0.3 using 39 tissues and 7 representative organs/tissues,
respectively. For both technologies, the variation in and criteria for constitutive
expression increased as the number of samples decreased. Because values vary depending
on the approach, the medium range of CVII groups was adopted as an intermediate for
each technology. These demarcations showed that analysis of genes with alternatively
spliced transcripts can be achieved for both in a similar manner. The effective number
of expressed transcripts was approximately 33,000 for ASDM and RNA-Seq (Fig. 3C), and the genes commonly detected ranged from 40 to 70%, more than the number transcripts
exclusive for each technology in each CV group. Among the organs examined, higher
numbers for constitutively expressed genes in both datasets were found for commonly
expressed group. In particular, the results of GO analysis of the CV group between
ASDM and RNA-Seq were quite similar (
Expressed TFs with a single transcript in P1cm included MADS box TFs Os05t0423400-01 (encoding PISTILLATA-like MADS box protein), Os02t0682200-01 (OsMADS6), and Os07t0605200-01 (OsMADS18). In a previous report, prolonged expression of Os02t0682200-01 and Os07t0605200-01 to the stage after pollination was observed (Chae et al., 2016), suggesting that these genes are involved in panicle development. MADS box and AP TFs are crucial for floral morphology (Coen and Meyerowitz 1991; Pelaz et al., 2000; Yun et al., 2013). Highly expressed genes in the S21DAP sample included those coding for CCAAT-containing TFs, such as Os10t0191900-00, Os10t0397900-00, and Os01t 0102400-01, and NAC domain TFs, such as Os03t0327800-01, Os01t0393100-01 and Os01t0104500-01. NAC genes, which constitute one of the largest families of plant-specific TFs and are present in a wide range of species, have been implicated in the regulation of many developmental processes, including seed development, embryo development, and shoot apical meristem formation (Duval et al., 2002; Kim et al., 2007a; Sperotto et al., 2009). These NAC TFs are also enhanced during the seed stage according to the Hana database (http://evolver.psc.riken.jp).
Analysis of the transcriptome using ASDM might be useful for alternatively spliced transcripts that include TFs. Indeed, a TF with an alternatively spliced transcript in P1cm, Os07t 0108900-02 (encoding MADS-domain-containing protein for sexual reproduction), was highly expressed compared to Os07t0108900-01. In rice, florigen and Hd3a, associated with the 14-3-3 protein in the cytoplasm, translocate to the nucleus and regulate expression of the MADS box-containing TF gene Os07g0108900 (OsMADS15, Tsuji et al., 2013), which is a homolog of an Arabidopsis floral identity gene APETALA1. Overexpression of the gene results in precocious phenotypes with regard to early internode elongation, shoot-borne crown root development, reduced plant height and early flowering at the seedling stage (Lu et al., 2012). As both transcripts contain a MADS box, an expression vector may need to be designed to discern between them. Os02t0258300-01 (encoding zinc finger FYVE/PHD-type domain-containing protein) was also highly expressed, as were four C3H family members, such as Os03t0329200-01 and Os03t0328900-01. In the S21DAP sample, Os12t 0621100-01, Os12t0621100-02 (encoding C2C2-YABBY TF) and Os03t0799600-01 (encoding PHD-containing TF) were significantly expressed from loci with alternatively spliced transcripts. Some YABBY genes are expressed in the apical-ventral region during vascular tissue development and early embryogenesis (Itoh et al., 2016; Liu et al., 2007), suggesting that these TFs are involved in early seed development.
Constitutively expressed genes are under greater alternative splicing pressure
Pre-mRNA or alternative splicing is found in eukaryotic organisms, some of which become highly developed through various processes. Many trans- and cis-acting factors have been identified, and one basic question with regard to alternative splicing pressure for common and organ-preferential genes remains. To address this question, we examined the influence of organ preference on the number of alternatively spliced transcripts and the influence of the degree of expression on the genome. Ratios were designated as IAST. The IAST value of expressed genes was 0.19, slightly higher than those of the genome and ASDM. These values were 0.14, 0.22 and 0.24 for CVI, CVII, and CVIII, respectively (Table 2). When CVIII was divided according to the extent of expression, IAST ranged from 0.31–0.34. IAST values also were higher in groups with greater constitutive expression, culminating in 0.43 for the most constitutive 100 genes in the literature (Wang et al., 2010). These observations were also confirmed by RNA-Seq data (Table 3). Although IAST values for CVI, CVII, and CVIII were 0.09, 0.12 and 0.21, respectively, slightly lower than those based on ASDM, the tendency toward increase with constitutively expressed groups and variations among organs were similar in these independent analyses.
To date, it remains unclear why constitutively expressed genes would have a higher IAST. All exon boundaries require canonical splice sites (e.g., GT-AG, GC-AG, AT-AC), and all bases in the genome can be exposed to mutation (Burset et al., 2000; Lorkovic et al., 2005; Mount and Steize 1981). DNA damage events can occur at a high frequency in higher eukaryotes, such that a single mammalian cell can accumulate ~10,000 abasic (apurinic) lesions per day (Lindahl and Nyberg, 1972). For comparison, mutation rates in microbes with DNA-based chromosomes are close to 1/300 per genome per replication (Drake et al., 1998). A high level of transcription has been associated with elevated spontaneous mutations and recombination rates in eukaryotic organisms (Datta and Jinks-Robertson, 1995). For example, the effect of transcription on the rate of spontaneous mutation in the yeast S. cerevisiae was examined using a lys2 frameshift allele under the control of a highly inducible promoter. As the rate of spontaneous reversion was shown to increase when the mutant gene was highly transcribed, transcriptionally active DNA and enhanced spontaneous mutation rates are correlated in yeast. Several hypotheses have been proposed to explain transcription-associated mutagenesis, e.g., transcription impairs replication fork progression in a directional manner, and DNA lesions accumulate under high-transcription conditions (Kim et al., 2007). Genes constitutively expressed across organs might have higher rates of mutation than organ-preferential genes, which may undergo reduced transcription. However, a high rate transcription might not be the sole factor explaining this phenomenon. Subdivision of CVIII by intensity, i.e., CVIII-i, -ii, and -iii, did not appear to result in variability (Tables 2 and 3). Alternatively, the neutral theory of molecular evolution may offer an explanation, such that organ-preferential genes within cells of differentiated organs might be under pressure toward alternative splicing (Kimura, 1968). Specifically, the products of alternative splicing may be detrimental to organogenesis and development. Indeed, positions within introns corresponding to known functional elements involved in pre-mRNA splicing, including the branch site, splice sites, and polypyrimidine tract, show reduced levels of genetic variation in human polymorphism data (Lomelin et al., 2010).
Alternatively spliced transcripts do not show a distinct preference of organ
According to our analysis, a transcript confers organ preference and alternatively spliced transcript(s) confer(s) the same preference or show less organ specificity. However, alternatively spliced transcripts do not exhibit additional organ distinctness or preference (Table 4). In other words, among CVI genes that show strong organ preference, none of alternatively spliced forms belong to the CVI group of the other organ. These finding suggest that multiple specificity may not be allowed for any two alternatively spliced transcripts in the CVI group. However, alternative splicing may be allowed for genes that are less detrimental to organogenesis and development, as mentioned above with regard to the observation of lower IAST in CVI compared to CVIII. Nonetheless, alternatively spliced transcripts that escape an organism’s surveillance evolve additional functionality and may be more abundant in other organs, resulting in CVII or CVIII. These aspects may place more constraint on pre-mRNA splicing of CVI genes, with lower IAST, as shown in Tables 2 and 3.
Numerous alternatively spliced transcripts have been implicated in flowering (Marquardt et al., 2014; Wang et al., 2014), circadian clock (Filichkin et al., 2010; Seo et al., 2011), and abiotic stress (Shang et al., 2017). In these analyses, alternatively spliced transcripts have a similar biological function. In the example of OsDREB2B expressed under drought conditions, two splice variants are differentially expressed in response to heat and drought stresses in rice (Matsukura et al., 2010). The transcript of OsDREB2B1 is more abundant under normal growth conditions; it contains a 53-base pair exon 2 insertion in the mature mRNA that introduces a shift in open reading frame. In plants exposed to high temperature stress, OsDREB2B2, in which exon 2 is spliced out in the mature mRNA, dominates. Genome-wide research in salt stress has revealed several features of pre-mRNA splicing in plants under abiotic stress; the expression level of these stress-induced variants is quite low relative to that of the major splicing variant(s), and most stress-induced splicing variants are associated with intron retention (Cui and Xiong, 2015; Ding et al., 2014) with a premature CDS. These observations might be in line with the allowance of alternative splicing for maximum function by avoiding expression of other alternatively spliced transcript(s) under unwanted circumstances.
Pre-mRNA splicing might be involved in balancing maximization of function and additional functionality to avoid distinct functions. For alternatively spliced variants of mRNAs, there is no way to distinguish functional differences that may be at risk of a waste of resources. Differential regulation through the components of the spliceosome might be possible, which would provide additional complexity in gene regulation. In this regard, pre-mRNA splicing might be involved in unique strategies contrasting those shown in gene families that provide direct functional distinctness by using unique promoters to respond to different signals (Freeling, 2009). As an example of a gene family, two transcripts, Os10t0505700-01 and Os11t0620300-01 (Nonspecific lipid-transfer protein 2) belong to the same ortholog group, OG5_137065 (orthomcl.org). However, Os10t0505700-01 is leaf preferential, while Os11t0620300-01 is seed preferential.
Among alternatively spliced transcripts, those with a longer CDS show higher expression
Our data also suggest that reciprocal expression is complementary among transcripts
within a locus, regardless of organ specificity (Liu et al., 2011). The log2-based values of the highest and lowest expressed transcripts were 2.5–2.7
(sd = 1.9) (
There may be additional mechanisms for maintaining alternative splicing. Reciprocal or complementary expression patterns have been reported, whereby a copy among gene duplicates is expressed in a certain organ or tissue type and another copy is expressed to a different extent in other organs or tissue types (Liu et al., 2011). Approximately 30% of whole-genome duplicate pairs and 38% of tandem duplicate pairs show reciprocal expression patterns in Arabidopsis. This result suggests that a balancing pressure exists among these duplicated genes and that the balance is broken when the genes undergo sub-functionalization due to partitioning of the ancestral expression pattern or ‘new functionalization’ due to acquisition of a new expression pattern (Wegmann et al., 2008). Such expression balancing might be subject to alternative splicing pressure. Indeed, commonly expressed genes are under evolutionary pressure toward alternative splicing, whereas organ-preferential genes might be less subjected to alternative splicing. Favorable conditions for the occurrence of alternative splicing might be the result of constant expression and balancing of alternatively spliced isoforms under a developmental process. Based on our data, a large fraction of isoforms for transcripts with low expression and a longer CDS demonstrate reciprocal expression with other alternatively spliced transcripts. In a sense, reciprocal expression might be a common phenomenon for gene families and also for alternatively spliced transcripts of a gene.
In this study, rice ASDM and RNA-Seq were applied to differentiate alternatively spliced transcripts between representative rice organs, including the leaf, root, panicle at 1 cm, and young seed. Transcripts were classified according to organ enrichment, and the results are collectively explained in terms of evolutionary pressure. The portion of loci with multiple transcripts due to alternative splicing was higher among constitutively expressed genes than organ-preferential loci. The allowance of alternative splicing is maximized by avoiding expression of the other alternative splicing transcript under unwanted circumstances. Among loci, transcripts with a longer CDS tended to be more highly expressed than those with a shorter CDS. These genome-wide technologies might be useful for studying transcriptomes and their biological significance with regard to pre-mRNA splicing.
Supplementary data
Article information
Articles from Mol. Cells are provided here courtesy of Mol. Cells
References
- Anders, S., Reyes, A., and Huber, W. (2012). Detecting differential usage of exons from RNA-seq data. Genome Res. 22, 2008-2017.
- Anders, S., Pyl, P.T., and Huber, W. (2015). HTSeq-a Python framework to work with high-throughput sequencing data. Bioinformatics. 31, 166-169.
- Arai-Kichise, Y., Shiwa, Y., Ebana, K., Shibata-Hatta, M., Yoshikawa, H., Yano, M., and Wakasa, K. (2014). Genome-wide DNA polymorphisms in seven rice cultivars of temperate and tropical japonica groups. PLoS One. 9, e86312.
- Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., and Eppig, J.T. (2000). Gene Ontology, tool for the unification of biology. Nat Genet. 25, 25-29.
- Berget, S.M., Moore, C., and Sharp, P.A. (1977). Spliced segments at the 5′ terminus of adenovirus 2 late mRNA. Proc Natl Acad Sci USA. 74, 3171-3175.
- Burset, M., Seledtsov, I.A., and Solovyev, V.V. (2000). Analysis of canonical and non-canonical splice sites in mammalian genomes. Nucleic Acids Res. 28, 4364-4375.
- Campbell, M.A., Haas, B.J., Hamilton, J.P., Mount, S.M., and Buell, C.R. (2006). Comprehensive analysis of alternative splicing in rice and comparative analyses with Arabidopsis. BMC Genomics . 7, 327.
- Chae, S.H., Kim, J.S., Jun, K.M., Pahk, Y.-M., Kim, M.-J., Lee, S.-B., Park, H.-M., Lee, T.-H., Nahm, B.H., and Kim, Y.-K. (2016). Analysis of representative organ-specific genes and promoters of rice using a 3′ORF-oriented long oligomer microarray. J Plant Biol. 59, 579-593.
- Che, P., Lall, S., Nettleton, D., and Howell, S.H. (2006). Gene expression programs during shoot, root, and callus development in Arabidopsis tissue culture. Plant Physiol. 141, 620-637.
- Choudhary, C., Kumar, C., Gnad, F., Nielsen, M.L., Rehman, M., Walther, T.C., Olsen, J.V., and Mann, M. (2009). Lysine acetylation targets protein complexes and co-regulates major cellular functions. Science. 325, 834-840.
- Chow, L.T., Gelinas, R.E., Broker, T.R., and Roberts, R.J. (1977). An amazing sequence arrangement at the 5′ ends of adenovirus 2 messenger RNA. Cell. 12, 1-8.
- Coen, E.S., and Meyerowitz, E.M. (1991). The war of the whorls, genetic interactions controlling flower development. Nature. 353, 31-37.
- Cui, P., and Xiong, L. (2015). Environmental Stress and Pre-mRNA Splicing. Molecular Plant. 8, 1302-1303.
- Datta, A., and Jinks-Robertson, S. (1995). Association of increased spontaneous mutation rates with high levels of transcription in yeast. Science. 268, 1616-1619.
- Degenkolbe, T., Do, P.T., Kopka, J., Zuther, E., Hincha, D.K., and Köhl, K.I. (2013). Identification of drought tolerance markers in a diverse population of rice cultivars by expression and metabolite profiling. PLoS One. 22, e63637.
- Degenkolbe, T., Do, P.T., Zuther, E., Repsilber, D., Walther, D., Hincha, D.K., and Köhl, K.I. (2009). Expression profiling of rice cultivars differing in their tolerance to long-term drought stress. Plant Mol Biol. 69, 133-153.
- Ding, F., Cui, P., Wang, Z.Y., Zhang, S.D., Ali, S., and Xiong, L.M. (2014). Genome-wide analysis of alternative splicing of pre-mRNA under salt stress in Arabidopsis. BMC Genomics. 15, 431.
- Drake, J.W., Charlesworth, B., Charlesworth, D., and Crow, J.F. (1998). Rates of Spontaneous Mutation. Genetics. 148, 1667-1686.
- Duval, M., Hsieh, T.F., Kim, S.Y., and Thomas, T.L. (2002). Molecular characterization of AtNAM, a member of the Arabidopsis NAC domain super family. Plant Mol Biol. 50, 237-248.
- Eveland, A.L., McCarty, D.R., and Koch, K.E. (2008). Transcript Profiling by 3′-Untranslated Region Sequencing Resolves Expression of Gene Families1. Plant Physiol. 146, 32-44.
- Everitt, B. (1998). . The Cambridge Dictionary of Statistics, , ed. (Cambridge, UK:Cambridge University Press), pp. .
- Filichkin, S.A., Priest, H.D., Givan, S.A., Shen, R., Bryant, D.W., Fox, S.E., Wong, W.K., and Mockler, T.C. (2010). Genome-wide mapping of alternative splicing in Arabidopsis thaliana. Genome Res. 20, 45-58.
- Freeling, M. (2009). Bias in plant gene content following different sorts of duplication, tandem, whole-genome, segmental, or by transposition. Annu Rev Plant Biol. 60, 433-453.
- Gao, G., Zhong, Y., Guo, A., Zhu, Q., Tang, W., Zheng, W., Gu, X., Wei, L., and Luo, J. (2006). DRTF, a database of rice transcription factors. Bioinformatics. 22, 1286-1287.
- Glisovic, T., Bachorik, J.L., Yong, J., and Dreyfuss, G. (2008). RNA-binding proteins and post-transcriptional gene regulation. FEBS Letters. 582, 1977-1986.
- Gordon, S.P., Priest, H., Des Marais DLSchackwitz, W., Figueroa, M., Martin, J., Bragg, J.N., Tyler, L., Lee, C.R., and Bryant, D. (2014). Genome diversity in Brachypodium distachyon, deep sequencing of highly diverse inbred lines. Plant J. 79, 361-374.
- Huang, H.D., Horng, J.T., Lee, C.C., and Liu, B.J. (2003). ProSplicer, a database of putative alternative splicing information derived from protein, mRNA and expressed sequence tag sequence data. Genome Biol. 4, R29.
- Huang, Y., Niu, B., Gao, Y., Fu, L., and Li, W. (2010). CD-HIT Suite, a web server for clustering and comparing biological sequences. Bioinformatics. 26, 680-682.
- Huang, X., Kurata, N., Wei, X., Wang, Z., Wang, A., Zhao, Q., Zhao, Y., Liu, K., Lu, H., and Li, W. (2012). A map of rice genome variation reveals the origin of cultivated rice. Nature. 490, 497-501.
- Itoh, J., Sato, Y., Sato, Y., Hibara, K., Shimizu-Sato, S., Kobayashi, H., Takehisa, H., Sanguinet, K.A., Namiki, N., and Nagamura, Y. (2016). Genome-wide analysis of spatiotemporal gene expression patterns during early embryogenesis in rice. Development. 143, 1217-1227.
- Jiao, Y., Tausta, S.L., Gandotra, N., Sun, N., Liu, T., Clay, N.K., Ceserani, T., Chen, M., Ma, L., and Holford, M. (2009). A transcriptome atlas of rice cell types uncovers cellular, functional and developmental hierarchies. Nat Genet. 41, 258-263.
- Jung, K.-H., Bartley, L.E., Cao, P., Canlas, P.E., and Ronald, P.C. (2008). Analysis of alternatively spliced rice transcripts using microarray data. Rice . 2, 9020.
- Kim, N., Abdulovic, A.L., Gealy, R., Lippert, M.J., and Jinks-Robertson, S. (2007). Transcription-associated mutagenesis in yeast is directly proportional to the level of gene expression and influenced by the direction of DNA replication. DNA Repair (Amst). 6, 1285-1296.
- Kim, S.G., Kim, S.Y., and Park, C.M. (2007a). A membrane-associated NAC transcription factor regulates salt-responsive flowering via FLOWERING LOCUS T in Arabidopsis. Planta. 226, 647-654.
- Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R., and Salzberg, S.L. (2013). TopHat2, accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biology. 14, R36.
- Kimura, M. (1968). Evolutionary rate at the molecular level. Nature. 217, 624-626.
- Koscielny, G., Le Texier, V., Gopalakrishnan, C., Kumanduri, V., Riethoven, J., Nardone, F., Stanley, E., Fallsehr, C., Hofmann, O., and Kull, M. (2009). ASTD: the alternative splicing and transcript diversity database. Genomics. 93, 213-220.
- Kreps, J.A., Wu, Y., Chang, H.S., Zhu, T., Wang, X., and Harper, J.F. (2002). Transcriptome changes for Arabidopsis in response to salt, osmotic, and cold stress. Plant Physiol. 130, 2129-2141.
- Kriventseva, E.V., Koch, I., Apweiler, R., Vingron, M., Bork, P., Gelfand, M.S., and Sunyaev, S. (2003). Increase of functional diversity by alternative splicing. Trends in Genet. 19, 124-128.
- Kyndt, T., Denil, S., Haegeman, A., Trooskens, G., De Meyer, T., Van Criekinge, W., and Gheysen, G. (2012). Transcriptome analysis of rice mature root tissue and root tips in early development by massive parallel sequencing. J Exp Bot. 63, 2141-2157.
- Langmead, B., Trapnell, C., Pop, M., and Salzberg, S.L. (2009). Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25.
- Lenka, S.K., Katiyar, A., Chinnusamy, V., and Bansal, K.C. (2011). Comparative analysis of drought-responsive transcriptome in Indica rice genotypes with contrasting drought tolerance. Plant Biotech J. 9, 315-327.
- Lewis, B.P., Green, R.E., and Brenner, S.E. (2003). Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans. Proc Natl Acad Sci USA. 100, 189-192.
- Lindahl, T., and Nyberg, B. (1972). Rate of depurination of native deoxyribonucleic acid. Biochemistry. 11, 3610-3618.
- Liu, H.L., Xu, Y.Y., Xu, Z.H., and Chong, K. (2007). A rice YABBY gene, OsYABBY4, preferentially expresses in developing vascular tissue. Dev Genes Evol. 217, 629-637.
- Liu, S.-L., Baute, G.J., and Adams, K.L. (2011). Organ and cell type–specific complementary expression patterns and regulatory neofunctionalization between duplicated genes in Arabidopsis thaliana. Genome Biol Evol. 3, 1419-1436.
- Livak, K.J., and Schmittgen, T.D. (2001). Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) method. Methods. 25, 402-408.
- Lomelin, D., Jorgenson, E., and Risch, N. (2010). Human genetic variation recognizes functional elements in noncoding sequence. Genome Res. 20, 311-319.
- Lorkovic, Z.J., Lehner, R., Forstner, C., and Barta, A. (2005). Evolutionary conservation of minor U12-type spliceosome between plants and animals. RNA. 11, 1095-1107.
- Lu, S.-J., Wei, H., Wang, Ya., Wang, H.-M., Yang, R.-F., Zhang, X.-B., and Tu, J.-M. (2012). Overexpression of a transcription factor OsMADS15 modifies plant architecture and flowering time in rice (Oryza sativa L.). Plant Mol Biol Rep. 30, 1461-1469.
- Lunde, B.M., Moore, C., and Varani, G. (2007). RNA-binding proteins, modular design for efficient function. Nat Rev Mol Cell Biol. 8, 479-490.
- Ma, L., Chen, C., Liu, X., Jiao, Y., Su, N., Li, L., Wang, X., Cao, M., Sun, N., and Zhang, X. (2005). A microarray analysis of the rice transcriptome and its comparison to Arabidopsis. Genome Res. 15, 1274-1283.
- Marquardt, S., Raitskin, O., Wu, Z., Liu, F., Sun, Q., and Dean, C. (2014). Functional consequences of splicing of the antisense transcript COOLAIR on FLC transcription. Mol Cell. 54, 156-165.
- Matsukura, S., Mizoi, J., Yoshida, T., Todaka, D., Ito, Y., Maruyama, K., Shinozaki, K., and Yamaguchi-Shinozaki, K. (2010). Comprehensive analysis of rice DREB2-type genes that encode transcription factors involved in the expression of abiotic stress-responsive genes. Mol Genet Genom. 283, 185-196.
- McCullagh, P., and Nelder, J.A. (1989). . Generalized linear models, , ed. (Boca Raton, FL:Chapman & Hall/CRC), pp. .
- Minh-Thu, P.T., Hwang, D.J., Jeon, J.S., Nahm, B.H., and Kim, Y.K. (2013). Transcriptome analysis of leaf and root of rice seedling to acute dehydration. Rice . 6, 38.
- Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L., and Wold, B. (2008). Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Meth. 5, 621-628.
- Mount, S.M., and Steize, J.A. (1981). Sequence of U1 RNA from Drosophila melanogaster, implications for U1 secondary structure and possible involvement in splicing. Nucleic Acids Res. 9, 6351-6383.
- Nookaew, I., Papini, M., Pornputtapong, N., Scalcinati, G., Fagerberg, L., Uhlén, M., and Nielsen, J. (2012). A comprehensive comparison of RNA-Seq-based transcriptome analysis from reads to differential gene expression and cross-comparison with microarrays, a case study in Saccharomyces cerevisiae. Nucleic Acids Res. 40, 10084-10097.
- Pan, Q., Shai, O., Lee, L.J., Frey, B.J., and Blencowe, B.J. (2008). Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet. 40, 1413-1415.
- Pelaz, S., Ditta, G.S., Baumann, E., Wisman, E., and Yanofsky, M.F. (2000). B and C floral organ identity functions require SEPALLATA MADS-box genes. Nature. 405, 200-203.
- Perez-Rodriguez, P., Riano-Pachon, D.M., Correa, L.G., Rensing, S.A., Kersten, B., and Mueller-Roeber, B. (2010). PlnTFDB, updated content and new features of the plant transcription factor database. Nucleic Acids Res. 38, D822-7.
- Petryszak, R., Burdett, T., Fiorelli, B., Fonseca, N.A., Gonzalez-Porta, M., Hastings, E., Huber, W., Jupp, S., Keays, M., and Kryvych, N. (2014). Expression Atlas update--a database of gene and transcript expression from microarray- and sequencing-based functional genomics experiments. Nucleic Acids Res. 42, D926-32.
- Price, A.H., Cairns, J.E., Horton, P., Jones, H.G., and Griffiths, H. (2002). Linking drought-resistance mechanisms to drought avoidance in upland rice using a QTL approach, progress and new opportunities to integrate stomatal and mesophyll responses. J Exp Bot. 53, 989-1004.
- Priya, P., and Jain, M. (Array). RiceSRTFDB: a database of rice transcription factors containing comprehensive expression, cis-regulatory element and mutant information to facilitate gene function analysis. Database (Oxford). , .
- , (2005). The map-based sequence of the rice genome. Nature. 436, 793-800.
- Rabbani, M.A., Maruyama, K., Abe, H., Khan, M.A., Katsura, K., Ito, Y., Yoshiwara, K., Seki, M., Shinozaki, K., and Yamaguchi-Shinozaki, K. (2003). Monitoring expression profiles of rice genes under cold, drought, and high-salinity stresses and abscisic acid application using cDNA microarray and RNA gel-blot analyses. Plant Physiol. 133, 1755-1767.
- Rabello, A.R., Guimaraes, C.M., Rangel, P.H., da Silva, F.R., Seixas, D., de Souza, E., Brasileiro, A.C., Spehar, C.R., Ferreira, M.E., and Mehta, A. (2008). Identification of drought-responsive genes in roots of upland rice (Oryza sativa L). BMC Genomics . 9, 485.
- Rensink, W.A., and Buell, C.R. (2005). Microarray expression profiling resources for plant genomics. Trends Plant Sci. 10, 603-609.
- Ritchie, M.E., Phipson, B., Wu, D., Hu, Y., Law, C.W., Shi, W., and Smyth, G.K. (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47.
- Salunkhe, A.S., Poornima, R., Prince, K.S.J., Kanagaraj, P., Sheeba, J.A., Amudha, K., Suji, K., Senthil, A., and Babu, R.C. (2011). Fine mapping QTL for drought resistance traits in rice (Oryza sativa L.) using bulk segregant analysis. Mol Biotechnol. 49, 90-95.
- Sarkar, N.K., Kim, Y., and Grover, A. (2014). Coexpression network analysis associated with call of rice seedlings for encountering heat stress. Plant Mol Biol . 84, 125-143.
- Seki, M., Narusaka, M., Ishida, J., Nanjo, T., Fujita, M., Oono, Y., Kamiya, A., Nakajima, M., Enju, A., and Sakurai, T. (2002). Monitoring the expression profiles of 7000 Arabidopsis genes under drought, cold and high-salinity stresses using a full-length cDNA microarray. Plant J. 31, 279-292.
- Seo PJPark MJLim MHKim SGLee MBaldwin, I.T., and Park, C.M. (2011). A self-regulatory circuit of CIRCADIANCLOCK-ASSOCIATED1 underlies the circadian clock regulation of temperature responses in Arabidopsis. Plant Cell 2011. 24, 2427-2442.
- Severin, A.J., Woody, J.L., Bolon, Y.T., JosephB Diers, B.W., Farmer, A.D., Muehlbauer, G.J., Nelson, R.T., Grant, D., and Specht, J.E. (2010). RNA-Seq Atlas of Glycine max, a guide to the soybean transcriptome. BMC Plant Biol. 10, 160.
- Shang, X., Cao, Y., and Ma, L. (2017). Alternative splicing in plant genes: a means of regulating the environmental fitness of plants. Int J Mol Sci. 18, 432.
- Shankar, A., Srivastava, A.K., Yadav, A.K., Sharma, M., Pandey, A., Raut, V.V., Das, M.K., Suprasanna, P., and Pandey, G.K. (2014). Whole genome transcriptome analysis of rice seedling reveals alterations in Ca 2 ion signaling and homeostasis in response to Ca 2 deficiency. Cell Calcium. 55, 155-165.
- Shinozaki, K., and Yamaguchi-Shinozaki, K. (2000). Molecular responses to dehydration and low temperature, differences and cross-talk between two stress signaling pathways. Curr Opin Plant Biol. 3, 217-223.
- Sperotto, R.A., Ricachenevsky, F.K., Duarte, G.L., Boff, T., Lopes, K.L., Sperb, E.R., Grusak, M.A., and Fett, J.P. (2009). Identification of upregulated genes in flag leaves during rice grain filling and characterization of OsNAC5, a new ABA-dependent transcription factor. Planta. 230, 985-1002.
- Su, N., He, K., Jiao, Y., Chen, C., Zhou, J., Li, L., Bai, S., Li, X., and Deng, X.W. (2007). Distinct reorganization of the genome transcription associates with organogenesis of somatic embryo, shoots, and roots in rice. Plant Mol Biol. 63, 337-349.
- Trapnell, C., Roberts, A., Goff, L., Pertea, G., Kim, D., Kelley, D.R., Pimentel, H., Salzberg, S.L., Rinn, J.L., and Pachter, L. (2012). Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protocols. 7, 562-578.
- Tsuji, H., Taoka, K., and Shimamoto, K. (2013). Florigen in rice, complex gene network for florigen transcription, florigen activation complex, and multiple functions. Curr Opin Plant Biol. 16, 228-235.
- Wakasa, Y., Oono, Y., Yazawa, T., Hayashi, S., Ozawa, K., Handa, H., Matsumoto, T., and Takaiwa, F. (2014). RNA sequencing-mediated transcriptome analysis of rice plants in endoplasmic reticulum stress conditions. BMC Plant Biol . 14, 101.
- Wang, L., Xie, W., Chen, Y., Tang, W., Yang, J., Ye, R., Liu, L., Lin, Y., Xu, C., Xiao, J., and Zhang, Q. (2010). A dynamic gene expression atlas covering the entire life cycle of rice. Plant J. 61, 752-766.
- Wang, Z., Wu, Z., Raitskin, O., Sun, Q., and Dean, C. (2014). Antisense-mediated FLC transcriptional repression requires the p-TEFb transcription elongation factor. Proc Natl Acad Sci USA. 111, 7468-7473.
- Wegmann, D., Dupanloup, I., and Excoffier, L. (2008). Width of gene expression profile drives alternative splicing. PLoS One. 3, e3587.
- Yamamoto, T., Nagasaki, H., Yonemaru, J., Ebana, K., Nakajima, M., Shibaya, T., and Yano, M. (2010). Fine definition of the pedigree haplotypes of closely related rice cultivars by means of genome-wide discovery of single-nucleotide polymorphisms. BMC Genomics . 11, 267.
- Yang, J., Chen, X., Zhu, C., Peng, X., He, X., Fu, J., Ouyang, L., Bian, J., Hu, L., and Sun, X. (2015). RNA-seq reveals differentially expressed genes of rice (Oryza sativa) spikelet in response to temperature interacting with nitrogen at meiosis stage. BMC Genomics . 16, 959.
- Yang, S.Y., Hao, D.L., Song, Z.Z., Yang, G.Z., Wang, L., and Su, Y.H. (2015). RNA-Seq analysis of differentially expressed genes in rice under varied nitrogen supplies. Gene. 555, 305-317.
- Yuan, J., Chen, D., Ren, Y., Zhang, X., and Zhao, J. (2008). Characteristic and expression analysis of a metallothionein gene, OsMT2b, down-regulated by cytokinin suggests functions in root development and seed embryo germination of rice. Plant Physiol. 146, 1637-1650.
- Yun, M.-S., Umemoto, T., and Kawagoe, Y. (2011). Rice debranching enzyme isoamylase3 facilitates starch metabolism and affects plastid morphogenesis. Plant Cell Physiol. 52, 1068-1082.
- Yun, D., Liang, W., Dreni, L., Yin, C., Zhou, Z., Kater, M.M., and Zhang, D. (2013). OsMADS16 genetically interacts with OsMADS3 and OsMADS58 in specifying floral patterning in rice. Mol Plant. 6, 743-756.
- Zeeberg, B.R., Feng, W., Wang, G., Wang, M.D., Fojo, A.T., Sunshine, M., Narasimhan, S., Kane, D.W., Reinhold, W.C., and Lababidi, S. (2003). GoMiner, a resource for biological interpretation of genomic and proteomic data. Genome Biol. 4, R28.
- Zhang, G., Guo, G., Hu, X., Zhang, Y., Li, Q., Li, R., Zhuang, R., Lu, Z., He, Z., and Fang, X. (2010). Deep RNA sequencing at single base-pair resolution reveals high complexity of the rice transcriptome. Genome Res. 20, 646-654.
- Zhu, Y., Hultin-Rosenberg, L., Forshed, J., Branca, R.M., Orre, L.M., and Lehtiö, J. (2014). SpliceVista, a tool for splice variant identification and visualization in shotgun proteomics data. Mol Cell Proteomics. 13, 1552-1562.
- Zhai, R., Feng, Y., Wang, H., Zhan, X., Shen, X., Wu, W., Zhang, Y., Chen, D., Dai, G., and Yang, Z. (2013). Transcriptome analysis of rice root heterosis by RNA-Seq. BMC Genomics . 14, 19.
- Zhao, S., Fung-Leung, W.-P., Bittner, A., Ngo, K., and Liu, X. (2014). Comparison of RNA-Seq and microarray in transcriptome profiling of activated T cells. PLoS One. 9, e78644.