Mol. Cells 2020; 43(1): 86-95
Published online January 14, 2020
https://doi.org/10.14348/molcells.2019.0190
© The Korean Society for Molecular and Cellular Biology
Correspondence to: jsedwards@salud.unm.edu (JSE); paekwk@naver.com (WKP); jongbhak@genomics.org (JB)
This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/.
The red-crowned crane (
Keywords: genome, longevity, red-crowned crane
The red-crowned crane (
There are currently two primary red-crowned crane populations: a non-migratory population in northern Japan and a migratory continental population that ranges across southeastern Russia, northeastern China, eastern Mongolia, and Korea (IUCN, 2017). The species relies on wetlands for breeding, and the loss and pollution of these habitats have resulted in severe population declines (Wang et al., 2011). It has been classified as threatened since 2000 and is listed as ‘endangered’ on the International Union for Conservation of Nature Red List (Yu et al., 2001). With an estimated population of only 3,000 individuals in 2017 and a long generation length (12 years), it is one of the most endangered avian species (IUCN, 2017). Conservation genetic studies have primarily focused on microsatellite and mitochondrial markers, and the absence of genomic-level sequence data has precluded more comprehensive demographic modeling.
To infer the population history and evolutionary adaptations of this endangered species, we sequenced the first red-crowned crane whole genome and compared it to 18 other avian species. The Avian Phylogenomics Project (
The protocol for blood sample preparation was carried out in accordance with guidelines of Korean Association for Bird Protection under the Cultural Heritage Administration (Korea) permit. All experimental protocols were approved by the Genome Research Foundation. All methods were carried out in accordance with relevant guidelines and regulations. A blood sample was secured from a single female red-crowned crane collected upstream of Gunnam dam, Gyeonggi-do, Republic of Korea (Lat = 38°06′24.8″N and Long = 127°01′17.1″E). Genomic DNA was extracted using a QIAamp DNA Mini Kit (Qiagen, USA) following the manufacturer’s instructions. The DNA concentration was measured using the Qubit dsDNA assay kit (Invitrogen, USA) and an Infinite 200 PRO Nanoquant system (Tecan, Germany). Fragmentation of high-molecular weight genomic DNA was carried out with a Covaris S2 Ultrasonicator (Covaris, USA), generating 500 bp fragments. Whole-genome shotgun libraries were prepared using a TruSeq library sample prep kit (Illumina, USA). Aliquots were analyzed on an Agilent 2100 Bioanalyzer (Agilent Technologies, USA) to determine the library concentration and size. Sequencing was performed on an Illumina HiSeq 2000 sequencer (Illumina), using the TruSeq Paired-End Cluster Kit v3 (Illumina) and the TruSeq SBS HS Kit v3 (Illumina) for 200 cycles.
Low-quality DNA reads with Phred quality scores < 20 and/or ambiguous nucleotide (‘N’) ratios > 10% were filtered out. Clean reads were aligned to the grey-crowned crane genome sequence using BWA-aln 0.6.2 (Li and Durbin, 2009) at default settings. Polymerase chain reaction duplicates were removed using the ‘rmdup’ command of SAMtools 0.1.18 (Li et al., 2009) at default settings. Complete consensus sequences were determined with the SAMtools mpileup, Bcftools view, and SAMtools vcfutils.pl vcf2fq pipeline from the SAMtools 0.1.19 suite (Li et al., 2009). The consensus sequence included all alignment depths. Single nucleotide variant (SNV) calling was conducted based on the consensus genome sequences (McKenna et al., 2010; Van der Auwera et al., 2013). Indels were called using the GATK 3.3 UnifiedGenotyper with the -dcov 1000 option. The indels were marked by VariantFiltration in GATK 3.3 with the following criteria: (1) hard to validate, MQ0 ≥ 4 && ((MQ0 / (1.0 × DP)) > 0.1); (2) quality filter, QUAL < 10.0; (3) depth filter, DP < 5. SnpEff 3.3 was used to predict the effects of the indels. The heterozygous SNV rate was used as the nucleotide diversity value. The average avian nucleotide diversity was calculated by averaging the values of the American flamingo (0.00372), Anna’s hummingbird (0.00288), common ostrich (0.00176), downy woodpecker (0.00455), grey-crowned crane (0.00202), peregrine falcon (0.00112), white-throated tinamou (0.00560), and white-tailed tropicbird (0.00162). Coding sequences (CDSs) were filtered to exclude those that (1) contained an ambiguous nucleotide ‘N’ or (2) contained a premature stop codon. Of the 14,173 CDSs, 13,407 passed these filters and were used in subsequent analyses.
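The filtering rules and the averaging step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the function names and filter labels are hypothetical, the standard stop codons (TAA, TAG, TGA) are assumed, and a CDS is assumed to start in frame. The thresholds and the eight diversity values are taken from the text.

```python
from statistics import mean

# Assumption: standard nuclear stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def indel_filter_flags(qual, dp, mq0):
    """Return the (hypothetical) labels of the hard filters an indel call fails."""
    flags = []
    # (1) hard to validate: MQ0 >= 4 && (MQ0 / (1.0 * DP)) > 0.1
    if dp > 0 and mq0 >= 4 and (mq0 / (1.0 * dp)) > 0.1:
        flags.append("hard_to_validate")
    # (2) quality filter: QUAL < 10.0
    if qual < 10.0:
        flags.append("quality")
    # (3) depth filter: DP < 5
    if dp < 5:
        flags.append("depth")
    return flags


def cds_passes(cds):
    """True if the CDS has no ambiguous 'N' and no premature stop codon."""
    seq = cds.upper()
    if "N" in seq:
        return False
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # A stop codon anywhere before the final codon counts as premature.
    return not any(codon in STOP_CODONS for codon in codons[:-1])


# Heterozygous SNV rates listed in the text for eight avian species.
AVIAN_DIVERSITY = {
    "American flamingo": 0.00372,
    "Anna's hummingbird": 0.00288,
    "common ostrich": 0.00176,
    "downy woodpecker": 0.00455,
    "grey-crowned crane": 0.00202,
    "peregrine falcon": 0.00112,
    "white-throated tinamou": 0.00560,
    "white-tailed tropicbird": 0.00162,
}

average_avian_diversity = mean(AVIAN_DIVERSITY.values())
```

With the eight values listed, the average comes to roughly 0.0029.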
The protein sequences of the red-crowned crane were predicted using the gene model of the grey-crowned crane. We downloaded 18 high quality avian reference genomes from the GigaDB dataset (Zhang et al., 2014) (
We applied the tree topology and divergence times of the 18 avian reference genomes reported by Jarvis et al. (2015) to the positively selected gene (PSG) analysis. The divergence time between the red-crowned crane and the grey-crowned crane was taken from the TimeTree database (Hedges et al., 2015), as was the divergence time of the white-throated tinamou. The multiple sequence alignment of orthologous genes was constructed using the PRANK alignment program package (Loytynoja and Goldman, 2010), and the rates of synonymous (
Whole genome sequencing data can be used to accurately estimate population history (Li and Durbin, 2011; Osada, 2014). We inferred the demographic history of the red-crowned crane using a pairwise sequentially Markovian coalescent (PSMC) analysis with scaffolds of length ≥ 50 kb. For the grey-crowned crane, we downloaded Illumina short reads from the National Center for Biotechnology Information (Acc. PRJNA212879) and used them in the PSMC analysis. The PSMC analysis was performed with 100 bootstrapping rounds, a mutation rate of 9.59 × 10⁻⁹ substitutions per site per generation, and a generation time of 12.3 years, as previously reported (IUCN, 2017).
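To show where the mutation rate and generation time enter a PSMC analysis, the sketch below converts scaled PSMC output into years and effective population size. It assumes the conventional PSMC scaling (N0 = θ0 / (4μs), time in years = 2·N0·t·g, Ne = N0·λ) and a bin size s of 100 bp, which is the usual default but is not stated in the text. The function name and the example inputs are hypothetical.

```python
def scale_psmc(theta0, intervals, mu=9.59e-9, gen_years=12.3, bin_size=100):
    """Convert scaled PSMC intervals to (years before present, Ne) pairs.

    theta0    -- scaled mutation rate per bin reported by PSMC (hypothetical input)
    intervals -- iterable of (t_k, lambda_k) in PSMC's scaled units
    mu        -- substitutions per site per generation (value from the text)
    gen_years -- generation time in years (value from the text)
    bin_size  -- bases per PSMC bin (assumed default of 100)
    """
    n0 = theta0 / (4.0 * mu * bin_size)
    return [(2.0 * n0 * t_k * gen_years, n0 * lam_k) for t_k, lam_k in intervals]


# Hypothetical example: theta0 chosen so that N0 is 10,000.
example = scale_psmc(theta0=0.03836, intervals=[(0.0, 1.0), (0.5, 2.0)])
```

Because both axes scale with 1/μ and the time axis also scales with the generation time, a different choice of either parameter shifts the inferred curve without changing its shape.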
Genomic DNA from a red-crowned crane individual was sequenced using the Illumina HiSeq2000 platform. We produced a total of 989 million paired-end reads with a 100 bp read length and an insert size of 216 bp (Supplementary Table S2, Supplementary Fig. S1). After trimming the low-quality reads, we obtained 750 M (75.86%) reads, yielding 75 Gb and a depth of coverage of 66x. A
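The sequencing yield quoted above follows from simple arithmetic, sketched here as a back-of-envelope check. The read counts, read length, and depth are the rounded figures from the text. The implied genome size is only an inference from those figures and is not a value reported in the text.

```python
TOTAL_READS = 989e6      # paired-end reads produced (rounded, from the text)
CLEAN_READS = 750e6      # reads kept after quality trimming (rounded)
READ_LENGTH_BP = 100     # read length in bp
REPORTED_DEPTH = 66      # reported depth of coverage (x)

# Total bases in the clean reads: 750 M reads x 100 bp = 75 Gb.
clean_bases = CLEAN_READS * READ_LENGTH_BP

# Fraction of reads retained; with the rounded counts this is about 75.8%,
# close to the reported 75.86% (presumably computed from exact counts).
retained_fraction = CLEAN_READS / TOTAL_READS

# Inference only: the sequence length that a 66x depth would correspond to.
implied_genome_size_bp = clean_bases / REPORTED_DEPTH
```

Under these rounded inputs the implied genome size is about 1.14 Gb.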
To investigate the environmental adaptations of the red-crowned crane and the genetic basis of its longevity (Table 1) (Ji and DeWoody, 2017), we compared the red-crowned crane consensus sequence with 18 other avian genomes (Supplementary Table S6). Among these genomes, we identified a total of 31,795 orthologous genes, 7,455 of which were conserved across all 18 of the avian genomes (Fig. 1A). Of these, 5,878 orthologous families had one-to-one relationships among the 19 avian genomes. We used a phylogenetic tree topology constructed by Bayesian phylogenomic analysis (Zhang et al., 2014) (Fig. 1B) and estimated divergence times with the one-to-one orthologous genes. The divergence time between the red-crowned and grey-crowned cranes was estimated to be 20 Mya. The reported divergence time between the red-crowned crane and the common ostrich (used as an outgroup in this study) is roughly 111 Mya (Hedges et al., 2015).
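The notion of a one-to-one orthologous family used above can be made concrete with a small sketch: a family qualifies only if every species contributes exactly one gene. The data layout (family ID mapped to a dictionary of species and gene lists), the function name, and the toy identifiers are all hypothetical. The text does not describe the actual file format or tooling.

```python
def one_to_one_families(families, species):
    """Return IDs of families with exactly one gene in every listed species.

    families -- dict: family ID -> dict: species code -> list of gene IDs
    species  -- iterable of species codes that must all be represented
    """
    wanted = set(species)
    selected = []
    for family_id, genes_by_species in families.items():
        present = {s for s, genes in genes_by_species.items() if genes}
        if present == wanted and all(len(genes_by_species[s]) == 1 for s in wanted):
            selected.append(family_id)
    return selected


# Toy example with three species codes borrowed from the abbreviation column.
toy_families = {
    "fam1": {"GRJAP": ["g1"], "BALRE": ["b1"], "STRCA": ["s1"]},
    "fam2": {"GRJAP": ["g2", "g3"], "BALRE": ["b2"], "STRCA": ["s2"]},  # duplicated in one species
    "fam3": {"GRJAP": ["g4"], "BALRE": ["b3"]},                          # missing in one species
}
single_copy = one_to_one_families(toy_families, ["GRJAP", "BALRE", "STRCA"])
```

Restricting the selection analysis to such families avoids comparing paralogs across species, at the cost of discarding families with lineage-specific duplications or losses.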
Avian body sizes are known to be positively correlated with lifespan (Supplementary Table S1) and negatively correlated with metabolic rate (Zhang et al., 2014). To investigate the contribution of genes in these pathways to the longevity of the red-crowned crane, PSGs were identified among the 5,878 one-to-one orthologues shared by all 19 species. A total of three and 162 genes were identified as putative PSGs in the red-crowned crane by applying the branch-site model and branch model, respectively (Supplementary Tables S7 and S8). With the PSG set, a functional enrichment analysis of the red-crowned crane PSGs was conducted using the DAVID functional annotation tool (Huang et al., 2008) for Gene Ontology (GO) categories (Fig. 2A, Supplementary Table S9) and Fisher’s exact test for the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (Fig. 2B, Supplementary Table S10). Two statistically enriched GO terms included “ribonucleoprotein complex biogenesis” (GO:0022613;
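As an illustration of the pathway enrichment test mentioned above, the following sketch computes a one-sided Fisher’s exact test from the hypergeometric distribution. The function name and the pathway counts in the example are hypothetical. Using the 5,878 one-to-one orthologues as background and the 162 branch-model PSGs as the test set is an assumption about how the test would be set up, not a description of the study’s exact procedure.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test p-value for over-representation.

    k -- PSGs annotated to the pathway
    n -- total PSGs
    K -- background genes annotated to the pathway
    N -- total background genes
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 7 of 162 PSGs fall in a pathway that contains
# 100 of the 5,878 background genes.
p_value = fisher_enrichment_p(k=7, n=162, K=100, N=5878)
```

When many pathways are tested, the resulting p-values would normally be corrected for multiple testing. The choice of correction is left open here.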
Telomere length has been proposed to be a ‘molecular clock’ that underlies organismal aging (Collins, 2008; Collins and Mitchell, 2002). Interestingly, the
We further investigated red-crowned crane PSGs involved in metabolic pathways, which are associated with longevity (Speakman, 2005), by searching against the KEGG database (Kanehisa and Goto, 2000). We identified seven red-crowned crane PSGs (
The common ostrich has a lifespan comparable to that of the red-crowned crane (Wasser and Sherman, 2010), and it is possible that the longevity of these species is due in part to convergent polymorphisms affecting the rate of energy metabolism. To identify any shared adaptations for longevity-related physiological characteristics, we compared the PSGs of the red-crowned crane with those of the common ostrich. We identified 36 and 76 ostrich PSGs using the branch-site and branch models, respectively (Supplementary Tables S11 and S12). Six of the ostrich PSGs (
SSAAs that may alter protein function can provide evolutionary information about a given species. We hypothesized that genes carrying function-altering SSAAs in both the red-crowned crane and the common ostrich are likely associated with longevity, and we therefore compared common ostrich SSAAs to the other 18 avian species with an average lifespan. We identified 10,126 SSAAs for the common ostrich in 3,551 genes (Supplementary Table S13). Among them, 2,001 genes contained at least one function-altering SSAA (3,698 ostrich SSAAs in total). Interestingly, 435 genes with ostrich function-altering SSAAs were shared with the red-crowned crane, suggesting the possibility of some level of convergent evolution (Supplementary Table S14). The permutation test of these genes identified an enrichment of “cilium morphogenesis” (
A PSMC analysis was used to model the historical population size fluctuations of the red-crowned crane and the closely related grey-crowned crane (Fig. 3) (Li and Durbin, 2011). We hypothesized that the historical population sizes of these two cranes would help us understand the cranes’ adaptation to environmental conditions. Based on the genomic data of the grey-crowned crane (Balearicinae) and red-crowned crane (Gruinae), we estimated effective population size (
In conclusion, we generated and analyzed the first whole genome sequence data for the red-crowned crane. Evolutionary genomic analyses of 19 avian species, including the long-lived red-crowned crane and common ostrich, yielded PSGs and function-altering SSAA candidate genes associated with longevity. Functional annotations and enrichment analyses were conducted using the GO, KEGG pathway, and GeneAge databases. Both the PSG and the function-altering SSAA analyses identified candidate genes for longevity in the red-crowned crane, and we discussed them in light of longevity-related biological and physiological features previously identified in human and animal models. Demographic modeling of the red-crowned crane revealed low genetic diversity and a trend of population decline. Taken together, these findings further highlight the importance of continued monitoring and management of this endangered species. The SNV-based genetic differences predicted by the PSG and SSAA analyses allowed some speculative interpretations of how such variations affect certain longevity-related genes in the red-crowned crane. It should be noted, however, that the distinct phenotypic differences between the grey-crowned crane and the red-crowned crane, whose whole genome sequences were mapped to the grey-crowned crane reference, may not be fully explained by SNPs alone. SNPs are usually functionally relevant in expressed genes, whereas longevity as a biological phenomenon is a whole-species evolution and regulation problem that involves numerous transcriptional, translational, and even epigenetic regulatory mechanisms, which are often associated with the genome structure itself. Therefore, it will be necessary to produce a high-quality red-crowned crane de novo assembly in the future, using long-read sequencing methods such as PacBio SMRT and Oxford Nanopore, to compare the structural and copy number variations between the two species’ references.
Nevertheless, we hope the consensus genome presented here will be a valuable resource for aging research and future conservation genetic studies of this iconic species.
This study was supported by internal research funds of the PGI of the Genome Research Foundation and Clinomics Ltd., and by No. 10075262 of the National Center for Standard Reference Data. This research was also supported by Ulsan National Institute of Science & Technology (UNIST) internal research funds and by National Research Foundation of Korea research funds (2013M3A9A5047052 and 2017M3A9A5048999).
The authors thank the many people not listed as authors who provided analyses, data, feedback, samples, and encouragement. We especially thank Taehyung Kim and Byungchul Kim.
O.C., Y.S.C., and J.J. are employees and J.B. is one of the founders of Clinomics Ltd. All other authors have no potential conflicts of interest to disclose.
Whole-genome sequence data were deposited in the SRA database at NCBI with BioSample accession number SAMN07580860. The data can be accessed via reference number SRX3148473 or through BioProject accession number PRJNA400839. Currently, another genome project of the red-crowned crane was reported (PRJNA400839).
Reference genomes and biological traits
No. | Common name | Abbreviation | Height (cm) | Body weight (kg) | Metabolic rate (W/kg) | Lifespan in the wild (y) | Lifespan in captivity (y)
---|---|---|---|---|---|---|---
1 | American flamingoᶜ | PHORU | 120–145 | 2.8 | 15.254ᵈ | - | -
2 | Anna’s hummingbirdᵇ | CALAN | 10–11 | 0.004 | - | 8.5 | -
3 | Bald eagleᵇ | HALEU | 70–102 | 5.6 (female) | - | - | -
4 | Budgerigar (parakeet)ᵇ | MELUN | 18 | 0.03–0.04 | 9.8ᵉ | - | 21.0
5 | Common cuckooᵇ | CUCCA | 32–34 | 0.11–0.13 | 0.838ᵈ | 12.9 | -
6 | Common ostrichᵇ | STRCA | 170–280 | 63–145 | 6.305ᵈ | - | 70.0
7 | Downy woodpeckerᵇ | PICPU | 14–17 | 0.02–0.03 | 0.383ᵈ | 11.9 | -
8 | Emperor penguinᵇ | LEPDI | 110–130 | 23 | 42.871ᵈ | - | 23.4
9 | Grey-crowned craneᶜ | BALRE | 100 | 3.5 | - | - | 27.2
10 | Hoatzinᵇ | OPHHO | 65 | 0.8 | - | - | -
11 | Killdeerᵇ | CHAVO | 23–27 | 0.09 | 0.416ᵈ | 10.9 | -
12 | Little egretᵇ | EGRGA | 55–65 | 0.35–0.55 | - | 22.34 | -
13 | Peking duckᵇ | ANAPL | 50–76 | 1.6–2.3 | 4.068ᵈ | 23.4 | -
14 | Peregrine falconᵇ | FALPE | 34–58 | 0.33–1.5 | - | - | -
15 | Chickenᵃ | GALGA | 30–45 | 1–3 | 6.005ᵈ | - | 30.0
16 | Rock pigeonᵇ | COLLI | 32–37 | 0.36 | 1.714ᵈ | - | -
17 | White-tailed tropicbirdᶜ | PHALE | 71–80 | 0.33 | - | - | -
18 | White-throated tinamouᵇ | THGUT | 32–36 | - | - | - | -
19 | Red-crowned crane | GRJAP | 150–158 | 8.9 | - | 30.0 | 65.0
-, data not available.
ᵇhigh-coverage genomes,
ᶜlow-coverage genomes.
Basal metabolic rate values were obtained from the literature.
PSGs involved in pathways related to metabolism in the red-crowned crane
KEGG pathways | Branch model PSGs | |
---|---|---|
Histidine metabolism | 0.3162 | 1.5981
Oxidative phosphorylation | 0.1328 | 1.1281
Oxidative phosphorylation | 0.1965 | 2.2221
Glycosaminoglycan biosynthesis - heparan sulfate / heparin | 0.0437 | 1.1207
Metabolic pathways | 0.0304 | 1.0047
Metabolic pathways | 0.1328 | 1.1281
Metabolic pathways | 0.1965 | 2.2221
Metabolic pathways | 0.1664 | 1.7130
Nicotinate and nicotinamide metabolism | 0.0304 | 1.0047
Pantothenate and CoA biosynthesis | 0.1664 | 1.7130
Selenocompound metabolism | 0.3001 | 1.1039
PSGs shared in both red-crowned crane and common ostrich
PSGs | Statistics of PSGs prediction of red-crowned crane | Statistics of PSGs prediction of common ostrich | ||||||
---|---|---|---|---|---|---|---|---|
Branch models | Branch-site models | Branch models | Branch-site models | |||||
M0:one-ratioa | M1:free-ratiob | 2ΔLc | M0:one-ratioa | M1:free-ratiob | 2ΔLc | |||
0.4014 | 1.0295 | - | - | - | - | 10.8273 | 0.0155 | |
0.1016 | 1.5792 | - | - | 0.1016 | 1.8746 | 7.9132 | 0.0489 | |
0.1454 | 1.2380 | - | - | 0.1454 | 1.4134 | - | - | |
0.0186 | 1.4472 | - | - | 0.0186 | 1.6574 | - | - | |
0.2138 | 2.1448 | - | - | 0.2138 | 2.6535 | - | - | |
0.4847 | 2.1552 | - | - | 0.4847 | 1.3068 | - | - |
-, no statistical significance.
bM1 denotes
c2ΔL: Likelihood ratio tests (LRT) were used to detect positive selection.
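The likelihood ratio test named in the footnote can be sketched as follows. The statistic 2ΔL is twice the difference in log-likelihood between the alternative and null models, and it is compared against a chi-square distribution. One degree of freedom is assumed here, which is a common convention for nested codon models but is not stated in the text. The function names and the log-likelihood values are hypothetical.

```python
from math import erfc, sqrt


def lrt_statistic(lnl_null, lnl_alt):
    """2ΔL: twice the log-likelihood gain of the alternative over the null model."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom.

    For df = 1 the survival function has the closed form erfc(sqrt(x / 2)).
    """
    if x <= 0:
        return 1.0
    return erfc(sqrt(x / 2.0))


# Hypothetical log-likelihoods from a null and an alternative model fit.
two_delta_l = lrt_statistic(lnl_null=-1005.0, lnl_alt=-1000.0)
p_value = chi2_sf_df1(two_delta_l)
```

A different number of degrees of freedom, or a multiple-testing correction across genes, would change the resulting p-values, so values from this sketch are not expected to match the table.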
Published online January 31, 2020 https://doi.org/10.14348/molcells.2019.0190
HyeJin Lee1,8, Jungeun Kim1,8, Jessica A. Weber2,8, Oksung Chung3, Yun Sung Cho3, Sungwoong Jho1, JeHoon Jun3, Hak-Min Kim4,5, Jeongheui Lim6, Jae-Pil Choi1, Sungwon Jeon4,5, Asta Blazyte4,5, Jeremy S. Edwards7,*, Woon Kee Paek6,*, and Jong Bhak1,3,4,5,*
1Personal Genomics Institute, Genome Research Foundation, Cheongju 28160, Korea, 2Department of Genetics, Harvard Medical School, Boston, MA 02115, USA, 3Clinomics, Ulsan 44919, Korea, 4KOGIC, Ulsan National Institute of Science and Technology, Ulsan 44919, Korea, 5Department of Biomedical Engineering, School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan 44919, Korea, 6National Science Museum, Ministry of Science and ICT, Daejeon 34143, Korea, 7Chemistry and Chemical Biology, UNM Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM 87131, USA, 8These authors contributed equally to this work.