Chromosome-level genome assembly of Calamus simplicifolius
==========================================================

Zhao H; Wang S; Wang J; Chen C; Hao S; Chen L; Fei B; Han K; Li R; Shi C; Sun H; Wang S; Xu H; Yang K; Xu X; Shan X; Shi J; Feng A; Fan G; Liu X; Zhao S; Zhang C; Gao Q; Gao Z; Jiang Z (2018): GigaScience Database. http://dx.doi.org/10.5524/101052

Summary:
--------
Calamus simplicifolius is a spiny, evergreen, climbing palm, usually forming an open cluster of vigorous, unbranched stems that can reach a length of 50 metres and a diameter of about 12-15 mm. This species produces cane of medium diameter that is well suited to all types of binding and weaving in the furniture industry and is widely used in China for cordage, house construction and the finest basketware. The lack of reference genome sequences is a major obstacle for basic and applied biology on rattan. Here we provide the chromosome-level genome assembly of C. simplicifolius using Illumina, PacBio, and Hi-C sequencing data. A total of ~730 Gb of raw data was generated, covering the predicted genome length of ~1.98 Gb at ~372× read depth. The de novo genome assembly of ~1.94 Gb has a scaffold N50 of ~160 Mb, with 51,235 intact predicted protein-coding gene models. BUSCO evaluation demonstrated that the genome completeness reached 96.4%. These data will not only provide a fundamental resource for functional genomics, particularly in promoting germplasm utilization for breeding rattan with improved material properties, but will also serve as a reference genome for comparative studies between and among different species.

Files:
------
C.simplicifolius.coding.gene.cds - coding gene file (CDS) for C. simplicifolius genome
C.simplicifolius.coding.gene.gff - coding gene file (GFF) for C. simplicifolius genome
C.simplicifolius.coding.gene.pep - coding gene file (PEP) for C. simplicifolius genome
perl_python_scripts.tar.gz - compressed archive of all scripts and commands used in the analysis process.
  Note: directory names ending with "bin" are pipeline scripts, and file names ending with .sh are the shell scripts or commands.
RAxML.phylogenetic_tree.newick - Phylogenetic tree file (Newick)
single-copy.cds.phy.phase1 - Multiple alignments for 1,647 single-copy gene families
BUSCO_output_file/full_table_Genome_out.tsv - BUSCO assessment result file
BUSCO_output_file/missing_busco_list_Genome_out.tsv - BUSCO assessment result file
BUSCO_output_file/short_summary_Genome_out.txt - Summary of the BUSCO assessment
final_assembly_fasta/C.simplicifolius.HIC.fasta.gz - The Hi-C version of the genome, based on the WGS version, for C. simplicifolius
final_assembly_fasta/C.simplicifolius.WGS.fasta.gz - The final assembled genome sequences for C. simplicifolius genome
intermediary-pre-combined-assemblies/step1.Platanus.contigs.fa.gz - contigs assembled by Platanus
intermediary-pre-combined-assemblies/step2.DBG2OLC.assembles.fa.gz - genome scaffold sequences assembled by DBG2OLC
intermediary-pre-combined-assemblies/step3.SSPACE.assembles.fa.gz - genome scaffold sequences linked by SSPACE
lignin_genes/C.simp_lignin.4CL.cds - coding gene file (CDS) for 4CL (4-coumarate CoA ligase) family in C. simplicifolius genome
lignin_genes/C.simp_lignin.C3H.cds - coding gene file (CDS) for C3H (Coumarate 3-hydroxylase) family in C. simplicifolius genome
lignin_genes/C.simp_lignin.C4H.cds - coding gene file (CDS) for C4H (Cinnamate 4-hydroxylase) family in C. simplicifolius genome
lignin_genes/C.simp_lignin.CAD.cds - coding gene file (CDS) for CAD (Cinnamyl alcohol dehydrogenase) family in C. simplicifolius genome
lignin_genes/C.simp_lignin.CCoAOMT.cds - coding gene file (CDS) for CCoAOMT (Caffeoyl-CoA 3-O-methyltransferase) family in C. simplicifolius genome
lignin_genes/C.simp_lignin.CCR.cds - coding gene file (CDS) for CCR (Cinnamoyl-CoA reductase) family in C. simplicifolius genome
lignin_genes/C.simp_lignin.CHS.cds - coding gene file (CDS) for CHS (Chalcone synthase) family in C. simplicifolius genome
lignin_genes/C.simp_lignin.COMT.cds - coding gene file (CDS) for COMT (Caffeic acid 3-O-methyltransferase) family in C. simplicifolius genome
lignin_genes/C.simp_lignin.F5H.cds - coding gene file (CDS) for F5H (Ferulate 5-hydroxylase) family in C. simplicifolius genome
lignin_genes/C.simp_lignin.HCT.cds - coding gene file (CDS) for HCT (Hydroxycinnamoyl-CoA shikimate/quinate hydroxycinnamoyl transferase) family in C. simplicifolius genome
lignin_genes/C.simp_lignin.LAC.cds - coding gene file (CDS) for LAC (Laccase) family in C. simplicifolius genome
lignin_genes/C.simp_lignin.PAL.cds - coding gene file (CDS) for PAL (Phenylalanine ammonia-lyase) family in C. simplicifolius genome
lignin_genes/C.simp_lignin.POD.cds - coding gene file (CDS) for POD (Peroxidase) family in C. simplicifolius genome
repeats_transposable_elements_ncRNAs/proteinmask.gff - C. simplicifolius genome
repeats_transposable_elements_ncRNAs/repeatmasker.gff - C. simplicifolius genome
repeats_transposable_elements_ncRNAs/ncRNA/C.simplicifolius.miRNA.gff - Result file of miRNA for C. simplicifolius genome
repeats_transposable_elements_ncRNAs/ncRNA/C.simplicifolius.rRNA.gff - Result file of rRNA for C. simplicifolius genome
repeats_transposable_elements_ncRNAs/ncRNA/C.simplicifolius.snRNA.gff - Result file of snRNA for C. simplicifolius genome
repeats_transposable_elements_ncRNAs/ncRNA/C.simplicifolius.tRNA.gff - Result file of tRNA for C. simplicifolius genome
repeats_transposable_elements_ncRNAs/repeat/denovo.gff - C. simplicifolius genome
transcript_assembly/Csim-XB-1-1A-Unigene.fa.gz - DNA was extracted from young leaves at the vegetative growth stage of the rattan (C. simplicifolius) in Guangzhou, China (N: 23°11′29″, E: 113°22′40″, 87 m). Distal cirri at different stages were used to construct the RNA library
transcript_assembly/Csim-XB-1-2A-Unigene.fa.gz - DNA was extracted from young leaves at the vegetative growth stage of the rattan (C. simplicifolius) in Guangzhou, China (N: 23°11′29″, E: 113°22′40″, 87 m). Distal cirri at different stages were used to construct the RNA library
transcript_assembly/Csim-XB-1-3A-Unigene.fa.gz - DNA was extracted from young leaves at the vegetative growth stage of the rattan (C. simplicifolius) in Guangzhou, China (N: 23°11′29″, E: 113°22′40″, 87 m). Distal cirri at different stages were used to construct the RNA library
transcript_assembly/Csim-XB-1-4A-Unigene.fa.gz - DNA was extracted from young leaves at the vegetative growth stage of the rattan (C. simplicifolius) in Guangzhou, China (N: 23°11′29″, E: 113°22′40″, 87 m). Distal cirri at different stages were used to construct the RNA library
transcript_assembly/Csim-XB-2-1A-Unigene.fa.gz - DNA was extracted from young leaves at the vegetative growth stage of the rattan (C. simplicifolius) in Guangzhou, China (N: 23°11′29″, E: 113°22′40″, 87 m). Distal cirri at different stages were used to construct the RNA library
transcript_assembly/Csim-XB-2-2A-Unigene.fa.gz - DNA was extracted from young leaves at the vegetative growth stage of the rattan (C. simplicifolius) in Guangzhou, China (N: 23°11′29″, E: 113°22′40″, 87 m). Distal cirri at different stages were used to construct the RNA library
transcript_assembly/Csim-XB-2-3A-Unigene.fa.gz - DNA was extracted from young leaves at the vegetative growth stage of the rattan (C. simplicifolius) in Guangzhou, China (N: 23°11′29″, E: 113°22′40″, 87 m). Distal cirri at different stages were used to construct the RNA library
transcript_assembly/Csim-XB-2-4A-Unigene.fa.gz - DNA was extracted from young leaves at the vegetative growth stage of the rattan (C. simplicifolius) in Guangzhou, China (N: 23°11′29″, E: 113°22′40″, 87 m). Distal cirri at different stages were used to construct the RNA library
transcript_assembly/Csim-XB-3-1A-Unigene.fa.gz - DNA was extracted from young leaves at the vegetative growth stage of the rattan (C. simplicifolius) in Guangzhou, China (N: 23°11′29″, E: 113°22′40″, 87 m). Distal cirri at different stages were used to construct the RNA library
transcript_assembly/Csim-XB-3-2A-Unigene.fa.gz - DNA was extracted from young leaves at the vegetative growth stage of the rattan (C. simplicifolius) in Guangzhou, China (N: 23°11′29″, E: 113°22′40″, 87 m). Distal cirri at different stages were used to construct the RNA library
transcript_assembly/Csim-XB-3-3A-Unigene.fa.gz - DNA was extracted from young leaves at the vegetative growth stage of the rattan (C. simplicifolius) in Guangzhou, China (N: 23°11′29″, E: 113°22′40″, 87 m). Distal cirri at different stages were used to construct the RNA library
transcript_assembly/Csim-XB-3-4A-Unigene.fa.gz - DNA was extracted from young leaves at the vegetative growth stage of the rattan (C. simplicifolius) in Guangzhou, China (N: 23°11′29″, E: 113°22′40″, 87 m). Distal cirri at different stages were used to construct the RNA library
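
Example: recomputing scaffold N50
---------------------------------
As a rough illustration of the scaffold N50 statistic quoted in the summary, the Python sketch below computes N50 from a gzipped FASTA file such as final_assembly_fasta/C.simplicifolius.HIC.fasta.gz. It is not part of the authors' pipeline. The function names (fasta_lengths, n50, n50_of_gzipped_fasta) are hypothetical, and the conventional N50 definition is assumed: the length L such that sequences of length L or longer contain at least half of all assembled bases.

```python
import gzip
from typing import Iterable, Iterator, List


def fasta_lengths(lines: Iterable[str]) -> Iterator[int]:
    """Yield the length of each sequence found in FASTA-formatted lines."""
    length = None  # None until the first header line is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths: List[int]) -> int:
    """Return the N50 of a list of sequence lengths (0 for an empty list)."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def n50_of_gzipped_fasta(path: str) -> int:
    """Compute N50 for a gzipped FASTA file (hypothetical helper)."""
    with gzip.open(path, "rt") as handle:
        return n50(list(fasta_lengths(handle)))
```

Calling n50_of_gzipped_fasta("final_assembly_fasta/C.simplicifolius.HIC.fasta.gz") from the dataset root should return a value comparable to the reported scaffold N50, assuming the file is the scaffold-level assembly described above.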