Diploid Perilla de novo genome sequencing


Tae-Ho Kim

National Institute of Agricultural Science, Korea

J Plant Physiol Pathol

Abstract


Perilla is a self-pollinating annual herbaceous plant of the Lamiaceae family that has mainly been cultivated as an oil crop in East Asia. With the recent increase in demand for functional foods, the structural and functional characteristics of the Perilla genome have attracted considerable attention. Here we report progress on the de novo genome assembly of P. citriodora (2n=2x=20). We produced a draft de novo assembly by combining data from multiple sequencing platforms (Illumina, PacBio RS II) using various libraries with different insert sizes. Using a total of 643.9 Gb of sequence (about 985.3X coverage) and merging two assemblies (Platanus and FALCON), the genome was assembled into 1,622 scaffolds with an N50 of 12,325,979 bp. The assembly covered the genome size estimated by k-mer analysis. BAC-end sequences from libraries with insert sizes of 102 kb and 80 kb, respectively, mapped to the scaffolds at a high rate. In addition, 10 whole BAC sequences involved in the omega-3 biosynthesis pathway were highly covered by the scaffolds. CEGMA showed that 92.74% and 97.58% of core eukaryotic genes were completely and partially aligned, respectively. BUSCO analysis gave a completeness score of about 95.5%. Repeat analysis using RepeatModeler predicted that 61.47% of the assembled genome is repetitive. A total of 196,413 ab initio gene models were predicted on the Perilla scaffolds using MAKER. Of these, 41,751 gene models matched at least one entry in GO, the Protein families DB and CCD. A total of 39,025 gene models were predicted with support from Perilla transcripts using Blastx. Taken together, 55,418 gene models were supported by at least one of these lines of evidence. Finally, a total of 56,604 gene models were predicted using GenBank. These results will provide important information on genome structure for understanding the functional genomics of Perilla.
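For readers unfamiliar with the assembly statistics quoted above, the following minimal Python sketch shows how an N50 is conventionally computed from a list of scaffold lengths, and how the sequencing depth relates total bases to genome size. It is illustrative only and is not the authors' pipeline: the function names `n50` and `implied_genome_size` are hypothetical, and the toy scaffold lengths are made up, since the abstract does not list per-scaffold lengths.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def implied_genome_size(total_bases, coverage):
    """Genome size (bp) implied by total sequenced bases and fold coverage."""
    return total_bases / coverage


# Toy scaffold lengths (hypothetical, NOT Perilla data).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print("toy N50:", n50(toy_scaffolds))

# Figures quoted in the abstract: 643.9 Gb at about 985.3X coverage.
size_bp = implied_genome_size(643.9e9, 985.3)
print(f"implied genome size: {size_bp / 1e6:.1f} Mb")
```

Dividing the reported 643.9 Gb by the reported 985.3X coverage implies a genome size of roughly 0.65 Gb; this figure is derived arithmetic, not a value stated in the abstract.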

Biography


Tae-Ho Kim has mainly studied the isolation of useful genes and the analysis of their functions for molecular breeding in crops. His work focuses on genes involved in disease resistance and developmental physiology in rice and Perilla. Recently, with the advance of big data, he has worked in comparative and de novo genomics to find variable loci (genes, SNPs, etc.) and to analyze genome structure using NGS technology and bioinformatics.
