Application of RNA-sequencing for characterization of hepatocellular carcinoma

Bioinformatics Internship Presentation

Maedeh Mohammadnetaj (Mentor: Dr. Habtom Ressom, Department of Oncology, Lombardi Comprehensive Cancer Center, Georgetown University)

August 30th, 2016, 2:00pm, Room 1300, Harris Building

Liver cancer is the fifth most common cancer worldwide. The predominant histological subtype is hepatocellular carcinoma (HCC), which accounts for over 80% of all diagnosed liver cancers. HCC ranks third in cancer mortality worldwide, affecting more than half a million people. This high mortality is largely due to late diagnosis: most HCC patients are diagnosed at advanced stages, leaving them with limited treatment options, which underscores the importance of improving the accuracy of HCC prognosis. Identifying biomarkers is an effective approach for early diagnosis of cancer. Gene fusions, which arise from genomic rearrangements, are known to be critical indicators of cancer development. Next-generation sequencing has provided deeper insight into these events.

The Ressom Lab recently performed RNA-seq analysis of liver tissues from five HCC cases and five patients with liver cirrhosis on an Illumina HiSeq 2500 (paired-end, 125 bp). The objective was to identify genes and transcripts differentially expressed between tumor and cirrhotic tissues. In this project, I used the resulting RNA-seq data to investigate the occurrence of fusion genes in tumor and cirrhotic tissues. Alignment of the sequence reads, quality assessment of the data, differential expression analysis, and fusion detection were performed with the Partek Flow software.

We identified 218 genes and 251 exons with statistically significant changes in expression in HCC vs. liver cirrhosis, at a false discovery rate below 0.05 and a fold-change cutoff of 2. Molecular function classification using DAVID and PANTHER indicated that genes related to catalytic activity and binding were the most significantly enriched. Moreover, ONCOMINE was used to compute the gene expression signature based on previously published studies. Potential fusion genes were identified with the duFuse algorithm: a total of 2227 gene fusion candidates (1971 unique) were found, 30% of which occurred in the HCC group. Sixty percent of these fusion events involved two different chromosomes. For further confirmation, the candidates were searched against publicly available RNA-seq datasets and the COSMIC, FusionCancer, and ChimerDB databases. The search identified 86 gene fusions, of which 56 were unique.
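The selection criteria above (false discovery rate below 0.05, fold change of 2) can be illustrated with a minimal Python sketch. The analysis itself was done in Partek Flow, so this is not the pipeline that produced the reported numbers. It assumes Benjamini-Hochberg adjustment (the abstract does not name the FDR method) and assumes the fold-change cutoff applies in both directions. The function names, gene names, and values are hypothetical.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the abstract does not say which FDR procedure was used.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_de_features(records, fdr_cutoff=0.05, min_fold=2.0):
    """Keep features with adjusted p < fdr_cutoff and |fold change| >= min_fold.

    records: list of (name, mean_tumor, mean_cirrhosis, raw_pvalue).
    Returns a list of (name, log2_fold_change, adjusted_pvalue).
    """
    qvalues = benjamini_hochberg([r[3] for r in records])
    threshold = math.log2(min_fold)
    selected = []
    for (name, tumor, cirrhosis, _), q in zip(records, qvalues):
        log2fc = math.log2(tumor / cirrhosis)
        # Assumption: the cutoff is symmetric (up- or down-regulated).
        if q < fdr_cutoff and abs(log2fc) >= threshold:
            selected.append((name, log2fc, q))
    return selected


# Hypothetical toy data, not taken from the study.
toy = [
    ("GENE_A", 400.0, 100.0, 0.001),  # 4-fold up, small p-value
    ("GENE_B", 100.0, 95.0, 0.002),   # significant but tiny change
    ("GENE_C", 50.0, 200.0, 0.01),    # 4-fold down
    ("GENE_D", 300.0, 100.0, 0.6),    # large change, not significant
]

hits = select_de_features(toy)
print([name for name, _, _ in hits])
```

On the toy data only GENE_A and GENE_C pass both filters: GENE_B fails the fold-change cutoff and GENE_D fails the FDR cutoff.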