Development of a Python program for VCF file analysis in a metastatic melanoma study

Bioinformatics Internship Presentation

Muzi Li (Mentor: Dr. Shaojun Tang, Innovation Center for Biomedical Informatics, Georgetown University)

May 12th, 2:00pm, Room 1202, Harris Building

Melanoma is an aggressive form of skin cancer because it is more likely to grow and spread than other types of skin cancer. Patients with metastatic melanoma (stage IV melanoma) have a poor survival rate of 7-19%. At this stage, the melanoma has spread to other parts of the body, such as the liver, lungs, bones, and brain.

Cabozantinib (XL184) is a potent targeted therapy that inhibits the activity of tyrosine kinases such as VEGFR2 and c-Met. The VEGF and c-Met signaling pathways are involved in tumor angiogenesis and metastasis. Cabozantinib has been shown to reduce angiogenesis and metastasis in multiple tumor types.

Fifteen metastatic melanoma patients were treated with oral cabozantinib at 100 mg per day. Patients were divided into three groups (progression, stable, and shrinkage) based on the degree of tumor shrinkage after two 5-day cycles of therapy. Tumor samples from the 15 patients were collected before treatment and sequenced using the NuGen sequencing platform. We retrospectively examined the mutations that account for the different treatment outcomes.

After obtaining and pre-processing the short-read alignment files (BAM files), we identified mutational variants (in VCF format) using SAMtools and annotated them using ANNOVAR. A Python program was developed to work with the annotated VCF files. The program can merge multiple VCF files, split the INFO field of each VCF record, filter out specific variants, and perform downstream analysis. The program not only helps us understand the implications of biomarkers for the different clinical responses in the metastatic melanoma study, but can also be applied to other, similar variant studies.
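
The merge, INFO-splitting, and filtering steps described above could be sketched roughly as follows. This is a minimal illustration, not the program presented in the talk: the function names (split_info, read_vcf, merge_vcfs, keep_exonic), the sample names, and the toy records are all hypothetical, and the filter assumes the annotation carries a Func.refGene key in the INFO field, as ANNOVAR output typically does when refGene is used.

```python
def split_info(info):
    """Split a VCF INFO string (e.g. 'DP=14;Func.refGene=exonic;INDEL') into a dict."""
    fields = {}
    if info in ("", "."):          # '.' marks an empty INFO field
        return fields
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True   # flags such as INDEL carry no value
    return fields


def read_vcf(lines, sample):
    """Yield one dict per variant record, skipping header and blank lines."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt, qual, _filter, info = cols[:8]
        yield {"sample": sample, "chrom": chrom, "pos": int(pos),
               "ref": ref, "alt": alt, "qual": qual, "info": split_info(info)}


def merge_vcfs(named_inputs):
    """Merge records keyed by (chrom, pos, ref, alt), recording which samples carry each variant."""
    merged = {}
    for sample, lines in named_inputs.items():
        for rec in read_vcf(lines, sample):
            key = (rec["chrom"], rec["pos"], rec["ref"], rec["alt"])
            entry = merged.setdefault(key, {"info": rec["info"], "samples": []})
            entry["samples"].append(sample)
    return merged


def keep_exonic(merged):
    """Keep only variants annotated as exonic (assumes a Func.refGene key is present)."""
    return {k: v for k, v in merged.items()
            if v["info"].get("Func.refGene") == "exonic"}


# Toy records invented for illustration only; they are not data from the study.
HEADER = ["##fileformat=VCFv4.2\n",
          "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"]
demo = {
    "sample_a": HEADER + ["chr7\t100\t.\tA\tT\t50\tPASS\tDP=14;Func.refGene=exonic\n",
                          "chr1\t200\t.\tG\tC\t30\tPASS\tDP=9;Func.refGene=intronic\n"],
    "sample_b": HEADER + ["chr7\t100\t.\tA\tT\t45\tPASS\tDP=20;Func.refGene=exonic\n"],
}

merged = merge_vcfs(demo)
for key, entry in keep_exonic(merged).items():
    print(key, entry["samples"])
```

In practice the inputs would be open file handles rather than lists of strings, and the per-variant sample lists could then be compared across the progression, stable, and shrinkage groups.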