Shiny Application Development for Self-Guided Investigation of Biological Hypotheses in Pan-cancer Transcriptomic Datasets
Kyle Rodrigues
Mentor: Dr. Ben Weeder, Bioinformatics Scientist, Arcus Biosciences.
Date/Time: August 22nd, 2024 at 4:00pm.
Abstract: Arcus Biosciences is a clinical-stage biopharmaceutical company focused on developing therapies for cancer and other diseases. Its research spans a wide range of topics, from basic science and discovery to clinical-stage analysis. A fundamental goal of the Bioinformatics team at Arcus is to help collaborators across departments test data-driven hypotheses that further the company’s research goals. While this often involves hands-on analysis, it also provides an opportunity to create tools that empower collaborators to answer their own questions through interactive frameworks and code-free analysis. There is therefore a direct need for an interactive graphical comparison tool that enables collaborators to perform self-guided investigation of biological questions in transcriptomic datasets.
This interactive tool is written in R and uses the Shiny app framework to provide a graphical user interface (GUI) for code-free exploratory analysis. It draws on multiple external, publicly available datasets: The Cancer Genome Atlas (TCGA), MET500, Open Pediatric Cancer (OpenPedCan), and GTEx. First, these datasets were processed and standardized so that their metadata could be easily ingested for downstream visualization. Next, basic visualization functions were written that take the expected input formats and generate comparison figures (expression box plots across cancer types and heatmaps) annotated with metadata and mutation status. Finally, these visualizations were incorporated into a Shiny app that allows for code-free data exploration.
The ability of individuals without a coding background to leverage transcriptomic data, including comparing by phenotype, splitting by key genetic mutations, and exploring other important factors that alter gene expression, is key to rapidly answering research questions and ultimately facilitating the development of novel therapeutics.