Transcription Factor Variant Mapping

Jiayu Fu

Mentor: Dr. Matthew McCoy, Innovation Center for Biomedical Informatics, Department of Oncology and Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University

Date/Time: August 25, 2020, 3:40pm

Abstract: Binding of transcription factors (TFs) to transcription factor binding sites (TFBSs) is key to the mediation of transcriptional regulation. TFBSs are generally identified by scanning a position weight matrix against DNA using one of a number of available computer programs. In our project, we use a program called FIMO. FIMO scans a set of sequences for individual matches to each of the motifs provided. The program computes a log-likelihood ratio score for each position in a given sequence database, uses established dynamic programming methods to convert this score to a P-value, and then applies false discovery rate analysis to estimate a q-value for each position in the given sequence.
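The scoring step described above can be illustrated with a minimal sketch. This is not FIMO's implementation: the 4-position motif, the uniform background, and the function names are hypothetical values chosen for illustration, only the forward strand is scanned, and the P-value and q-value steps are omitted.

```python
import math

# Hypothetical 4-position motif: probability of each base at each position.
# These numbers are illustrative only and are not taken from TFBSshape.
MOTIF = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]

# Assumed uniform background base frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def llr_score(window, motif=MOTIF, background=BACKGROUND):
    """Sum of log2(motif probability / background probability) over the window."""
    return sum(
        math.log2(column[base] / background[base])
        for column, base in zip(motif, window)
    )


def scan(sequence, motif=MOTIF, background=BACKGROUND):
    """Score every forward-strand window; return (start, score) pairs."""
    width = len(motif)
    return [
        (start, llr_score(sequence[start:start + width], motif, background))
        for start in range(len(sequence) - width + 1)
    ]


def best_hit(sequence):
    """Return the (start, score) pair with the highest score."""
    return max(scan(sequence), key=lambda hit: hit[1])


if __name__ == "__main__":
    for start, score in scan("TTACGTAA"):
        print(start, round(score, 3))
```

A positive score means the window resembles the motif more than the background; in this toy sequence the window starting at index 2 ("ACGT") matches the most probable base at every motif position and therefore scores highest.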

In this project, we use the PAH promoter as a sample sequence and TFBSshape as the input motif database. Phenylalanine hydroxylase (PAH) is an enzyme that catalyzes the hydroxylation of the aromatic side chain of phenylalanine to generate tyrosine. We find that the set of TF binding motifs matched in the promoter can change substantially after mutation, and we calculate a score that indicates how tolerant each motif is to mutation. We also find that many new TF binding motifs appear after mutation, which shows that mutations affect TF binding.
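One way to picture the comparison of motif matches before and after a mutation is sketched below. It is an assumption-laden illustration, not the project's method: the hit tuples, motif names, and function names are hypothetical, and the tolerance score itself is not reproduced because the abstract does not define it.

```python
def apply_substitution(sequence, position, new_base):
    """Return a copy of the sequence with one base replaced (0-based position)."""
    if sequence[position] == new_base:
        raise ValueError("new_base must differ from the reference base")
    return sequence[:position] + new_base + sequence[position + 1:]


def compare_hits(wild_type_hits, mutant_hits):
    """Split motif hits into those lost, gained, and retained after a mutation.

    Each hit is a hypothetical (motif_id, start) tuple, for example as parsed
    from a motif scanner's output for the wild-type and mutant sequences.
    """
    wild_type = set(wild_type_hits)
    mutant = set(mutant_hits)
    return {
        "lost": wild_type - mutant,
        "gained": mutant - wild_type,
        "retained": wild_type & mutant,
    }


if __name__ == "__main__":
    # Made-up hits for illustration only; not results from the PAH promoter.
    wild_type_hits = [("MOTIF_A", 10), ("MOTIF_B", 42)]
    mutant_hits = [("MOTIF_B", 42), ("MOTIF_C", 11)]
    print(apply_substitution("ACGT", 1, "T"))
    print(compare_hits(wild_type_hits, mutant_hits))
```

Hits in the "lost" set correspond to motifs disrupted by the mutation, and hits in the "gained" set correspond to new motifs created by it.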