Identifying Candidate Epistatic Interactions Among Missense Mutations in Key Players in the JAK-STAT Pathway Using the All of Us Database
Basith Faizal
Mentor: Dr. Markus Hoffmann, Department of Biochemistry and Molecular & Cellular Biology, Georgetown University.
Date/Time: August 22nd, 2025 at 10:30 AM.
Abstract: The Janus Kinase–Signal Transducer and Activator of Transcription (JAK-STAT) pathway plays a critical role in immune regulation, inflammation, and cell signaling, and has been implicated in a variety of complex diseases (Hoffmann & Hennighausen, 2025; Wahnschaffe et al., 2019). While many studies have explored the impact of individual variants within this pathway, the potential for epistatic interactions (non-additive effects that arise when multiple genetic variants act in combination) remains largely unexamined, particularly at a population scale (Hoffmann et al., 2024). This research aimed to address this gap by conducting a systematic screen for epistasis among missense variants in JAK and STAT genes using the NIH All of Us Research Program dataset, which provides access to rich genomic, phenotypic, and demographic data from a diverse cohort (The All of Us Research Program Genomics Investigators, 2024).
In this study, candidate epistatic interactions within the JAK/STAT pathway were identified and evaluated for their potential association with diverse conditions, including osteoarthritis of the knee, senile hyperkeratosis, nuclear senile cataract, gastroesophageal reflux disease without esophagitis, and hyperlipidemia. Through the All of Us workbench, we applied a systematic filtering approach to the genomic and phenotypic data from 250,233 individuals in the All of Us Database version 7 to examine the amino acid and DNA changes underlying these conditions. Of the 250,233 individuals, 84,792 carried more than one variant and were therefore included in the analysis. We first examined how the genotypic and phenotypic information was stored in the workbench. We then constructed SQL joins to merge the relevant tables, creating integrated dataframes in Python (https://github.com/BasithFaizal/JAK-STAT-Epistasis–All-of-Us-Research-Program-version-7/tree/main). These dataframes were processed to detect amino acid and DNA changes and screened to identify epistatic interactions that could be correlated with disease phenotypes in the cohort of 84,792 individuals. Using this filtering approach, we identified epistatic interactions that were present in at least <20 participants, with at least 50% of those individuals sharing a specific condition. This yielded 94 candidate epistatic interactions that could be prioritized for further study. For gastroesophageal reflux disease (GERD) without esophagitis, one candidate epistatic interaction was identified in <20 individuals, of whom 80% were diagnosed with the condition. This interaction involved variants in the STAT5B gene and the JAK3 gene.
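The screening step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in the linked repository: the column names (`person_id`, `variant`, `condition`), the placeholder variant labels, the toy data, and the `min_carriers` threshold are all hypothetical and chosen only to show the logic of pairing co-carried variants and keeping pairs whose carriers frequently share a condition.

```python
from itertools import combinations

import pandas as pd


def candidate_pairs(variants, conditions, min_carriers, min_fraction=0.5):
    """Return variant pairs whose co-carriers frequently share one condition.

    variants:   DataFrame with columns person_id, variant (one row per carried variant)
    conditions: DataFrame with columns person_id, condition
    """
    # Collect the distinct variants of each participant, keeping only
    # participants who carry more than one variant.
    per_person = variants.groupby("person_id")["variant"].apply(lambda s: sorted(set(s)))
    per_person = per_person[per_person.map(len) > 1]

    # Enumerate every unordered pair of variants co-carried by a participant.
    rows = [
        (pid, a, b)
        for pid, carried in per_person.items()
        for a, b in combinations(carried, 2)
    ]
    pairs = pd.DataFrame(rows, columns=["person_id", "variant_a", "variant_b"])
    key = ["variant_a", "variant_b"]

    # Number of participants carrying each pair.
    carriers = pairs.groupby(key)["person_id"].nunique().rename("n_carriers").reset_index()

    # Number of those carriers recorded with each condition.
    merged = pairs.merge(conditions.drop_duplicates(), on="person_id")
    affected = (
        merged.groupby(key + ["condition"])["person_id"]
        .nunique()
        .rename("n_affected")
        .reset_index()
    )

    out = affected.merge(carriers, on=key)
    out["fraction"] = out["n_affected"] / out["n_carriers"]
    keep = (out["n_carriers"] >= min_carriers) & (out["fraction"] >= min_fraction)
    return out[keep].reset_index(drop=True)


# Toy data (hypothetical): four participants carry both variants, one carries only one.
variants = pd.DataFrame({
    "person_id": [1, 1, 2, 2, 3, 3, 4, 4, 5],
    "variant": ["GENE_A:p.X1Y", "GENE_B:p.X2Y"] * 4 + ["GENE_A:p.X1Y"],
})
conditions = pd.DataFrame({
    "person_id": [1, 2, 3, 4, 5],
    "condition": ["GERD", "GERD", "GERD", "hyperlipidemia", "GERD"],
})

# min_carriers=3 is a toy threshold for this example only.
result = candidate_pairs(variants, conditions, min_carriers=3)
print(result)
```

In this sketch, pair counts and condition counts are computed per distinct participant, so repeated rows for the same person do not inflate either number. Any real analysis on All of Us data would also need to respect the program's rules on reporting small participant counts.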
JAK3 and STAT5B are components of the common γ-chain cytokine receptor signaling pathway, where JAK3 phosphorylates STAT5B upon cytokine stimulation (e.g., IL-2, IL-7, IL-15) (Kiwanuka et al., 2015). STAT5B then dimerizes and translocates to the nucleus to regulate immune-related gene expression. Missense mutations in either protein can therefore alter pathway efficiency (Kiwanuka et al., 2015). The STAT5B variant results from a single nucleotide change at the DNA level (c.1691C>T) that substitutes arginine (positively charged) with cysteine (polar but uncharged) (Ensembl, 2025). We hypothesize that this could disrupt the protein structure or electrostatic interactions within the signaling domain (Vainchenker & Constantinescu, 2012). The variant's location, near amino acid position 564, places it close to the Src homology 2 (SH2) domain. This domain is essential for phosphotyrosine recognition and STAT5B dimerization and is also vital for STAT5B’s recruitment to activated receptors and subsequent nuclear translocation (Lupardus et al., 2014).
The JAK3 variant results from a DNA mutation (c.394C>A) that causes a glutamine (polar, uncharged) to lysine (positively charged) amino acid substitution. This could alter the binding affinity or stability of protein-protein interactions. The variant is located at amino acid position 131, which is within the FERM domain (Ensembl, 2025). The FERM domain is responsible for anchoring JAK3 to the common γ-chain cytokine receptor (Babon et al., 2014).
The FERM domain facilitates the interaction of JAK3 with the receptor complex, and STAT5B’s SH2 domain binds phosphorylated receptor sites (Ferrao & Lupardus, 2017). Based on the locations of these variants, we hypothesize that the two amino acid substitutions, one introducing a positive charge into the FERM domain and the other removing a positive charge near the SH2 domain, occur in regions critical for JAK/STAT pathway function. Such mutations could, in theory, alter the electrostatic and structural properties necessary for optimal protein-protein or protein-receptor interactions (Vainchenker & Constantinescu, 2012). This could impair signal transduction and contribute to GERD susceptibility.
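The charge reasoning behind this hypothesis can be made explicit with a small sketch. It uses a deliberately simplified model in which each side chain is assigned a formal charge of +1, -1, or 0 at physiological pH; the grouping (including treating histidine as neutral) and the function names are assumptions of this example, and it says nothing about structural context or actual electrostatic effects in the folded proteins.

```python
# Simplified side-chain charges at physiological pH (an assumption of this sketch).
# Histidine is treated as neutral because it is only partly protonated near pH 7.
POSITIVE = {"Arg", "Lys"}
NEGATIVE = {"Asp", "Glu"}


def charge(residue):
    """Return the formal side-chain charge (+1, -1, or 0) of a three-letter residue code."""
    if residue in POSITIVE:
        return 1
    if residue in NEGATIVE:
        return -1
    return 0


def charge_change(reference, alternate):
    """Net change in formal charge caused by substituting reference with alternate."""
    return charge(alternate) - charge(reference)


# The two substitutions discussed in the text.
substitutions = {
    "STAT5B (near SH2 domain)": ("Arg", "Cys"),
    "JAK3 (FERM domain)": ("Gln", "Lys"),
}
changes = {name: charge_change(ref, alt) for name, (ref, alt) in substitutions.items()}

for name, delta in changes.items():
    print(f"{name}: net charge change {delta:+d}")
```

Under this simple model, the STAT5B substitution removes one positive charge and the JAK3 substitution adds one, which matches the description above.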
Previous literature supports the functional relevance of the interaction between the FERM and SH2 domains. Vainchenker and Constantinescu (2012) examined the roles of the FERM and SH2 domains in the JAK signaling pathway. A key takeaway from their discussion of mutational impact is that mutations affecting the integrity of the FERM domain could lead to improper positioning of the SH2 domain, impairing its ability to recognize and bind phosphorylated sites on the receptor. Impairments like this could result in altered signaling and thereby contribute to conditions linked to the JAK/STAT pathway (Wahnschaffe et al., 2019; Vainchenker & Constantinescu, 2012).
In terms of limitations, clinical information for some participants may be missing or incomplete, which could limit the genotype and phenotype criteria that can be applied. Variability in EHR data could also prevent the filtering pipeline from capturing all of the relevant information in the database. Additionally, participants could develop a condition after submitting their information to All of Us without the database being updated, meaning some disease or condition associations may be missed or underrepresented.
Future research based on this project could broaden the analysis to include a wider range of immune-related genes within the JAK/STAT pathway. One approach is to expand beyond the selected JAK and STAT genes to include additional cytokine receptors that signal through the JAK/STAT pathway, such as the IL2R (interleukin-2 receptor) and IL3R (interleukin-3 receptor) families. Similarly, the associated cytokine ligands (e.g., IL-2, IL-3) could be incorporated to explore how ligand-receptor interactions contribute to epistasis. Another direction would be to investigate whether the identified genetic interactions correlate with demographic variables such as sex, ethnicity, or age, potentially revealing population-specific genetic risk patterns.
This project combines bioinformatic analyses, literature-based validation, and population-level screening to uncover novel epistatic interactions that may underlie hidden mechanisms of certain diseases and conditions. Studies like this could help identify genetic interactions that contribute to disease susceptibility. In the longer term, understanding these genetic interactions may guide the development of personalized therapies and targeted interventions and improve risk prediction models for immune- and inflammation-associated disorders.
References:
– Babon, J., Lucet, I., Murphy, J., Nicola, N., & Varghese, L. (2014). The molecular regulation of Janus kinase (JAK) activation. Biochemical Journal, 462(1), 1-13. https://doi.org/10.1042/bj20140712
– Ensembl. (2025). Ensembl genome browser (Release 113). EMBL-EBI. https://www.ensembl.org
– Ferrao, R., & Lupardus, P. (2017). The Janus kinase (JAK) FERM and SH2 domains: Bringing specificity to JAK–receptor interactions. Frontiers in Endocrinology, 8. https://doi.org/10.3389/fendo.2017.00071
– Hoffmann, M., & Hennighausen, L. (2025). Spotlight on amino acid changing mutations in the JAK-STAT pathway: From disease-specific mutation to general mutation databases. NPJ Genomic Medicine, 10, Article 17. https://doi.org/10.1038/s41525-025-00315-4
– Hoffmann, M., Poschenrieder, J. M., Incudini, M., Baier, S., Fritz, A., Maier, A., Hartung, M., Hoffmann, C., Trummer, N., Adamowicz, K., Picciani, M., Scheibling, E., Harl, M. V., Lesch, I., Frey, H., Kayser, S., Wissenberg, P., Schwartz, L., Hafner, L., … Blumenthal, D. B. (2024). Network medicine-based epistasis detection in complex diseases: Ready for quantum computing. Nucleic Acids Research, 52(17), 10144–10160. https://doi.org/10.1093/nar/gkae697
– Kiwanuka, K., Lin, J., Leonard, W., & Ryan, J. (2015). STAT5 tetramer formation is critical for mast cell function (hyp4p.312). The Journal of Immunology, 194(1_Supplement), 123.11-123.11. https://doi.org/10.4049/jimmunol.194.supp.123.11
– Lupardus, P., Ultsch, M., Wallweber, H., Kohli, P., Johnson, A., & Eigenbrot, C. (2014). Structure of the pseudokinase–kinase domains from protein kinase TYK2 reveals a mechanism for Janus kinase (JAK) autoinhibition. Proceedings of the National Academy of Sciences, 111(22), 8025-8030. https://doi.org/10.1073/pnas.1401180111
– The All of Us Research Program Genomics Investigators. (2024). Genomic data in the All of Us Research Program. Nature, 627, 340–346. https://doi.org/10.1038/s41586-023-06957-x
– Vainchenker, W., & Constantinescu, S. N. (2012). JAK/STAT signaling in hematological malignancies. Oncogene, 32(21), 2601-2613. https://doi.org/10.1038/onc.2012.347
– Wahnschaffe, L., Braun, T., Timonen, S., Giri, A., Schrader, A., Wagle, P., … & Herling, M. (2019). JAK/STAT-activating genomic alterations are a hallmark of T-PLL. Cancers, 11(12), 1833. https://doi.org/10.3390/cancers11121833