Manual Review of Text-Mining Results for Accuracy and Incidences of PTM Cross-Talk

Bioinformatics Internship Presentation

Megan Yung (Mentor: Dr. Karen Ross, Biochemistry and Molecular & Cellular Biology, Georgetown University)

August 26th, 2016, 2:00pm, Room 1202, Harris Building

Many protein post-translational modifications (PTMs) serve as crucial regulatory mechanisms for cellular activity, and much time has been devoted to studying disruptions in PTMs that may cause various diseases. As such, numerous publications detail studies of disrupted PTM events. While these individual papers and studies are educational, it is often difficult to compile the separate sources of data into a user-friendly format that is informative and comprehensive enough to reveal indications of PTM cross-talk. Manual curation of such data is often impractical because of the time and labor it requires.

iPTMnet has been created as an integrative resource for PTM analysis. iPTMnet is a website that combines text-mined information from eFIP and RLIMS-P with information found in already curated databases such as UniProtKB to create a more comprehensive view of PTM events. RLIMS-P was used to identify papers that mention kinases, substrates, and phosphorylation sites. eFIP was used to identify phosphorylation-dependent protein-protein interactions (PPIs) mentioned in the texts.

At this stage of iPTMnet, it is important to determine whether the results are accurate. To test accuracy, we took a subset of results and manually checked whether the information in eFIP was correct for several hundred PubMed articles (PMIDs) with reported PPIs. The subset we chose consisted of events in which the interactant is a PTM enzyme. We chose these events because they are candidates for cases in which the phosphorylation affects another PTM through the phosphorylation-dependent PPI with the PTM enzyme. If multiple phosphorylation sites were mentioned in the eFIP results, we looked for evidence that they worked together to create the interaction. If they did, we marked the entry as a possible example of PTM cross-talk. Finally, after manually checking the data, we took a smaller subset and created Gene Ontology (GO) terms.
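
The review steps above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record fields, the function names, and the placeholder entries are hypothetical and do not reflect the actual eFIP or iPTMnet data format or the scripts used in this project. It shows one possible way to select events whose interactant is a PTM enzyme, compute the fraction of manually reviewed entries judged correct, and list entries flagged as possible PTM cross-talk.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# NOTE: all field names and records below are hypothetical placeholders,
# not the real eFIP/iPTMnet schema or real curated results.


@dataclass
class EfipRecord:
    pmid: str                                  # placeholder article identifier
    substrate: str                             # phosphorylated protein
    interactant: str                           # interaction partner
    interactant_is_ptm_enzyme: bool            # selection criterion for review
    sites: List[str] = field(default_factory=list)  # phosphorylation sites
    curator_says_correct: Optional[bool] = None     # None = not yet reviewed
    sites_act_together: bool = False           # curator judgment for multi-site entries


def select_review_subset(records: List[EfipRecord]) -> List[EfipRecord]:
    """Keep only events whose interactant is a PTM enzyme."""
    return [r for r in records if r.interactant_is_ptm_enzyme]


def accuracy(records: List[EfipRecord]) -> Optional[float]:
    """Fraction of reviewed records judged correct; None if nothing was reviewed."""
    reviewed = [r for r in records if r.curator_says_correct is not None]
    if not reviewed:
        return None
    return sum(1 for r in reviewed if r.curator_says_correct) / len(reviewed)


def crosstalk_candidates(records: List[EfipRecord]) -> List[EfipRecord]:
    """Correct entries with several sites that the curator judged to act together."""
    return [
        r for r in records
        if r.curator_says_correct and len(r.sites) > 1 and r.sites_act_together
    ]


# Placeholder entries, invented purely to exercise the functions.
demo = [
    EfipRecord("PMID_A", "PROT_1", "ENZ_1", True, ["S10", "T20"], True, True),
    EfipRecord("PMID_B", "PROT_2", "ENZ_2", True, ["S5"], False),
    EfipRecord("PMID_C", "PROT_3", "PROT_4", False, ["Y1", "Y2"], True, True),
    EfipRecord("PMID_D", "PROT_5", "ENZ_3", True, ["S7", "S9"]),
]

subset = select_review_subset(demo)
print("Records selected for review:", len(subset))
print("Accuracy among reviewed records:", accuracy(subset))
print("Possible cross-talk entries:", [r.pmid for r in crosstalk_candidates(subset)])
```

In this sketch the correctness and cross-talk flags are entered by a human curator after reading the article; the code only tallies those judgments and does not replace the manual review.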