Optimization of a SWATH data acquisition method on a Q-Exactive MS for label-free, global proteomics

Bioinformatics Internship Presentation

Dave Mitchel (Mentors: Drs. Brian L. Hood, Thomas P. Conrads, and Nicholas W. Bateman; DOD Gynecologic Cancer Center of Excellence, Women's Health Integrated Research Center and Inova Schar Cancer Institute, Inova Center for Personalized Health)

August 30th, 2016, 2:00pm, Room 1300, Harris Building

Intro: Data-dependent acquisition (DDA) is commonly used to acquire bottom-up, label-free global proteomic data. However, DDA exhibits poor analytical reproducibility and quantitative accuracy, and data-independent acquisition (DIA) methods, such as Sequential Window Acquisition of all Theoretical Mass Spectra (SWATH), are therefore being actively refined for global proteomic analyses to address these deficiencies. Few SWATH methods have been described for the Q-Exactive (QE) high-resolution mass spectrometer (MS, Thermo) because of the lower scan speed of this instrument, i.e. 12 Hz at R=17,500. We set out to define an optimized SWATH data acquisition method for application on a QE MS.

Method: We assessed a six-point dilution series (50 amol – 100 fmol) of a standard peptide mixture (PRTC) spiked into K562 cell line digest using four sequential SWATH isolation window schemes, i.e. 11, 16, 21 and 26 m/z windows, at R=17,500 (17.5k) within a narrow precursor mass range (399-700 m/z). This study was performed to determine limits of quantification (LOQ) and detection (LOD) and to identify a high-confidence peptide dot-product score cut-off for use in global proteomic analyses. For global proteomic analyses, we applied the same SWATH isolation window schemes at two instrument resolutions, R=17,500 (17.5k) and R=35,000 (35k). Retention time alignment peptides (iRT, Biognosys) were added to all complex cell line digest samples (MSPE; 500 ng/injection) and data were collected in duplicate for all analyses. SWATH data were imported into the Skyline software platform and searched against an iRT-aligned spectral library assembled from four DDA analyses of MSPE digest, corresponding to 12,865 unique peptides/3,749 proteins. Peptide integration events were restricted to the top 4 most-abundant peptide fragment ions within a retention time (RT) window of ± 5 minutes of iRT-aligned library candidate entries. The resulting .CSV data exports were parsed using scripts written in Python 3.4 that used modules such as csv, Matplotlib and statistics. These modules were used to compare extracted Skyline data fields (e.g. sequence, dot-product score and area under the curve (AUC)).
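The kind of parsing described above can be illustrated with a minimal sketch using only the csv and statistics modules. This is not the pipeline used in the study: the column names ("Peptide Sequence", "Replicate Name", "Total Area", "Library Dot Product"), the sample rows and the default dot-product cut-off of 0.7 are all assumptions chosen for illustration, and a real Skyline export may be laid out differently.

```python
import csv
import io
import statistics
from collections import defaultdict

# Made-up rows standing in for a Skyline .CSV export (illustrative only).
SAMPLE_EXPORT = """Peptide Sequence,Replicate Name,Total Area,Library Dot Product
PEPTIDEA,inj1,1000,0.90
PEPTIDEA,inj2,1100,0.92
PEPTIDEB,inj1,1000,0.95
PEPTIDEB,inj2,2000,0.94
PEPTIDEC,inj1,1000,0.50
PEPTIDEC,inj2,1000,0.90
"""


def percent_rsd(values):
    """Percent relative standard deviation: sample stdev / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


def reproducible_peptides(csv_file, min_dotp=0.7, max_rsd=20.0):
    """Return {sequence: %RSD of AUC} for peptides that pass both filters.

    min_dotp is a placeholder cut-off, not the value derived in the study.
    """
    areas = defaultdict(list)
    dotps = defaultdict(list)
    for row in csv.DictReader(csv_file):
        seq = row["Peptide Sequence"]
        areas[seq].append(float(row["Total Area"]))
        dotps[seq].append(float(row["Library Dot Product"]))

    kept = {}
    for seq, auc in areas.items():
        if len(auc) < 2 or statistics.mean(auc) <= 0:
            continue  # %RSD needs at least two non-zero measurements
        if min(dotps[seq]) < min_dotp:
            continue  # every replicate must meet the dot-product cut-off
        rsd = percent_rsd(auc)
        if rsd <= max_rsd:
            kept[seq] = rsd
    return kept


result = reproducible_peptides(io.StringIO(SAMPLE_EXPORT))
```

With the sample rows, only PEPTIDEA is retained: PEPTIDEB fails the %RSD limit and PEPTIDEC fails the dot-product cut-off in one injection. For a real export, an opened file can be passed in place of the io.StringIO object.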

Results: LOQ/LOD analyses of duplicate PRTC dilution injections of 50 amol, 200 amol, 1 fmol, 5 fmol, 20 fmol and 100 fmol were evaluated using various peptide features (e.g. top 4 fragment ion abundance trends, inter-injection AUC variability and dot-product score statistics for eight candidate PRTC peptides). Analyses revealed that 11 m/z SWATH windows provided the most confident (%RSD ≤ 20) measurements for four out of eight PRTC peptides at low femtomole levels (R² ≥ 0.989 ± 0.02). Mean dot-product scores for the lowest dilution points measured (1 fmol) were 0.7 ± 0.09. Global proteomic analyses revealed that inter-injection variability was slightly better at R=17.5k than at R=35k, as slightly more peptides co-identified between technical replicate injections exhibited an AUC RSD of ≤ 20% at R=17.5k. Further analyses revealed that the 11 and 21 m/z isolation schemes at R=17.5k yielded the greatest number of co-identified peptides with an AUC RSD of ≤ 20% relative to the other isolation windows assessed. Application of the dot-product score cut-off identified from analyses of PRTC standard peptides revealed that 5,469 peptides (42.5% of total possible peptides) were confidently identified using an 11 m/z isolation scheme and 5,603 peptides (43.6% of total possible peptides) were identified using a 21 m/z scheme.
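As a quick arithmetic check of the reported percentages, the short sketch below assumes that "total possible peptides" refers to the 12,865 unique peptides in the spectral library; the text does not state this explicitly, but the reported values are consistent with it.

```python
# Assumption: the denominator is the 12,865-peptide spectral library.
LIBRARY_PEPTIDES = 12865

# Confidently identified peptides per isolation scheme, as reported.
identified = {"11 m/z": 5469, "21 m/z": 5603}

# Percent of library peptides identified, rounded to one decimal place.
percent_of_library = {
    scheme: round(100.0 * count / LIBRARY_PEPTIDES, 1)
    for scheme, count in identified.items()
}

for scheme, pct in percent_of_library.items():
    print(scheme, pct)
```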

Conclusion: This analysis identified the advantage of collecting SWATH data at R=17.5k using an 11 m/z isolation scheme within a limited mass range of 399-700 m/z, considering both the limits of quantitation observed in analyses of standard peptides and the total number of peptides confidently identified in global proteomic analyses, e.g. peptides with an AUC RSD ≤ 20%. These efforts have enabled the identification of optimal SWATH data acquisition parameters for label-free, global proteomic analyses on a QE MS. They have further established a data analysis pipeline that extracts pertinent data from Skyline exports and visualizes analytical performance metrics, which will directly support QA/QC assessments of SWATH data exports generated using Skyline.