
Proteomics is a powerful technology for analyzing low-abundance proteins for disease research. But it is stalled by imprecise and often irreproducible data analysis.

Few researchers can identify these proteins with confidence. Those who can will achieve breakthroughs.

Here we explain our powerfully simple idea: use multidimensional separation - already applied chemically in chromatography - to numerically filter correct peptide IDs from a search engine’s guesses, particularly with data-independent acquisition (DIA) data.

Tandem mass spectrometers (MS/MS), like particle colliders and space telescopes, produce big datasets with a dynamic range that spans orders of magnitude. MS/MS biomolecule analysis is really closer to physics than to traditional chemistry. While physics uses powerful servers to mine deep data for needle-in-a-haystack discoveries, proteomics is trapped in the shallows by simple PC programs that calculate subjective probabilistic scores. Expecting physics-level precision without physics-level IT is simply wishful thinking.

Prevalent data analysis uses a binomial probability (i.e., drawing colored balls from a bag) to model fragment ion signals that are not independent and identically distributed (probability’s “IID” requirement), which injects random uncertainty into physical mass/charge (m/z) data. Different software uses different probabilities (6% vs. 10%) for a matched fragment; all models qualitatively agree on easy ‘yes’ and ‘no’ answers but differ in between, where it matters most. Many labs treat analysis software as a black box and choose a loose one - like a player seeking the loosest slot machine - that reports the most IDs.

Popular PC programs can be coaxed to report 15% more IDs than is realistic - an impossible high-water mark for any rigorous software.

Here we illustrate how to produce precise and reproducible results by comparing 2D vs. 1D data analysis of matched fragment ions - the foundation of MS/MS DIA molecular identification - starting with first principles. Peptides and proteins are physical objects with true identities that can’t be discerned with MS/MS alone.

We discovered this remarkably simple abstraction: A high-sensitivity search engine guesses many peptide ID hypotheses from a mass spectrum. A high-specificity multidimensional filter uses physical parameters to accept a small number of hypotheses as high-likelihood peptide IDs. For example, intuition suggests a peptide with >20 fragment ion matches at <0.01 average m/z error is likely correct; a scatter plot confirms and extends this intuition. Note that the search engine’s inherent subjectivity is irrelevant as long as it is sensitive enough to include the correct peptide among its guesses.

Mass Spec Identification: A Cinderella Story

To appreciate mass spec’s informational asymmetry, consider its parallel to the Cinderella story. If the shoe doesn’t fit, it’s surely not her. But if it fits, we don’t know whether it’s her or a random girl. So MS/MS identification is akin to identifying Cinderella in a sizable city using one shoe (precursor mass) plus a full wardrobe (many fragment m/z’s). The concept is nothing more than this: a girl is likely our quarry if she is an outlier in terms of both the number and the tightness of garments that fit.

An MS/MS peptide ID hypothesis is likely correct to the extent it is an outlier in both the number and closeness of matching m/z’s, period.

Fundamentally, confidence can never reach 100 percent due to possible random matches, but it increases asymptotically with each closely matched fragment m/z. Longer peptides (with more matchable fragments) allow higher-confidence identification. Longer peptides are also part of fewer proteins; a long-enough one is unique to its protein.

A natural strategy emerges to analyze any low-abundance protein: Try to capture at least one protein-unique peptide using DIA, which would be designated the surrogate for its “one-hit wonder” protein for both identification and relative quantitation. This eliminates statistical imprecision from inferring a protein from multiple peptides. Besides, it may be next to impossible to capture more than one peptide from very low-abundance proteins. Finally, with many matched fragments, a precise precursor mass becomes less critical - very important for DIA analysis.

MS/MS by nature does not identify a molecule per se, but rather reports fragments to be compared to a hypothesis. We can view peptide identification as a crossword puzzle (peptide) with numerical clues (fragment m/z’s). Most people solve a crossword by gross-guessing words and then seeing if any one fits exceptionally well. A high-sensitivity search engine gross-guesses many peptide hypotheses - the more the better - using a subjective criterion (search score). A high-specificity filter accepts at most one as the correct peptide ID for its spectrum.
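
The two-stage idea in this article (a search engine proposes many hypotheses, then a filter accepts at most one per spectrum based on both the number of matched fragments and how tightly they match) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' software: the `Hypothesis` structure, the function names, and the sample values are all hypothetical, and the fixed cutoffs simply reuse the ">20 matches at <0.01 average m/z error" intuition quoted in the text rather than a true outlier test on a scatter plot.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Hypothesis:
    """One search-engine guess for a spectrum (hypothetical structure)."""
    peptide: str
    n_matched: int         # number of fragment m/z values that matched
    mean_abs_error: float  # average |observed - theoretical| m/z over the matches


def passes_1d(guesses: Sequence[Hypothesis], min_matched: int = 20) -> List[Hypothesis]:
    """1D view: keep guesses on match count alone (illustrative cutoff)."""
    return [g for g in guesses if g.n_matched > min_matched]


def accept_2d(
    guesses: Sequence[Hypothesis],
    min_matched: int = 20,    # illustrative, from the ">20 matches" example
    max_error: float = 0.01,  # illustrative, from the "<0.01 error" example
) -> Optional[Hypothesis]:
    """2D view: require many matches AND tight matches.

    Returns the single guess that passes both cutoffs, or None when no guess
    (or more than one guess) passes, so at most one ID is accepted per spectrum.
    """
    passing = [
        g for g in guesses
        if g.n_matched > min_matched and g.mean_abs_error < max_error
    ]
    return passing[0] if len(passing) == 1 else None


# Made-up guesses for one spectrum, purely for illustration.
guesses = [
    Hypothesis("PEPTIDEA", n_matched=24, mean_abs_error=0.004),  # many, tight
    Hypothesis("PEPTIDEB", n_matched=22, mean_abs_error=0.035),  # many, loose
    Hypothesis("PEPTIDEC", n_matched=9, mean_abs_error=0.003),   # few, tight
]

print([g.peptide for g in passes_1d(guesses)])
print(accept_2d(guesses))
```

Returning `None` when several guesses pass is a deliberately conservative design choice for this sketch; a real filter would more likely judge each point against the distribution of all hypotheses in the number-versus-error plane instead of fixed cutoffs.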

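The point made earlier about binomial scoring can be made concrete with a small calculation. The sketch below computes the binomial tail probability of seeing at least `k` random fragment matches out of `n` attempts, using the two per-fragment probabilities mentioned in the text (6% and 10%). The value of `n_fragments` and the choices of `k` are hypothetical, and this is the conventional model the article criticizes, shown only to illustrate how the choice of probability changes the answer.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more random matches."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_fragments = 40  # hypothetical number of theoretical fragments for one peptide

# Compare an easy "no" (k=1), an in-between case (k=6), and an easy "yes" (k=25).
for k in (1, 6, 25):
    p6 = binomial_tail(k, n_fragments, 0.06)
    p10 = binomial_tail(k, n_fragments, 0.10)
    print(f"k={k:2d}  p=6%: {p6:.3e}  p=10%: {p10:.3e}")
```

With these assumed settings, both models call one match unremarkable and twenty-five matches overwhelmingly non-random, but six matches fall below a 5% chance-of-random cutoff under the 6% model and above it under the 10% model, which is the in-between disagreement the text describes.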