Inference algorithms

Julian Uszkoreit edited this page Mar 10, 2014 · 2 revisions

A reported protein group can have more than one accession because of shared peptides and homologs, isoforms, or redundant database entries. None of the currently implemented methods decides which accession to report; instead, all accessions of a protein group are reported.

Report All

This is the simplest inference method: it returns every possible protein group in the compilation. With the PIA graph structure, the reported proteins are calculated very quickly, because only one protein group needs to be created for each group in the graph that contains protein nodes. The advantage of this method is its short runtime; the disadvantage is that it reports no sub proteins.
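
The idea can be illustrated with a minimal sketch. It is not PIA's implementation: the PIA graph is replaced by a plain dictionary from accession to identified peptides, and the function name `report_all` is hypothetical. It is assumed here that accessions with identical peptide sets fall into the same group, so each distinct peptide set yields exactly one reported protein group.

```python
from collections import defaultdict


def report_all(protein_to_peptides):
    """Return one protein group per distinct peptide set (illustrative only).

    protein_to_peptides: dict mapping accession -> iterable of peptide strings.
    This dictionary is an assumed stand-in for the PIA graph.
    """
    groups = defaultdict(list)
    for accession, peptides in protein_to_peptides.items():
        # Accessions sharing exactly the same peptides cannot be told apart,
        # so they end up in the same group and are all reported.
        groups[frozenset(peptides)].append(accession)

    # No filtering and no sub protein detection: every group is reported.
    return sorted((sorted(accs), sorted(peps)) for peps, accs in groups.items())


example = {
    "P1": ["PEPA", "PEPB"],
    "P2": ["PEPA", "PEPB"],  # indistinguishable from P1
    "P3": ["PEPB"],
}
for accessions, peptides in report_all(example):
    print(accessions, peptides)
```

A single pass over the accessions is enough, which matches the short runtime described above.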

Occam's Razor

Here the goal is to use the principle of parsimony to report the minimal set of proteins that explains the identified peptides. This method also reports sub proteins.
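
A parsimonious selection can be sketched as follows. This is a greedy set-cover heuristic chosen for illustration, not necessarily the algorithm PIA uses, and a greedy choice does not guarantee a truly minimal set. The input dictionary, the function name `occams_razor`, and the rule that a sub protein is an unreported group whose peptides are a proper subset of a reported group's peptides are all assumptions made for this example.

```python
from collections import defaultdict


def occams_razor(protein_to_peptides):
    """Greedy parsimony sketch (illustrative, not PIA's implementation)."""
    # Merge accessions with identical peptide sets into one group.
    groups = defaultdict(list)
    for accession, peptides in protein_to_peptides.items():
        groups[frozenset(peptides)].append(accession)

    uncovered = set().union(*groups)  # peptides not yet explained
    # Fixed candidate order so that ties are broken deterministically.
    candidates = sorted(groups, key=lambda peps: sorted(groups[peps]))
    reported = []
    while uncovered:
        # Pick the group explaining the most still-unexplained peptides.
        best = max(candidates, key=lambda peps: len(peps & uncovered))
        reported.append(best)
        candidates.remove(best)
        uncovered -= best

    result = []
    for peps in reported:
        # Assumed definition: unreported groups fully contained in this group.
        subs = sorted(
            acc
            for other, accs in groups.items()
            if other not in reported and other < peps
            for acc in accs
        )
        result.append({"accessions": sorted(groups[peps]), "sub_proteins": subs})
    return result


example = {
    "P1": ["a", "b", "c"],
    "P2": ["a", "b", "c"],
    "P3": ["c"],
    "P4": ["c", "d"],
    "P5": ["d"],
}
for group in occams_razor(example):
    print(group)
```

In this example P3 and P5 add no peptide that the reported groups do not already explain, so they appear only as sub proteins.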

Spectrum Extractor

The Spectrum Extractor is spectrum centric, in contrast to the two other implementations, which are peptide centric. The major difference is that a spectrum, once assigned to a peptide, is never assigned to another peptide. This is closer to reality, as in most cases one spectrum contains only one (possibly modified) peptide, although this peptide may not always get the highest score from the search engines.

In short, this inference method iteratively creates proteins from spectra that are not yet assigned and always assigns spectra to the best-scoring protein group, until all (possibly filtered) spectra are assigned or no more proteins can be created with the given settings.
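
The iteration described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not PIA's implementation: higher PSM scores are taken to be better, a protein's score is assumed to be the sum of its best PSM score per unassigned spectrum, and the names `spectrum_extractor` and `min_protein_score` (standing in for "the given settings") are hypothetical.

```python
def spectrum_extractor(psms, peptide_to_proteins, min_protein_score=0.0):
    """Spectrum-centric inference sketch (illustrative only).

    psms: list of (spectrum_id, peptide, score) tuples, higher score = better.
    peptide_to_proteins: dict mapping peptide -> list of accessions.
    """
    assigned = {}  # spectrum_id -> peptide; never overwritten once set
    reported = []  # list of (accessions, score) in order of creation
    while True:
        # Build candidate proteins from spectra that are not yet assigned.
        hits = {}  # accession -> {spectrum_id: (score, peptide)}
        for spectrum, peptide, score in psms:
            if spectrum in assigned:
                continue
            for accession in peptide_to_proteins.get(peptide, ()):
                per_spectrum = hits.setdefault(accession, {})
                if spectrum not in per_spectrum or score > per_spectrum[spectrum][0]:
                    per_spectrum[spectrum] = (score, peptide)
        if not hits:
            break  # no more proteins can be created

        scores = {acc: sum(s for s, _ in h.values()) for acc, h in hits.items()}
        top = max(scores.values())
        if top < min_protein_score:
            break  # remaining candidates do not pass the assumed threshold

        winner = min(acc for acc, s in scores.items() if s == top)
        # Accessions with exactly the same evidence form one protein group.
        group = sorted(acc for acc, h in hits.items() if h == hits[winner])
        reported.append((group, top))
        for spectrum, (_, peptide) in hits[winner].items():
            assigned[spectrum] = peptide  # the spectrum is now used up
    return reported, assigned


psms = [("s1", "PEPA", 10.0), ("s1", "PEPB", 8.0),
        ("s2", "PEPB", 6.0), ("s3", "PEPC", 5.0)]
peptide_to_proteins = {"PEPA": ["P1"], "PEPB": ["P2"], "PEPC": ["P1", "P3"]}
groups, assignment = spectrum_extractor(psms, peptide_to_proteins)
print(groups)
print(assignment)
```

In this example spectrum s1 is claimed by P1 through PEPA in the first round, so its lower-scoring match to PEPB no longer counts for P2, and P3 is never created because its only spectrum is already assigned.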