The X! search engine project

X! Search Engine Development

  X! TANDEM Spectrum Modeler

X! Tandem is open source software that can match tandem mass spectra with peptide sequences, in a process that has come to be known as protein identification.

This software has a very simple, unsophisticated application programming interface (API): it takes an XML file of instructions on its command line and writes the results to an XML file whose path is specified in the input XML file. The output format is described here (PDF). This format is used for all of the X! series search engines, as well as the GPM and GPMDB.
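As a rough illustration of the "XML in, XML out" interface described above, the following Python sketch builds a small instruction file and the matching command line. The parameter labels ("spectrum, path", "output, path", and so on) follow the style of the sample input files distributed with X! Tandem, but they should be treated as assumptions and checked against the current documentation. The file names, the taxon, the executable name "tandem", and the helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_input_xml(spectrum_path, output_path, taxon,
                    defaults_path="default_input.xml",
                    taxonomy_path="taxonomy.xml"):
    """Return an instruction document as a string.

    The labels below are assumed from X! Tandem's sample input files;
    the default file names are hypothetical placeholders.
    """
    root = ET.Element("bioml")
    params = {
        "list path, default parameters": defaults_path,
        "list path, taxonomy information": taxonomy_path,
        "protein, taxon": taxon,
        "spectrum, path": spectrum_path,
        # The output location is given inside the input file itself.
        "output, path": output_path,
    }
    for label, value in params.items():
        note = ET.SubElement(root, "note", {"type": "input", "label": label})
        note.text = value
    return ET.tostring(root, encoding="unicode")


def build_command(input_xml_path, executable="tandem"):
    """Return the argument list: the executable plus the one XML file it reads."""
    return [executable, input_xml_path]
```

To run a search, one would write the string returned by build_input_xml to a file such as input.xml and pass the list from build_command("input.xml") to subprocess.run. The results would then appear at the path given under "output, path".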

Unlike some earlier-generation search engines, all of the X! series search engines calculate statistical confidence (expectation values) for all of the individual spectrum-to-sequence assignments. They also reassemble all of the peptide assignments in a data set onto the known protein sequences and assign the statistical confidence that this assembly and alignment is non-random; the formula for this confidence can be found here. Therefore, separate assembly and statistical analysis software, e.g. PeptideProphet and ProteinProphet, does not need to be used.

   Latest release: CYCLONE (2010.12.01)
This is the first release in the CYCLONE project. There are numerous small fixes and changes from the last TORNADO release, mainly aimed at improving the speed of the application. Some of the new features are listed below.
  1. An improved scoring function for ETD data, incorporating the ideas described in Sun, R-X, et al. J. Proteome Res. 2010 (DOI: 10.1021/pr100648r).
  2. A more complete implementation of the mzML v. 1.1.0 file format (in collaboration with Fredrik Levander).
  3. A mechanism for reading the fragmentation type from mzXML files, when available. This mechanism allows X! Tandem to read mzXML files that contain mixtures of CID/HCD and ETD generated spectra and correctly apply the appropriate set of fragment ions to the individual spectra for interpretation (in collaboration with Peter Lobel).
  4. A change to the interpretation of the "refine, unanticipated cleavage" directive: it now specifies a "semi"-type cleavage rather than a full non-specific cleavage. The previous behavior can be obtained using the new "refine, full unanticipated cleavage" directive.
  5. An improved implementation of the "quick acetyl" checking mechanism brought out in the last TORNADO release.
  6. Explicit use of SIMD pragmas in the Windows version to speed up the native X! Tandem scoring function.
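The difference between "semi"-type and full non-specific cleavage (item 4 above) can be sketched in a few lines of Python. This is only an illustration of the general idea, not X! Tandem's implementation. It assumes a trypsin-like rule (cut after K or R unless the next residue is P), takes "semi" to mean that at least one end of the peptide lies on a cleavage site, and places no limit on missed cleavages. The function names and the example sequence are hypothetical.

```python
def cleavage_sites(protein):
    """Return the cut positions: both termini, plus after K/R when not followed by P."""
    sites = {0, len(protein)}
    for i in range(1, len(protein)):
        if protein[i - 1] in "KR" and protein[i] != "P":
            sites.add(i)
    return sites


def peptides(protein, mode, min_len=2):
    """Enumerate candidate peptides under one of three cleavage modes.

    mode = "full": both ends must be cleavage sites
    mode = "semi": at least one end must be a cleavage site
    mode = "none": any substring is allowed (full non-specific cleavage)
    """
    required = {"full": 2, "semi": 1, "none": 0}[mode]
    sites = cleavage_sites(protein)
    found = set()
    for start in range(len(protein)):
        for end in range(start + min_len, len(protein) + 1):
            # Count how many of the two ends fall on a cleavage site.
            specific_ends = (start in sites) + (end in sites)
            if specific_ends >= required:
                found.add(protein[start:end])
    return found


full = peptides("AKCDRE", "full")
semi = peptides("AKCDRE", "semi")
nonspecific = peptides("AKCDRE", "none")
```

In this toy example, "CD" appears in the semi set because it starts at a cleavage site, while "KC" appears only in the non-specific set. The semi search space sits between the fully specific and fully non-specific ones, so a refinement pass has fewer candidates to examine than with full non-specific cleavage.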
Copyright © 2004-2011, The Global Proteome Machine Organization