The Global Proteome Machine Organization
  www.thegpm.org

  HUNTER project

X! HUNTER ASL file format (2006.06.01)

NOTE: This format has been replaced by the 2006.09.15 format.

The X! Hunter Annotated Spectrum Library (ASL) system stores spectra and their annotations in a binary format, designed so that library data can be loaded as quickly as possible. The structure of this format and all of its required data fields are specified below. All multi-byte values are stored in little-endian byte order.
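
As a small illustration of the byte order, the following Python sketch packs and unpacks a 4-byte unsigned int with the standard `struct` module. The value 20 is an arbitrary example, not taken from any real library file.

```python
import struct

# "<" selects little-endian with no padding; "I" is a 4-byte unsigned int.
count_bytes = struct.pack("<I", 20)

# The least significant byte comes first on disk.
print(count_bytes.hex())  # 14000000

# Reading the value back uses the same format string.
(count,) = struct.unpack("<I", count_bytes)
print(count)  # 20
```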

The first 4 bytes of any ASL file hold an unsigned int, T, giving the number of annotated spectra in the file. The T annotated spectra follow sequentially, each stored in the following format:

  1. 8-byte double: parent ion M+H (Daltons);
  2. 4-byte int: parent ion charge;
  3. 4-byte float: sum of the squares of the fragment ion intensities;
  4. 4-byte int: length of the peptide sequence, L;
  5. L-byte char array: peptide sequence;
  6. 4-byte int: number of spectrum intensity-m/z pairs, P;
  7. P-byte unsigned char array: spectrum intensities;
  8. P*4-byte float array: spectrum m/z values;
  9. 4-byte int: number of sequence modifications, M;
  10. M modification objects, each containing:
    • 4-byte int: modification sequence position;
    • 8-byte double: modification mass.
  11. 4-byte int: number of protein sequences containing the peptide, N;
  12. N protein objects, each containing:
    • 4-byte int: length of protein sequence accession string, S;
    • S-byte char array: protein sequence accession string;
    • 4-byte int: position of peptide in protein sequence.
  13. Repeat items 1 to 12 until all T spectra have been loaded.
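
The record layout above can be sketched as a small Python reader. This is a minimal illustration, not the X! Hunter implementation. It assumes that "int" fields are signed 32-bit values, that char arrays are ASCII with no terminating null, and that records are packed with no padding. The names `read_asl` and `_read` and the dictionary keys are hypothetical.

```python
import struct
from io import BytesIO


def _read(stream, fmt):
    """Read and unpack one little-endian struct from the stream."""
    size = struct.calcsize(fmt)
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("unexpected end of ASL data")
    return struct.unpack(fmt, data)


def read_asl(stream):
    """Return a list of dicts, one per annotated spectrum (2006.06.01 layout)."""
    (total,) = _read(stream, "<I")  # T: number of annotated spectra
    records = []
    for _ in range(total):
        # Items 1-4: parent M+H, charge, sum of squared intensities, L.
        mh, charge, sum_sq, seq_len = _read(stream, "<difi")
        # Item 5: peptide sequence (assumed ASCII, not null-terminated).
        (seq,) = _read(stream, f"<{seq_len}s")
        # Items 6-8: P, then P intensity bytes, then P float m/z values.
        (n_peaks,) = _read(stream, "<i")
        intensities = list(_read(stream, f"<{n_peaks}B"))
        mz = list(_read(stream, f"<{n_peaks}f"))
        # Items 9-10: M modifications as (position, mass) pairs.
        (n_mods,) = _read(stream, "<i")
        mods = [_read(stream, "<id") for _ in range(n_mods)]
        # Items 11-12: N proteins as (accession, peptide position) pairs.
        (n_proteins,) = _read(stream, "<i")
        proteins = []
        for _ in range(n_proteins):
            (acc_len,) = _read(stream, "<i")
            (acc,) = _read(stream, f"<{acc_len}s")
            (pos,) = _read(stream, "<i")
            proteins.append((acc.decode("ascii"), pos))
        records.append({
            "mh": mh,
            "charge": charge,
            "sum_sq": sum_sq,
            "sequence": seq.decode("ascii"),
            "intensities": intensities,
            "mz": mz,
            "mods": mods,
            "proteins": proteins,
        })
    return records
```

A file on disk would be read with `open(path, "rb")` passed to `read_asl`. Because each record contains variable-length fields, records cannot be skipped without parsing their length fields, so the sketch reads the file strictly in order.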

NOTES:

  1. The spectra are not stored in any particular order: spectra associated with the same protein may be located anywhere within the file.
  2. Annotations are based on sequence accession numbers for particular sequence collections, e.g., ENSEMBL, IPI or SWISS-PROT protein accession numbers.
  3. X! Hunter ASLs store the twenty (20) most intense peaks for a particular MS/MS spectrum.
  4. Parent ion masses are calculated based on the mono-isotopic masses of the peptide residues.
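
To illustrate note 4, the following Python sketch computes a singly protonated mono-isotopic parent mass (M+H) from a peptide sequence. It uses standard mono-isotopic residue masses rounded to five decimal places. Adding the modification masses to the total is an assumption, since the format description does not say whether the stored M+H includes them. The name `parent_mh` is hypothetical.

```python
# Mono-isotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added for the peptide termini
PROTON = 1.007276   # charge carrier for the M+H form


def parent_mh(sequence, mod_masses=()):
    """Sum residue masses, optional modification masses, water and a proton."""
    residues = sum(RESIDUE_MASS[aa] for aa in sequence)
    return residues + sum(mod_masses) + WATER + PROTON
```

For example, `parent_mh("GG", [15.994915])` adds one modification of 15.994915 Da to the unmodified diglycine mass. A residue letter outside the table raises `KeyError`, which makes unsupported sequences fail loudly.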

Copyright © 2006, The Global Proteome Machine Organization