Quickstart
Here we briefly introduce spectrum_utils’ spectrum processing and visualization functionality:
Load a spectrum from an online data resource by its Universal Spectrum Identifier (USI).
Restrict the mass range to 100–1400 m/z to filter out irrelevant peaks.
Remove the precursor peak.
Remove low-intensity noise peaks by retaining only peaks that are at least 5% of the base peak intensity, and restrict the total number of peaks to the 50 most intense peaks.
Scale the peak intensities by their square root to de-emphasize overly intense peaks.
Annotate peaks corresponding to a, b, and y peptide fragments in the spectrum based on a ProForma 2.0 peptide string.
Visualize the spectrum with the annotated peaks highlighted.
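To make the filtering and scaling steps above concrete, here is a minimal pure-Python sketch of the underlying arithmetic on a toy peak list. It is independent of spectrum_utils and is not the library's implementation: the function name `process_peaks`, its parameters, the inclusive m/z boundaries, and the toy values are all assumptions made for illustration. Precursor peak removal and annotation are omitted.

```python
import math


def process_peaks(peaks, min_mz=100.0, max_mz=1400.0,
                  min_rel_intensity=0.05, max_num_peaks=50):
    """Filter and scale a list of (mz, intensity) tuples (illustrative only)."""
    # 1. Keep only peaks inside the m/z window (boundaries assumed inclusive).
    kept = [(mz, i) for mz, i in peaks if min_mz <= mz <= max_mz]
    if not kept:
        return []
    # 2. Drop peaks below the given fraction of the base (most intense) peak.
    base_intensity = max(i for _, i in kept)
    kept = [(mz, i) for mz, i in kept if i >= min_rel_intensity * base_intensity]
    # 3. Keep at most max_num_peaks of the most intense peaks,
    #    then restore ascending m/z order.
    kept = sorted(kept, key=lambda p: p[1], reverse=True)[:max_num_peaks]
    kept.sort(key=lambda p: p[0])
    # 4. Square-root scale the intensities to de-emphasize intense peaks.
    return [(mz, math.sqrt(i)) for mz, i in kept]


# Hypothetical toy peaks: two lie outside 100-1400 m/z, one is below 5% of the base peak.
toy_peaks = [(90.0, 50.0), (150.0, 4.0), (300.0, 100.0), (450.0, 25.0), (1500.0, 80.0)]
processed = process_peaks(toy_peaks)
print(processed)  # [(300.0, 10.0), (450.0, 5.0)]
```

Note that the base peak is determined after the m/z window has been applied, which mirrors the order of the steps listed above.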
IO functionality to read spectra from MS data files is not directly included in spectrum_utils. Instead, you can use dedicated libraries such as Pyteomics or pymzML, which read a variety of mass spectrometry data formats.
import matplotlib.pyplot as plt
import spectrum_utils.plot as sup
import spectrum_utils.spectrum as sus
# Retrieve the spectrum by its USI.
usi = "mzspec:PXD004732:01650b_BC2-TUM_first_pool_53_01_01-3xHCD-1h-R2:scan:41840"
peptide = "WNQLQAFWGTGK"
spectrum = sus.MsmsSpectrum.from_usi(usi)
# Process the spectrum.
fragment_tol_mass, fragment_tol_mode = 10, "ppm"
spectrum = (
    spectrum.set_mz_range(min_mz=100, max_mz=1400)
    .remove_precursor_peak(fragment_tol_mass, fragment_tol_mode)
    .filter_intensity(min_intensity=0.05, max_num_peaks=50)
    .scale_intensity("root")
    .annotate_proforma(
        peptide, fragment_tol_mass, fragment_tol_mode, ion_types="aby"
    )
)
# Plot the spectrum.
fig, ax = plt.subplots(figsize=(12, 6))
sup.spectrum(spectrum, grid=False, ax=ax)
ax.spines["right"].set_visible(False)
ax.spines["top"].set_visible(False)
plt.savefig("quickstart.png", bbox_inches="tight", dpi=300, transparent=True)
plt.close()
As demonstrated, each of the processing steps can be achieved using a single, high-level function call. These calls can be chained together to easily perform multiple processing steps.
Spectrum plotting can similarly be achieved using a high-level function call, resulting in the following figure:
Note that several processing steps modify the peak m/z and intensity values and are thus not idempotent.
It is recommended to make a copy of the MsmsSpectrum object prior to any processing if the raw peak values need to remain available as well.
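The following sketch illustrates why such steps are not idempotent and how copying first preserves the raw values. `ToySpectrum` and its `scale_root` method are hypothetical stand-ins written for this example; they are not part of spectrum_utils, and whether `copy.deepcopy` is the appropriate way to copy an MsmsSpectrum in your installed version is an assumption you should verify.

```python
import copy
import math


class ToySpectrum:
    """Hypothetical stand-in for a spectrum object (not the spectrum_utils class)."""

    def __init__(self, intensity):
        self.intensity = list(intensity)

    def scale_root(self):
        # Overwrites the stored intensities in place and returns self for chaining.
        self.intensity = [math.sqrt(i) for i in self.intensity]
        return self


raw = ToySpectrum([100.0, 25.0, 4.0])

# Process a copy so that the raw intensities remain available.
once = copy.deepcopy(raw).scale_root()
# Applying the same step again changes the values a second time.
twice = copy.deepcopy(once).scale_root()

print(raw.intensity)   # [100.0, 25.0, 4.0]
print(once.intensity)  # [10.0, 5.0, 2.0]
print(twice.intensity == once.intensity)  # False
```

Without the copy, `raw` itself would hold the scaled intensities after the first call, and the original values could not be recovered from the object.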