HPA data contributes to multimodal cell maps



In 2021, Trey Ideker's team at UC San Diego, in collaboration with Emma Lundberg's team at KTH and Stanford, presented a new way of creating hierarchical maps of subcellular protein organization by integrating immunofluorescence image data from the HPA with protein interaction data. The resulting cell map was referred to as the multi-scale integrated cell (MuSIC 1.0). Four years later, the same team has released the next version of this multimodal cell map, again accompanied by a prestigious publication in Nature.

The cell is a multi-scale structure with a modular and dynamic organization of components across several orders of magnitude. Understanding this complex subcellular organization and its relation to biological functions in human health and disease is one of the fundamental goals of the biological sciences. Creating holistic maps that are robust and span physical scales requires combining and integrating data from complementary mapping methods. Two major methods for mapping protein subcellular organization are microscopy-based visualization and protein-protein interaction network analysis. Schaffer et al. used a self-supervised machine learning approach to computationally integrate protein localization data from immunofluorescence (IF) images generated within the HPA with protein interaction data generated by affinity purification mass spectrometry (AP-MS). The resulting cell map covers 5,100 proteins in U2-OS osteosarcoma cells and resolves 275 molecular assemblies, ranging from nanometer to micrometer scale. The assemblies have been annotated using a combination of manual curation and the generative large language model (LLM) GPT-4, and validated using proteome-wide size-exclusion chromatography mass spectrometry (SEC-MS).
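To make the idea of integrating two data modalities into a multi-scale map more concrete, here is a minimal toy sketch in Python. It is not the self-supervised method used in the publication: it simply concatenates normalized feature vectors and groups proteins by cosine similarity at several thresholds. The protein names (P1 to P4), the feature values, and the helper functions `unit`, `fuse`, `cosine` and `assemblies` are all hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def unit(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)


def fuse(image_vec, interaction_vec):
    """Toy fusion: concatenate the two normalized modalities, then renormalize.

    Assumption for illustration only; the publication uses a learned,
    self-supervised embedding rather than plain concatenation.
    """
    return unit(unit(image_vec) + unit(interaction_vec))  # '+' concatenates lists


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (their dot product)."""
    return sum(x * y for x, y in zip(a, b))


def assemblies(embeddings, threshold):
    """Group proteins into connected components of the graph that links
    every pair with similarity >= threshold (simple union-find)."""
    parent = {p: p for p in embeddings}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path compression
            p = parent[p]
        return p

    for a, b in combinations(embeddings, 2):
        if cosine(embeddings[a], embeddings[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for p in embeddings:
        groups.setdefault(find(p), set()).add(p)
    return sorted((frozenset(g) for g in groups.values()), key=lambda g: sorted(g))


# Hypothetical toy features: one vector per protein and modality.
image_features = {"P1": [1.0, 0.0], "P2": [1.0, 0.1], "P3": [0.9, 0.3], "P4": [0.0, 1.0]}
interaction_features = {"P1": [1.0, 0.0], "P2": [0.9, 0.1], "P3": [0.0, 1.0], "P4": [0.0, 1.0]}

fused = {p: fuse(image_features[p], interaction_features[p]) for p in image_features}

# Lowering the threshold merges small assemblies into larger ones,
# giving a nested (hierarchical) view of the same proteins.
for threshold in (0.9, 0.6, 0.5):
    groups = assemblies(fused, threshold)
    print(threshold, [sorted(g) for g in groups])
```

In this toy setting, a strict threshold yields small, tightly related groups, while looser thresholds merge them into progressively larger ones, loosely mirroring the idea of assemblies nested across scales.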

Intriguingly, the authors demonstrate that this hierarchical map can be used as a guide to explore various aspects of structural and functional biology, including potential new functions for 975 of the proteins. Comparing the new cell map of U2-OS with that of HEK293T (MuSIC 1.0) also made clear that many protein assemblies display cell-type-specific patterns of biophysical interactions, providing a basis for studying different cell phenotypes and possibly identifying cell-type-specific drug targets. In the future, we envision cell maps that integrate data on the concentration, localization, interactions and structure of various biomolecules across both length and time scales, ultimately approaching virtual cell models. Such models will help us understand and predict complex cellular phenotypes, thereby accelerating life sciences and translational research.

The map is available for exploration at http://musicmaps.ai/u2os-cellmap/