The Human Protein Atlas
The Human Protein Atlas is a Swedish-based program initiated in 2003 with the aim of mapping all human proteins in cells, tissues, and organs by integrating various omics technologies, including antibody-based imaging, mass spectrometry-based proteomics, transcriptomics, and systems biology. All the data in the knowledge resource is open access, so that scientists in both academia and industry can freely explore the human proteome.
The Human Protein Atlas program has already contributed to several thousand publications in the field of human biology and disease, and it has been selected by the organization ELIXIR (www.elixir-europe.org) as a European core resource due to its fundamental importance for the wider life science community. The Human Protein Atlas consortium is mainly funded by the Knut and Alice Wallenberg Foundation. A full publication list is available from the Human Protein Atlas.
Tissue
This section of the Human Protein Atlas focuses on the expression profiles of genes in human tissues at both the mRNA and protein level. The protein expression data from 44 normal human tissue types is derived from antibody-based protein profiling using conventional and multiplex immunohistochemistry. All underlying images of immunohistochemically stained normal tissues are available together with knowledge-based annotation of protein expression levels. The protein data covers 15318 genes (76%) for which antibodies are available. The mRNA expression data is derived from deep sequencing of RNA (RNA-seq) from 256 different normal tissue types. More information about the specific content and the generation and analysis of the data in the section can be found in the Methods Summary.
Example: FCAMR. Selective microvilli expression in proximal renal tubules; group enriched in kidney and lymphoid tissue at the mRNA level.
Brain
The Brain section gives an overview of protein expression and distribution in the mammalian brain. Externally and in-house generated data are integrated to explore regional protein expression in the human, pig and mouse brain. Protein expression data are based on quantification of messenger RNA using RNA sequencing techniques and in situ hybridization. Protein distribution data are generated using antibody-based immunohistochemistry and immunofluorescence techniques. The Brain section can be used to create an overview of regional and cross-species expression of proteins of interest, or to identify regionally or functionally clustered genes based on expression levels across regions of the brain. More information about the specific content and the generation and analysis of the data in this section can be found in the Methods Summary.
Example: NECAB1. Subsets of neurons show distinct somato-dendritic immunoreactivity throughout the brain. The image shows protein location in subsets of neurons in the hippocampus of mouse brain.
Single Cell Type
This section contains Single Cell Type information based on single cell RNA sequencing (scRNAseq) data from 29 human tissues and peripheral blood mononuclear cells (PBMCs). The data is linked to in-house generated immunohistochemically stained tissue sections presented in the Tissue section in order to visualize the corresponding spatial protein expression patterns. The scRNAseq analysis was based on publicly available genome-wide expression data and comprises all protein-coding genes in 536 individual cell type clusters corresponding to 15 different cell type groups. A specificity classification was performed to determine the number of genes elevated in these single cell types. The genes expressed in each of the cell types can be explored in interactive UMAP plots and bar charts, with links to corresponding immunohistochemical stainings in human tissues. More information about the specific content and the generation and analysis of the data in the section can be found in the Methods Summary.
Example: TSPY2. Selective nuclear expression in spermatogonia at the protein level, enriched in spermatogonia at the mRNA level.
Tissue Cell Type
The Tissue Cell Type section contains cell type expression specificity predictions for all human protein-coding genes, generated using integrated network analysis of publicly available bulk RNAseq data. A specificity classification is used to predict which genes are enriched in each constituent cell type within an individual tissue. The data can be explored on a tissue-by-tissue basis, together with in-house generated immunohistochemically stained tissue sections. In addition, a core cell type analysis focuses on the cell types found in all, or the majority, of the profiled tissues, e.g., endothelial cells or macrophages. Here, genes with predicted specificity in these core cell types in multiple tissues are detailed. More information about the specific content and data analysis in the section can be found in the Methods Summary.
Example: KRTAP2-1. Keratin associated protein 2-1. Selective expression in hair follicle cortex cells at the protein level; mRNA specificity prediction in skin: hair follicle cortex cells.
Pathology
This section contains Pathology information based on mRNA and protein expression data from 17 different forms of human cancer, together with millions of in-house generated images of immunohistochemically stained tissue sections and Kaplan-Meier plots showing the correlation between mRNA expression of each human protein-coding gene and cancer patient survival. More information about the specific content and the generation and analysis of the data in the section can be found in the Methods Summary. Cancer statistics from relevant international and Swedish databases are also summarized, and the hallmarks of cancer are described.
Example: MKI67. Nuclear expression in varying fractions of tumor cells in all cancer types at the protein level, and expression in all cancers at the mRNA level. High expression of this gene is associated with unfavorable prognosis in renal, liver and pancreatic cancer.
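The Kaplan-Meier plots mentioned in the Pathology section relate the mRNA expression of a gene to patient survival. As a rough illustration of what such a plot computes, the sketch below estimates survival curves for a high-expression and a low-expression group. The function name `kaplan_meier`, the patient values, and the median split are all hypothetical and chosen for illustration only; they are not taken from the Human Protein Atlas, whose actual cut-offs and analysis are described in its Methods Summary.

```python
from collections import Counter
from statistics import median


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with an observed death.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical cohort: (expression level, follow-up in months, 1 = death observed).
patients = [
    (12.0, 5, 1), (9.5, 8, 1), (11.2, 12, 0), (10.1, 14, 1),
    (3.1, 20, 1), (2.4, 30, 0), (4.0, 36, 0), (1.8, 40, 1),
]

# Assumption: split the cohort at the median expression level.
cutoff = median(p[0] for p in patients)
high = [p for p in patients if p[0] > cutoff]
low = [p for p in patients if p[0] <= cutoff]

for name, group in (("high", high), ("low", low)):
    curve = kaplan_meier([p[1] for p in group], [p[2] for p in group])
    print(name, [(t, round(s, 2)) for t, s in curve])
```

In this made-up cohort the high-expression group's curve drops earlier than the low-expression group's, which is the kind of pattern described as an unfavorable prognosis. A real analysis would also test whether the difference between the curves is statistically significant.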
Disease Blood
The Human Disease Blood Atlas contains information on protein levels in the blood of patients with different diseases and highlights proteins associated with these diseases using differential expression analysis and a disease prediction strategy based on machine learning. In this version, a pan-cancer study is reported covering 1463 proteins quantified by Proximity Extension Assay (PEA) and 146 proteins quantified by isotope dilution strategies based on the addition of recombinant protein fragment standards, the gold standard of quantitative mass spectrometry. Protein profiles have been quantified across 12 major cancer types. More information about the specific content and the generation and analysis of the data in the section can be found in the Methods Summary.
Example: The proteins predicted by the model to be associated with prostate cancer in the pan-cancer study.
Immune Cell
The Immune Cell section contains single cell information on genome-wide RNA expression profiles of human protein-coding genes covering various B- and T-cells, monocytes, granulocytes and dendritic cells. The transcriptomics analysis covers 18 cell types isolated with cell sorting and includes classification based on specificity, distribution and expression cluster across all immune cells. More information about the specific content and the generation and analysis of the data in the section can be found in the Methods Summary.
Example: The expression of the tumor metastasis suppressor CD82 in 18 different types of immune cells and PBMC.
Blood Protein
The Blood Protein section presents estimated plasma concentrations of the proteins detected in human blood from mass spectrometry-based proteomics studies, published immunoassay data and a longitudinal study based on proximity extension assay (PEA). Further, an analysis of the “human secretome” is presented, including annotation of the genes predicted to be actively secreted to human blood, as well as to other compartments or organ systems of the human body such as the digestive tract or the brain. More information about the specific content and the generation and analysis of the data in this section can be found in the Methods Summary.
Example: CP. The violin plot shows the concentration in blood for proteins with different types of function based on immunoassays. The red square in the turquoise Transport category indicates the concentration of the glycoprotein Ceruloplasmin, which is involved in iron transport across the cell membrane.
Subcellular
The Subcellular section of the Human Protein Atlas provides high-resolution insights into the expression and spatiotemporal distribution of proteins encoded by 13105 genes (65% of the human protein-coding genes). For each gene, the subcellular distribution of the protein has been investigated by immunofluorescence (ICC-IF) and confocal microscopy in up to three different cell lines, selected from a subset of 37 of the cell lines found in the Cell Line section. Upon image analysis, the subcellular localization of the protein has been classified into one or more of 35 different organelles and fine subcellular structures. In addition, the section includes an annotation of genes that display single-cell variation in protein expression levels and/or subcellular distribution, as well as an extended analysis of cell cycle dependency of such variations. The Subcellular section offers a database for detailed exploration of individual genes and proteins of interest, as well as for systematic analysis of proteomes in a broader context. More information about the content of the section, as well as the generation and analysis of the data, can be found in the Methods Summary.
Example: CCNB1. The protein localizes to the cytosol in human and mouse cells, and is expressed in a cell cycle-dependent manner. The location has been validated by siRNA-mediated gene silencing, analysis of GFP-tagged protein and independent antibodies.
Cell Line
The Cell Line section contains information on genome-wide RNA expression profiles of human protein-coding genes in 1055 human cell lines, including 985 cancer cell lines. The transcriptomics analysis includes classification based on specificity analysis across 27 cancer types, distribution and expression cluster analysis across all cell lines and, for selected cancer types, analysis of the similarity of the cell lines to their corresponding cancer type. More information about the specific content and the generation and analysis of the data in the section can be found in the Methods Summary.
Example: The RNA expression of the gene A4GALT in 1055 cell lines grouped according to origin into 27 cancer types, a non-cancerous group including other diseases, and an uncategorised group including cell lines resulting from immortalization of normal cells, primary cell lines and induced pluripotent stem cells.
Structure
The Structure section contains information about the three-dimensional structure of human proteins. The predicted 3D structure from the AlphaFold Protein Structure Database project is shown together with experimentally determined structures from the Protein Data Bank (PDB). All antigens with known sequence have been mapped and can be displayed on the protein structures. Additionally, the amino acid positions of population variants and variants with known clinical relevance in the Ensembl variation database can be shown. More information about the specific content and the generation and analysis of the data in the section can be found in the Methods Summary.
Example: The predicted structure from AlphaFold of the membrane-protein receptor EGFR.
Metabolic
The Metabolic section enables exploration of protein function and tissue-specific gene expression in the context of the most curated human metabolic network. For proteins involved in metabolism, a metabolic summary is provided that describes the metabolic subsystems/pathways, cellular compartments, and number of reactions associated with the protein. Over 120 manually curated metabolic pathway maps facilitate the visualization of each protein's participation in different metabolic processes. Each pathway map is accompanied by a heatmap detailing the mRNA levels across 256 different tissue types for all proteins involved in the metabolic pathway. More information about the human metabolic network, including how it was generated and what information it provides, can be found in the Methods Summary.
Example: A part of the Fructose and mannose metabolism network showing reactions involving the gene HKDC1 (labelled in red).