The Human Tissue Specific Proteome

All human genes (approximately 20000) are classified according to their expression across all major organs and tissue types in the human body. Few genes are strictly tissue specific; however, genes with elevated expression in particular tissues are an interesting starting point for understanding their biology and function, as well as the underlying mechanisms of disease.

  • A total of 10986 genes are elevated in at least one of the analyzed tissues, of which:
      • 3106 are tissue enriched genes
      • 1628 are group enriched genes
      • 6252 are enhanced genes


Transcriptome analysis of all major organs and tissue types in the human body can be visualized with regard to the specificity and distribution of transcribed mRNA molecules across all 20090 putative protein coding genes (Figure 1). Specificity illustrates the number of genes with elevated or non-elevated expression in a particular tissue compared to other tissues. The analysis identifies 10986 genes with elevated expression and 8243 genes with low tissue specificity (read more in The housekeeping proteome). Elevated expression is divided into three subcategories:

  • Tissue enriched: At least four-fold higher mRNA level in a particular tissue compared to any other tissue.
  • Group enriched: At least four-fold higher average mRNA level in a group of 2-5 tissues compared to any other tissue.
  • Tissue enhanced: At least four-fold higher mRNA level in a particular tissue compared to the average level in all other tissues.
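The three rules above can be sketched in Python. This is a minimal illustration rather than the Human Protein Atlas pipeline: the function name `classify_specificity`, the input format (a dict mapping tissue name to nTPM), the order in which the rules are tested, and the use of nTPM≥1 as a precondition for calling a gene elevated are all assumptions made here for the example.

```python
from statistics import mean

FOLD = 4.0       # fold-change threshold stated in the text
MIN_NTPM = 1.0   # assumption: the top tissue must be detectable (nTPM >= 1)

def classify_specificity(ntpm):
    """Classify one gene from a {tissue: nTPM} mapping (hypothetical helper)."""
    ranked = sorted(ntpm.values(), reverse=True)
    if len(ranked) < 2 or ranked[0] < MIN_NTPM:
        return "not detected"
    # Tissue enriched: the top tissue is >= 4x any other tissue.
    if ranked[0] >= FOLD * ranked[1]:
        return "tissue enriched"
    # Group enriched: the mean of the top 2-5 tissues is >= 4x any tissue
    # outside the group (the highest outsider is ranked[k]).
    for k in range(2, 6):
        if k >= len(ranked):
            break
        if mean(ranked[:k]) >= FOLD * ranked[k]:
            return "group enriched"
    # Tissue enhanced: the top tissue is >= 4x the mean of all other tissues.
    if ranked[0] >= FOLD * mean(ranked[1:]):
        return "tissue enhanced"
    return "low tissue specificity"

example = {"liver": 100, "kidney": 10, "lung": 5}
print(classify_specificity(example))
```

Comparisons are written as `a >= FOLD * b` rather than as ratios so that tissues with zero expression do not cause a division by zero.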

Distribution, on the other hand, visualizes how many genes have, or do not have, detectable levels (nTPM≥1) of transcribed mRNA molecules. As evident in Table 1, all elevated genes are categorized as:

  • Detected in single: Detected in a single tissue
  • Detected in some: Detected in more than one but less than one third of tissues
  • Detected in many: Detected in at least a third but not all tissues
  • Detected in all: Detected in all tissues
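The distribution categories can be expressed as a small function. This is a sketch under stated assumptions: the name `classify_distribution` and the dict input are hypothetical, and the "not detected" branch for genes below the cutoff in every tissue is added here for completeness rather than taken from the list above.

```python
def classify_distribution(ntpm, cutoff=1.0):
    """Classify one gene by the number of tissues with nTPM >= cutoff."""
    n_total = len(ntpm)
    n_detected = sum(1 for value in ntpm.values() if value >= cutoff)
    if n_detected == 0:
        return "not detected"        # assumption: not part of the list above
    if n_detected == 1:
        return "detected in single"
    if n_detected == n_total:
        return "detected in all"
    # "Less than one third" is tested with integers to avoid rounding issues.
    if 3 * n_detected < n_total:
        return "detected in some"
    return "detected in many"

# Hypothetical gene detected in 12 of 36 tissues (exactly one third).
gene = {f"tissue_{i}": (5.0 if i < 12 else 0.0) for i in range(36)}
print(classify_distribution(gene))
```

With 36 tissues, a gene detected in 11 tissues falls into "detected in some", while one detected in 12 tissues reaches the one-third boundary and is "detected in many".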

Figure 1. (A) Specificity: the distribution of all genes across the five categories based on transcript specificity in all 37 analyzed tissues. (B) Distribution: the distribution of all genes across the six categories based on transcript detection (nTPM≥1) in all 37 analyzed tissues.


Table 1. The number of genes in the subdivided categories of elevated expression in all 37 analyzed tissues.

Distribution in the 36 tissues (columns) by specificity (rows)

Specificity       Detected in single   Detected in some   Detected in many   Detected in all   Total
Tissue enriched   877                  1329               735                165               3106
Group enriched    0                    874                636                118               1628
Tissue enhanced   177                  1079               3211               1785              6252
Total             1054                 3282               4582               2068              10986
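The marginal totals of Table 1 can be checked with a few lines of Python. The cell values below are the four counts per row as read from the table; the helper name `check_margins` and the data layout are illustrative assumptions.

```python
# Each row: ([single, some, many, all], stated row total), as read from Table 1.
table1 = {
    "Tissue enriched": ([877, 1329, 735, 165], 3106),
    "Group enriched": ([0, 874, 636, 118], 1628),
    "Tissue enhanced": ([177, 1079, 3211, 1785], 6252),
}
total_row = ([1054, 3282, 4582, 2068], 10986)

def check_margins(rows, totals):
    """Return True if every row sum and column sum matches the stated totals."""
    rows_ok = all(sum(cells) == total for cells, total in rows.values())
    n_cols = len(totals[0])
    col_sums = [sum(cells[i] for cells, _ in rows.values()) for i in range(n_cols)]
    cols_ok = col_sums == totals[0] and sum(totals[0]) == totals[1]
    return rows_ok and cols_ok

print(check_margins(table1, total_row))
```

Every row and column adds up, so the three specificity categories partition the 10986 elevated genes without overlap.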

The number of tissue elevated genes varies considerably between the analyzed tissue types (see Table 2 below). Testis shows the highest number of tissue enriched genes (n=908), followed by the brain (n=525) and liver (n=269). When all tissue elevated genes are taken into consideration, however, the brain has a higher number (n=2685) than the testis (n=1968). The large number of enriched genes in testis is considered to be due to the highly specialized processes occurring during spermatogenesis. Many of these genes likely share expression with oocytes in the female ovaries. Oocytes are, however, difficult to analyze because of the complex kinetics of female germ cell development, including the first rounds of meiosis, which in females occur at the embryonic stage. As expected, tissues with similar functions and morphology often share higher numbers of group enriched genes.

In addition to previously known proteins, the analysis also identified a large number of genes with tissue elevated expression patterns that were previously poorly characterized and with no or only scarce evidence of existence at protein level. The combined RNA and antibody-based profiling can thus be used to confirm the physiological functions of such protein coding genes lacking previous annotation. These proteins are interesting starting points for further in-depth studies to gain a better understanding of the molecular mechanisms of the various cellular phenotypes that define the function of each respective tissue and organ.


Table 2. The tissue elevated genes.

Tissue              Tissue enriched   Group enriched   Tissue enhanced   Total elevated
Choroid plexus      21                187              256               464
Brain               525               616              1544              2685
Retina              123               233              388               744
Pituitary gland     21                123              150               294
Thyroid gland       10                30               131               171
Parathyroid gland   26                39               139               204
Adrenal gland       24                57               139               220
Lung                17                44               130               191
Salivary gland      42                90               186               318
Esophagus           21                73               335               429
Tongue              3                 232              255               490
Stomach             36                74               204               314
Intestine           121               243              577               941
Liver               269               171              533               973
Gallbladder         3                 12               66                81
Pancreas            60                76               175               311
Kidney              59                139              254               452
Urinary bladder     5                 33               150               188
Testis              908               295              765               1968
Epididymis          93                66               150               309
Prostate            14                28               85                127
Seminal vesicle     6                 11               52                69
Breast              18                36               81                135
Vagina              0                 35               113               148
Cervix              0                 42               134               176
Endometrium         2                 13               77                92
Fallopian tube      27                93               187               307
Ovary               5                 30               113               148
Placenta            65                52               169               286
Heart muscle        36                129              257               422
Skeletal muscle     52                274              592               918
Smooth muscle       0                 7                40                47
Adipose tissue      2                 31               183               216
Skin                188               97               327               612
Bone marrow         103               165              632               900
Lymphoid tissue     201               282              956               1439
Total               3106              1628             6252              10986
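As a small worked example, the ranking of tissues changes depending on whether only tissue enriched genes or all elevated genes are counted. The sketch below uses five rows copied from Table 2; the helper `rank_by` and the choice of rows are illustrative assumptions.

```python
# (tissue enriched, group enriched, tissue enhanced) for a few rows of Table 2.
table2_subset = {
    "Brain": (525, 616, 1544),
    "Liver": (269, 171, 533),
    "Testis": (908, 295, 765),
    "Skin": (188, 97, 327),
    "Lymphoid tissue": (201, 282, 956),
}

def rank_by(table, key):
    """Return tissue names sorted from highest to lowest by key(row)."""
    return sorted(table, key=lambda tissue: key(table[tissue]), reverse=True)

by_enriched = rank_by(table2_subset, lambda row: row[0])  # tissue enriched only
by_total = rank_by(table2_subset, sum)                    # all elevated genes

print(by_enriched[:3])
print(by_total[:3])
```

Among these rows, testis ranks first by tissue enriched genes, while brain ranks first once group enriched and tissue enhanced genes are included.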


Tissue elevated genes

The comprehensive analysis presented here has identified 10986 human genes that display a tissue elevated expression pattern across the human body. By combining the analysis with antibody-based protein profiling using immunohistochemistry, the exact location of the corresponding protein expression pattern at a cellular and subcellular level can be provided. Examples of protein expression patterns of tissue elevated genes are presented below.

Brain

  • GFAP (Glial fibrillary acidic protein) - astrocyte intermediate filament protein
  • MBP (Myelin basic protein) - a major constituent of the myelin sheath
  • ELAVL3 (ELAV like RNA binding protein 3) - neural-specific RNA-binding protein


GFAP - cerebral cortex

MBP - hippocampus

ELAVL3 - cerebral cortex

Retina

  • RHO (Rhodopsin) - involved in phototransduction in rod photoreceptors
  • ARR3 (Arrestin 3) - involved in phototransduction in cone photoreceptors


RHO - retina

ARR3 - retina

Endocrine tissues

  • FSHB (Follicle stimulating hormone beta subunit) - hormone inducing egg and sperm production
  • TG (Thyroglobulin) - substrate for the synthesis of thyroid hormones
  • HSD3B2 (Hydroxy-delta-5-steroid dehydrogenase, 3 beta- and steroid delta-isomerase 2) - involved in the biosynthesis of hormonal steroids


FSHB - pituitary gland

TG - thyroid gland

HSD3B2 - adrenal gland

Lung

  • SFTPA1 (Surfactant protein A1) - involved in surfactant homeostasis and the defense against respiratory pathogens
  • SFTPB (Surfactant protein B) - involved in surfactant homeostasis and the defense against respiratory pathogens


SFTPA1 - lung

SFTPB - lung

Proximal digestive tract

  • STATH (Statherin) - inhibits precipitation of calcium phosphate salts in the saliva
  • KRT4 (Keratin 4) - expressed in differentiated layers of mucosal and esophageal epithelia


STATH - salivary gland

KRT4 - esophagus

Gastrointestinal tract

  • PGA4 (Pepsinogen 4, group I (pepsinogen A)) - enzyme for digestion of dietary proteins
  • DEFA5 (Defensin alpha 5) - antimicrobial and cytotoxic peptide involved in host defense
  • KRT20 (Keratin 20) - maintains keratin filament organization in intestinal epithelia


PGA4 - stomach

DEFA5 - duodenum

KRT20 - colon

Liver & gallbladder

  • ALB (Albumin) - plasma protein
  • CYP2A13 (Cytochrome P450 family 2 subfamily A member 13) - involved in drug metabolism, cholesterol and steroid synthesis
  • CHST4 (Carbohydrate sulfotransferase 4) - an enzyme involved in the modification of glycan structures


ALB - liver

CYP2A13 - liver

CHST4 - gallbladder

Pancreas

  • AMY2A (Amylase, alpha 2A) - an enzyme that digests carbohydrates, secreted by exocrine cells
  • INS (Insulin) - involved in lowering of blood glucose, secreted by beta cells
  • GCG (Glucagon) - involved in the elevation of blood glucose, secreted by alpha cells


AMY2A - pancreas

INS - pancreas

GCG - pancreas

Kidney & urinary bladder

  • SLC22A13 (Solute carrier family 22 member 13) - membrane-bound organic anion transporter
  • NPHS2 (Podocin) - involved in the regulation of glomerular permeability
  • UPK2 (Uroplakin 2) - membrane protein preventing cell rupture during bladder distention


SLC22A13 - kidney

NPHS2 - kidney

UPK2 - urinary bladder

Male tissues

  • DMRT1 (Doublesex- and mab-3-related transcription factor 1) - involved in meiosis
  • SEMG1 (Semenogelin I) - a predominant protein in semen
  • KLK3 (Kallikrein related peptidase 3) - also called PSA, is used clinically to diagnose prostate cancer


DMRT1 - testis

SEMG1 - seminal vesicle

KLK3 - prostate

Female tissues

  • CSH1 (Chorionic somatomammotropin hormone 1) - hormone important for growth control during pregnancy
  • OVGP1 (Oviductal glycoprotein 1) - mucus protein important in mucociliary transport of the fertilized ovum
  • PWWP3B (PWWP domain containing 3B) - a protein with a mutated melanoma-associated antigen 1 domain, associated with cancer


CSH1 - placenta

OVGP1 - fallopian tube

PWWP3B - ovary

Muscle tissues

  • TNNI3 (Troponin I3, cardiac type) - mediates muscle relaxation
  • TNNT2 (Troponin T2, cardiac type) - mediates muscle contraction
  • MYH7 (Myosin heavy chain 7) - expressed in slow type I muscle fibers


TNNI3 - heart muscle

TNNT2 - heart muscle

MYH7 - skeletal muscle

Connective & soft tissue

  • FABP4 (Fatty acid binding protein 4) - involved in fatty acid uptake, transport, and metabolism
  • PLIN1 (Perilipin 1) - coats lipid storage droplets in adipocytes


FABP4 - adipose tissue (soft tissue)

PLIN1 - adipose tissue (breast)

Skin

  • KRT1 (Keratin 1) - involved in squamous differentiation and skin barrier function
  • KRT27 (Keratin 27) - plays a role in hair formation
  • CASP14 (Caspase 14) - involved in keratinocyte differentiation and cornification


KRT1 - skin

KRT27 - hair

CASP14 - skin

Bone marrow & lymphoid tissues

  • MPO (Myeloperoxidase) - major component of neutrophil azurophilic granules
  • CD8B (CD8b molecule) - plays a critical role in thymic selection of CD8+ T-cells
  • CD22 (CD22 molecule) - mediates interactions between B-cells


MPO - bone marrow

CD8B - thymus

CD22 - lymph node


Group enriched proteins

The 1628 genes identified as group enriched reflect genes with shared expression in 2-5 tissues. Many of these genes encode proteins that are expressed in cell types that have similar functions across several tissues, such as proteins expressed in immune cells (present in many organs but especially lymphoid tissues and the gastrointestinal tract), proteins involved in squamous cell differentiation (e.g. cervix, esophagus and skin), glandular cell function in the gastrointestinal tract (duodenum, small intestine and colon) or cilia movement (testis and fallopian tube). The schematic network plot below shows the distribution of group enriched genes across different tissues.

Figure 2. An interactive network plot of the tissue enriched and group enriched genes connected to their respective enriched tissues (grey circles). Red nodes represent the number of tissue enriched genes and orange nodes represent the number of genes that are group enriched. The sizes of the red and orange nodes are related to the number of genes displayed within the node. Each node is clickable and results in a list of all enriched genes connected to the highlighted edges. The network is limited to group enriched genes in combinations of up to 3 tissues, but the resulting lists show the complete set of group enriched genes in the particular tissue.


Immune cells can be found in both lymphoid organs and organs infiltrated by immune cells, such as the intestine. Consequently, genes important for immune cell function are often enriched in both lymphoid tissues and the intestine. One such gene is CD19, encoding a co-receptor for the B-cell antigen receptor complex on B-cell lymphocytes essential for their differentiation and proliferation, including antibody production, in response to antigens.


CD19 - tonsil

CD19 - appendix

CD19 - colon

Squamous epithelia are found in many parts of the body as dry skin or wet mucosa, acting as a robust barrier against various chemical and mechanical stresses. DSC3 (Desmocollin 3), encoding a protein important in cell-cell junctions and cellular adhesion, is group enriched in squamous epithelia, such as the esophagus and skin, as exemplified below.


DSC3 - esophagus

DSC3 - skin

Mucus has several functions in the body related to transportation and barrier functions. The function of the mucus in the salivary gland is related to food and pathogens, while the mucus in the cervix is involved in, for example, transportation and blockage of sperm during sexual reproduction. MUC16 is a mucus component and is group enriched in both the mucus-producing salivary gland and cervix.


MUC16 - salivary gland

MUC16 - cervix

The fallopian tube shares many elevated genes with testis. The common denominator is the utilization of cilia, or the structurally similar flagellum, for essential organ functions. DNAI2, a dynein protein, constitutes a motor protein component of motile cilia of multiciliated cells as well as the flagellum (tail) of the sperm. By pulling on the microtubule structure of the cilium or flagellum, the motor protein creates motion and, in the case of the sperm, sperm motility. In the immunohistochemistry images below, expression of DNAI2 can be seen in a subset of cilia in the fallopian tube (left and middle image), as well as in the flagellum of spermatids and cytoplasm of differentiating spermatocytes (right image).


DNAI2 - fallopian tube

DNAI2 - fallopian tube ciliated cells

DNAI2 - testis

