A sequence of about thirty to forty amino-acid residues found in the
sequence of epidermal growth factor (EGF) has been shown [1,2,3,4,5,6] to be
present, in a more or less conserved form, in a large number of other, mostly
animal proteins. EGF is a polypeptide of about 50 amino acids with three
internal disulfide bridges. It first binds with high affinity to specific
cell-surface receptors and then induces their dimerization, which is essential
for activating the tyrosine kinase in the receptor cytoplasmic domain,
initiating a signal transduction cascade that results in DNA synthesis and cell
proliferation.
A common feature of all EGF-like domains is that they are found in the
extracellular domain of membrane-bound proteins or in proteins known to be
secreted (exception: prostaglandin G/H synthase). The EGF-like domain includes
six cysteine residues which have been shown to be involved in disulfide bonds.
The structures of several EGF-like domains have been solved. The fold consists
of a two-stranded β-sheet followed by a loop to a short C-terminal
two-stranded sheet (see <PDB:1EGF>). Subdomains between the conserved
cysteines strongly vary in length as shown in the following schematic
representation of the EGF-like domain:
'C': conserved cysteine involved in a disulfide bond.
'G': often conserved glycine
'a': often conserved aromatic amino acid
'*': position of both patterns.
'x': any residue
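The variable spacing between the conserved cysteines can be made concrete with a short sketch. The Python below is illustrative only: the function name `cysteine_spacings` is hypothetical, and the example sequence is assumed to be the 53-residue mature human EGF, which should be checked against a sequence database before any real use.

```python
# Assumption: 53-residue sequence taken to be mature human EGF (verify before use).
EXAMPLE_EGF = "NSDSECPLSHDGYCLHDGVCMYIEALDKYACNCVVGYIGERCQYRDLKWWELR"


def cysteine_spacings(domain):
    """Return the number of residues between each pair of consecutive cysteines."""
    # 0-based positions of every cysteine in the candidate domain.
    positions = [i for i, residue in enumerate(domain.upper()) if residue == "C"]
    # Residues strictly between neighbouring cysteines (the 'x' stretches).
    return [later - earlier - 1 for earlier, later in zip(positions, positions[1:])]


if __name__ == "__main__":
    spacings = cysteine_spacings(EXAMPLE_EGF)
    print("cysteines found:", len(spacings) + 1)
    print("inter-cysteine spacings:", spacings)
```

A domain with six cysteines yields five spacings. Comparing these lists across several EGF-like domains shows how much the stretches between the cysteines differ in length.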
Some proteins known to contain one or more copies of an EGF-like domain are
listed below.
Adipocyte differentiation inhibitor (gene PREF-1) from mouse (6 copies).
Agrin, a basal lamina protein that causes the aggregation of acetylcholine
receptors on cultured muscle fibers (4 copies).
Amphiregulin, a growth factor (1 copy).
Betacellulin, a growth factor (1 copy).
Blastula proteins BP10 and Span from sea urchin which are thought to be
involved in pattern formation (1 copy).
BM86, a glycoprotein antigen of cattle tick (7 copies).
Bone morphogenetic protein 1 (BMP-1), a protein which induces cartilage and
bone formation and which expresses metalloendopeptidase activity (1-2
copies). Homologous proteins are found in sea urchin - suBMP (1 copy) - and
in Drosophila - the dorsal-ventral patterning protein tolloid (2 copies).
Caenorhabditis elegans apx-1 protein, a patterning protein (4.5 copies).
Calcium-dependent serine proteinase (CASP) which degrades the extracellular
matrix proteins type I and IV collagen and fibronectin (1 copy).
Cartilage matrix protein CMP (1 copy).
Cartilage oligomeric matrix protein COMP (4 copies).
Cell surface antigen 114/A10 (3 copies).
Cell surface glycoprotein complex transmembrane subunit ASGP-2 from rat (2
copies).
Coagulation associated proteins C, Z (2 copies) and S (4 copies).
Coagulation factors VII, IX, X and XII (2 copies).
Complement C1r components (1 copy).
Complement C1s components (1 copy).
Complement-activating component of Ra-reactive factor (RARF) (1 copy).
Complement components C6, C7, C8 α and β chains, and C9 (1 copy).
Crumbs, an epithelial development protein from Drosophila (29 copies).
Epidermal growth factor precursor (7-9 copies).
Exogastrula-inducing peptides A, C, D and X from sea urchin (1 copy).
Fat protein, a Drosophila cadherin-related tumor suppressor (5 copies).
Fetal antigen 1, a probable neuroendocrine differentiation protein, which
is derived from the delta-like protein (DLK) (6 copies).
Fibrillin 1 (47 copies) and fibrillin 2 (14 copies).
Fibropellins IA (21 copies), IB (13 copies), IC (8 copies), II (4 copies)
and III (8 copies) from the apical lamina - a component of the
extracellular matrix - of sea urchin.
Fibulin-1 and -2, two extracellular matrix proteins (9-11 copies).
Giant-lens protein (protein Argos), which regulates cell determination and
axon guidance in the Drosophila eye (1 copy).
Growth factor-related proteins from various poxviruses (1 copy).
Gurken protein, a Drosophila developmental protein (1 copy).
Heparin-binding EGF-like growth factor (HB-EGF), transforming growth factor
α (TGF-α), growth factors Lin-3 and Spitz (1 copy); the precursors
are membrane proteins; the mature forms are extracellular.
Limulus clotting factor C, which is involved in hemostasis and host defense
mechanisms in Japanese horseshoe crab (1 copy).
Meprin A α subunit, a mammalian membrane-bound endopeptidase (1 copy).
Milk fat globule-EGF factor 8 (MFG-E8) from mouse (2 copies).
Neuregulin GGF-I and GGF-II, two human glial growth factors (1 copy).
Neurexins from mammals (3 copies).
Neurogenic proteins Notch, Xotch and the human homolog Tan-1 (36 copies),
Delta (9 copies) and the similar differentiation proteins Lag-2 from
Caenorhabditis elegans (2 copies), Serrate (14 copies) and Slit (7 copies).
Nidogen (also called entactin), a basement membrane protein from chordates
Prostaglandin G/H synthase 1 and 2 (EC 1.14.99.1) (1 copy), which are found
in the endoplasmic reticulum.
Reelin, an extracellular matrix protein that plays a role in layering of
neurons in the cerebral cortex and cerebellum of mammals (8 copies).
S1-5, a human extracellular protein whose ultimate activity is probably
modulated by the environment (5 copies).
Schwannoma-derived growth factor (SDGF), an autocrine growth factor as well
as a mitogen for different target cells (1 copy).
Selectins. Cell adhesion proteins such as ELAM-1 (E-selectin), GMP-140
(P-selectin), or the lymph-node homing receptor (L-selectin) (1 copy).
Serine/threonine-protein kinase homolog (gene Pro25) from Arabidopsis
thaliana, which may be involved in assembly or regulation of
light-harvesting chlorophyll A/B protein (2 copies).
Sperm-egg fusion proteins PH-30 α and β from guinea pig (1 copy).
Stromal cell derived protein-1 (SCP-1) from mouse (6 copies).
TDGF-1, human teratocarcinoma-derived growth factor 1 (1 copy).
Tenascin (or neuronectin), an extracellular matrix protein from mammals
(14.5 copies), chicken (TEN-A) (13.5 copies) and the related proteins human
tenascin-X (18 copies) and tenascin-like proteins TEN-A and TEN-M from
Drosophila (8 copies).
Thrombomodulin (fetomodulin), which together with thrombin activates
protein C (6 copies).
Thrombospondin 1, 2 (3 copies), 3 and 4 (4 copies), adhesive glycoproteins
that mediate cell-to-cell and cell-to-matrix interactions.
Thyroid peroxidase 1 and 2 (EC 1.11.1.8) from human (1 copy).
Transforming growth factor β-1 binding protein (TGF-B1-BP) (16 or 18
copies).
Tyrosine-protein kinase receptors Tek and Tie (EC 2.7.10.1) (3 copies).
Vitamin K-dependent anticoagulants protein C (2 copies) and protein S (4
copies) and the similar protein Z, a single-chain plasma glycoprotein of
unknown function (2 copies).
63 kDa sperm flagellar membrane protein from sea urchin (3 copies).
93 kDa protein (gene nel) from chicken (5 copies).
Hypothetical 337.6 kDa protein T20G5.3 from Caenorhabditis elegans (44
copies).
The region between the 5th and 6th cysteine contains two conserved glycines of
which at least one is present in most EGF-like domains. We created two
patterns for this domain, each including one of these C-terminal conserved
glycine residues. The profile we developed covers the whole domain.
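As a hedged illustration of how such patterns are applied, the sketch below translates a pattern written in PROSITE syntax into a Python regular expression and scans a sequence with it. The helper names `prosite_to_regex` and `find_matches` are hypothetical, only a subset of the syntax is handled (no `<` or `>` anchors), and the two pattern strings are illustrative examples that each require one glycine between the 5th and 6th cysteine; they are assumptions, not necessarily the patterns attached to this entry. The example sequence is assumed to be mature human EGF.

```python
import re

# One pattern element: x, a residue, [allowed], or {forbidden}, with optional (n) or (n,m).
_ELEMENT = re.compile(
    r"^(?P<core>x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((?P<lo>\d+)(?:,(?P<hi>\d+))?\))?$"
)

# Illustrative patterns only (assumptions, not taken from this document).
PATTERN_FIRST_GLY = "C-x-C-x(2)-[GP]-[FYW]-x(4,8)-C"
PATTERN_SECOND_GLY = "C-x-C-x(5)-G-x(2)-C"

# Assumption: sequence taken to be mature human EGF (verify before use).
EXAMPLE_EGF = "NSDSECPLSHDGYCLHDGVCMYIEALDKYACNCVVGYIGERCQYRDLKWWELR"


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        match = _ELEMENT.match(element)
        if match is None:
            raise ValueError("unsupported pattern element: %r" % element)
        core = match.group("core")
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # any residue except those listed
        lo, hi = match.group("lo"), match.group("hi")
        if lo is not None:
            core += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(core)
    return "".join(parts)


def find_matches(pattern, sequence):
    """Return (1-based start, matched text) for every match, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    for name, pattern in [("first glycine", PATTERN_FIRST_GLY),
                          ("second glycine", PATTERN_SECOND_GLY)]:
        print(name, prosite_to_regex(pattern), find_matches(pattern, EXAMPLE_EGF))
```

The lookahead wrapper is a design choice that reports overlapping hits, which matters for proteins with many tandem repeats. A profile, by contrast, scores the whole domain and is not reproduced here.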
The β chain of the integrin family of proteins contains 2 cysteine-rich
repeats which were said to be dissimilar to the EGF pattern.
Laminin EGF-like repeats (see <PDOC00961>) are longer than the average
EGF module and contain a further disulfide bond C-terminal of the EGF-like
region. Perlecan and agrin contain both EGF-like domains and laminin-type
EGF-like domains.
The patterns do not detect all of the repeats of proteins with multiple
EGF-like repeats.
See <PDOC00913> for an entry describing specifically the subset of EGF-
like domains that bind calcium.
April 2006 / Pattern revised.
The many faces of epidermal growth factor repeats.
PROSITE is copyrighted. It is produced by the SIB Swiss Institute of
Bioinformatics. There are no restrictions on its use by non-profit
institutions as long as its content is in no way modified. Usage by and
for commercial entities requires a license agreement. For information
about the licensing scheme send an email to
or see: prosite_license.html.