A number of eukaryotic proteins, which are probably sequence-specific
DNA-binding proteins that act as transcription factors, share a conserved
domain of 40 to 50 amino acid residues. It has been proposed that this domain is
formed of two amphipathic helices joined by a variable length linker region
that could form a loop. This 'helix-loop-helix' (HLH) domain mediates protein
dimerization and has been found in the proteins listed below [2,3]. Most
of these proteins have an extra basic region of about 15 amino acid residues
that is adjacent to the HLH domain and specifically binds to DNA. They are
referred to as basic helix-loop-helix (bHLH) proteins, and are classified in two
groups: class A (ubiquitous) and class B (tissue-specific). Members of the
bHLH family bind variations on the core sequence 'CANNTG', also referred to as
the E-box motif. The homo- or heterodimerization mediated by the HLH domain is
independent of, but necessary for, DNA binding, as two basic regions are
required for DNA binding activity. The HLH proteins lacking the basic domain
(Emc, Id) function as negative regulators since they form heterodimers, but
fail to bind DNA. The hairy-related proteins (hairy, E(spl), deadpan) also
repress transcription although they can bind DNA. The proteins of this
subfamily act together with co-repressor proteins, such as groucho, through their
C-terminal motif WRPW.
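
The E-box description above can be illustrated with a minimal sketch. It assumes that 'N' in 'CANNTG' stands for any of the four bases A, C, G or T, and it encodes the statement that two basic regions are required for DNA binding as a simple boolean rule. The function names and the example sequence are hypothetical and chosen only for illustration; real binding depends on much more than the core motif.

```python
import re

# 'CANNTG' with N assumed to be any of A, C, G, T.
# The lookahead wrapper lets overlapping sites be reported as well.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(dna):
    """Return (0-based start, site) for every CANNTG match in dna."""
    seq = dna.upper()
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq)]


def dimer_binds_dna(a_has_basic, b_has_basic):
    """Toy rule from the text: both partners must supply a basic region."""
    return a_has_basic and b_has_basic


if __name__ == "__main__":
    promoter = "ttgaCACGTGccaCAGCTGaa"  # made-up sequence, illustration only
    for start, site in find_e_boxes(promoter):
        print(start, site)
    # A bHLH protein paired with an Id-like partner that lacks a basic region
    print(dimer_binds_dna(True, False))
```

Under this toy rule, a heterodimer containing a partner without a basic region (such as Emc or Id) is reported as unable to bind DNA, which matches the negative regulation described above.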
- The myc family of cellular oncogenes, which is currently known to
  contain four members: c-myc, N-myc, L-myc, and B-myc. The myc genes are
  thought to play a role in cellular differentiation and proliferation.
- Proteins involved in myogenesis (the induction of muscle cells). In mammals
  MyoD1 (Myf-3), myogenin (Myf-4), Myf-5, and Myf-6 (Mrf4 or herculin), in
  birds CMD1 (QMF-1), in Xenopus MyoD and MF25, in Caenorhabditis elegans
  CeMyoD, and in Drosophila nautilus (nau).
- Vertebrate proteins that bind specific DNA sequences ('E boxes') in various
  immunoglobulin chain enhancers: E2A or ITF-1 (E12/pan-2 and E47/pan-1),
  ITF-2 (tcf4), TFE3, and TFEB.
- Vertebrate neurogenic differentiation factor 1, which acts as a
  differentiation factor during neurogenesis.
- Vertebrate MAX protein, a transcription regulator that forms a
  sequence-specific DNA-binding protein complex with myc or mad.
- Vertebrate Max Interacting Protein 1 (MXI1 protein), which acts as a
  transcriptional repressor and may antagonize myc transcriptional activity
  by competing for max.
- Proteins of the bHLH/PAS superfamily, which are transcriptional activators.
  In mammals, AH receptor nuclear translocator (ARNT), single-minded homologs
  (SIM1 and SIM2), hypoxia-inducible factor 1 α (HIF1A), AH receptor
  (AHR), neuronal PAS domain proteins (NPAS1 and NPAS2), endothelial PAS
  domain protein 1 (EPAS1), mouse ARNT2, and human BMAL1. In Drosophila,
  single-minded (SIM), AH receptor nuclear translocator (ARNT), trachealess
  protein (TRH), and similar protein (SIMA).
- Mammalian transcription factors HES, which repress transcription by acting
  on two types of DNA sequences, the E box and the N box.
- Mammalian MAD protein (max dimerizer), which acts as a transcriptional
  repressor and may antagonize myc transcriptional activity by competing for
- Mammalian Upstream Stimulatory Factor 1 and 2 (USF1 and USF2), which bind
  to a symmetrical DNA sequence that is found in a variety of viral and
- Human lyl-1 protein, which is involved, by chromosomal translocation, in T-
- Human transcription factor AP-4.
- Mouse helix-loop-helix proteins MATH-1 and MATH-2, which activate
  E box-dependent transcription in collaboration with E47.
- Mammalian stem cell protein (SCL) (also known as tal1), a protein which may
  play an important role in hemopoietic differentiation. SCL is involved, by
  chromosomal translocation, in stem-cell leukemia.
- Mammalian proteins Id1 to Id4. Id (inhibitor of DNA binding) proteins
  lack a basic DNA-binding domain but are able to form heterodimers with
  other HLH proteins, thereby inhibiting binding to DNA.
- Drosophila extra-macrochaetae (emc) protein, which participates in sensory
  organ patterning by antagonizing the neurogenic activity of the
  achaete-scute complex. Emc is the homolog of mammalian Id proteins.
- Human Sterol Regulatory Element Binding Protein 1 (SREBP-1), a
  transcriptional activator that binds to the sterol regulatory element 1
  (SRE-1) found in the flanking region of the LDLR gene and in other genes.
- Drosophila achaete-scute (AS-C) complex proteins T3 (l'sc), T4 (scute),
  T5 (achaete) and T8 (asense). The AS-C proteins are involved in the
  determination of the neuronal precursors in the peripheral nervous system
  and the central nervous system.
- Mammalian homologs of achaete-scute proteins, the MASH-1 and MASH-2
- Drosophila atonal protein (ato), which is involved in neurogenesis.
- Drosophila daughterless (da) protein, which is essential for neurogenesis
- Drosophila deadpan (dpn), a hairy-like protein involved in the functional
  differentiation of neurons.
- Drosophila delilah (dei) protein, which plays an important role in the
  differentiation of epidermal cells into muscle.
- Drosophila hairy (h) protein, a transcriptional repressor which regulates
  embryonic segmentation and adult bristle patterning.
- Drosophila enhancer of split proteins E(spl), which are hairy-like proteins
  active during neurogenesis. They also act as transcriptional repressors.
- Drosophila twist (twi) protein, which is involved in the establishment of
  germ layers in embryos.
- Maize anthocyanin regulatory proteins R-S and LC.
- Yeast centromere-binding protein 1 (CPF1 or CBF1). This protein is involved
  in chromosomal segregation. It binds to a highly conserved DNA sequence,
  found in centromeres and in several promoters.
- Yeast INO2 and INO4 proteins.
- Yeast phosphate system positive regulatory protein PHO4, which interacts
  with the upstream activating sequence of several acid phosphatase genes.
- Yeast serine-rich protein TYE7 that is required for ty-mediated ADH2
- Neurospora crassa nuc-1, a protein that activates the transcription of
  structural genes for phosphorus acquisition.
- Fission yeast protein esc1, which is involved in the sexual differentiation
The schematic representation of the helix-loop-helix domain is shown here:
PROSITE is copyrighted. It is produced by the SIB Swiss Institute of
Bioinformatics. There are no restrictions on its use by non-profit
institutions as long as its content is in no way modified. Usage by and
for commercial entities requires a license agreement. For information
about the licensing scheme, see prosite_license.html.