SARS-CoV-2 Relevant PROSITE Motifs

Sigrist CJA, Bridge A, Le Mercier P.
A potential role for integrins in host cell entry by SARS-CoV-2.
Antiviral Res. 2020 Mar 1;177:104759. doi: 10.1016/j.antiviral.2020.104759.
PubMed: 32130973

Profiles detected in SARS-CoV-2 proteins

The ProRules associated with these profiles provide additional information that is used to increase discriminatory power and to annotate UniProtKB/Swiss-Prot proteins.

Macro domain profile ( PS51154 )
The Macro or A1pp domain is a module of ~180 amino acids that can bind ADP-ribose, an NAD metabolite, or related ligands. The Macro domain has been suggested to play a regulatory role in ADP-ribosylation, which is involved in inter- and intracellular signaling, transcriptional regulation, DNA repair pathways and maintenance of genomic stability, telomere dynamics, cell differentiation and proliferation, and necrosis and apoptosis. Viral Macro domains reverse protein ADP-ribosylation.

Peptidase family C16 domain profile ( PS51124 )
Peptidase family C16 (EC 3.4.22.-) contains the coronavirus accessory cysteine proteinases that recognize and process one or two sites in the amino-terminal half of the replicase polyprotein during assembly of the viral replication complex. The SARS-CoV-2 papain-like protease (PL-PRO) belongs to this family.

Coronavirus main protease (M-pro) domain profile ( PS51442 )
The maturation of coronaviruses involves a highly complex cascade of proteolytic processing events on the polyproteins to control viral gene expression and replication. Most maturation cleavage events within the precursor polyprotein are mediated by a viral cysteine proteinase which is called the 'main proteinase' (M-pro) or, alternatively, the '3C-like proteinase' (3CL-pro). The ~300-residue mature form of the M-pro is released from pp1a and pp1ab by autoproteolytic cleavage and employs conserved cysteine and histidine residues in the catalytic site.

RdRp of positive ssRNA viruses catalytic domain profile ( PS50507 )
RNA-directed RNA polymerase (RdRp) is an essential protein encoded in the genomes of all RNA-containing viruses with no DNA stage. It catalyses synthesis of the RNA strand complementary to a given RNA template.

Coronaviridae zinc-binding (CV ZBD) domain profile ( PS51653 )
The coronavirus Nsp13 protein comprises a C-terminal nucleoside triphosphate-binding/helicase (Hel) motif and an N-terminal cysteine-rich zinc-binding domain (ZBD). The ZBD is critically involved in coronavirus replication and transcription, both by modulating the enzymatic activities of the helicase domain and through other, as yet unknown, mechanisms.

(+)RNA virus helicase core domain profile ( PS51657 )
The (+)RNA virus helicase core domain is implicated in diverse aspects of transcription and replication.

Betacoronavirus spike (S) glycoprotein S1 subunit N-terminal and C-terminal domain (NTD and CTD) profiles ( PS51921 and PS51922 )
The S protein, which is located on the envelope surface of the virion, functions to mediate receptor recognition and membrane fusion and is therefore a key factor determining the virus tropism for a specific species. In most cases, coronaviral S will be further cleaved into S1 and S2 subunits, and the receptor binding capacity is allocated to the S1 subunit. The receptor binding domain (RBD) of betaCoV that directly engages the receptor is commonly located in the C-terminal half of S1 [C-terminal domain (CTD)], as in SARS-CoV, SARS-CoV-2, MERS-CoV, and BatCoV HKU4, though in rare cases, such as mouse hepatitis virus (MHV), the RBD region was identified in the S1 N-terminal domain (NTD).

Sarbecovirus 9b domain profile ( PS51920 )
p9b could have a role in membrane interactions during the assembly of the virus.

X4e domain profile ( PS51919 )
The Sarbecovirus accessory protein X4, also called 7a or U122, is likely to be a type I membrane protein, with the amino-terminal hydrophilic domain oriented inside the lumen of the ER/Golgi or on the surface of the cell membrane or virus particle, depending on the localization of the protein. It has been suggested that X4e contains a binding site for the alpha(L) integrin subunit I-domain of LFA-1.

Coronavirus spike (S) glycoprotein S2 subunit heptad repeat 1 (HR1) and 2 (HR2) regions profiles ( PS51923 and PS51924 )
The S protein, which is located on the envelope surface of the virion, functions to mediate receptor recognition and membrane fusion and is therefore a key factor determining the virus tropism for a specific species. This protein is composed of an N-terminal receptor-binding domain (S1) and a C-terminal trans-membrane fusion domain (S2). The S2 subunit contains two 4-3 heptad repeats (HRs) of hydrophobic residues, HR1 and HR2, typical of coiled coils, separated by an ~170-aa-long intervening domain. The HRs of the S2 subunit are expected to rearrange to form a stable 6-helix bundle fusion core.
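
The 4-3 spacing of hydrophobic residues mentioned above can be illustrated with a small Python sketch. It is not the PROSITE profile method: it only measures, for each possible register, what fraction of the 'a' and 'd' heptad positions is hydrophobic. The hydrophobic residue set, the function names and the toy sequence are assumptions made for illustration.

```python
# Assumed hydrophobic alphabet (illustrative choice, not taken from PROSITE).
HYDROPHOBIC = set("AILMFVWY")


def heptad_score(seq, register):
    """Fraction of 'a' and 'd' heptad positions that are hydrophobic.

    `register` (0-6) is the index in `seq` of the first 'a' position;
    'd' positions lie three residues after each 'a' (the 4-3 spacing).
    """
    core = [res for i, res in enumerate(seq) if (i - register) % 7 in (0, 3)]
    if not core:
        return 0.0
    return sum(res in HYDROPHOBIC for res in core) / len(core)


def best_register(seq):
    """Return (register, score) for the register with the highest score."""
    scores = [(r, heptad_score(seq, r)) for r in range(7)]
    return max(scores, key=lambda item: item[1])


# Hypothetical toy sequence: four idealized heptads with Leu at 'a' and 'd'.
toy = "LKELEDK" * 4
print(best_register(toy))
```

A score near 1 in a single register over several consecutive heptads is what one would expect of a coiled-coil-like region; real HR1 and HR2 regions are detected by the profiles listed above, not by this count.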

Coronavirus envelope (CoV E) protein profile ( PS51926 )
Coronavirus envelope (CoV E) proteins are involved in several aspects of the virus life cycle, such as assembly, budding, envelope formation, and pathogenesis. They are ~100-residue-long polypeptides that are minor components in virions but are abundantly expressed inside infected cells. They are localized mainly to the endoplasmic reticulum (ER) and Golgi complex, where they participate in the assembly, budding, and intracellular trafficking of infectious virions.

Coronavirus membrane (CoV M) protein profile ( PS51927 )
The Coronavirus membrane (CoV M) protein, which functions as a homodimer, adapts a region of membrane for virus assembly and captures other structural proteins at the budding site.

Coronavirus nucleocapsid (CoV N) protein N- and C-terminal (NTD and CTD) domains profiles ( PS51928 and PS51929 )
The coronavirus nucleocapsid (CoV N) protein serves multiple purposes, such as packaging the RNA genome into helical ribonucleoproteins, modulating host cell metabolism, and regulating viral RNA synthesis during replication and transcription. CoV N proteins contain two structured domains: the N-terminal domain (NTD; also called RBD), which is responsible for RNA binding, and the C-terminal domain (CTD; also called DD), which mediates oligomerization and RNA binding.

Sarbecovirus Nsp3c-N and Betacoronavirus Nsp3c-M domains profiles ( PS51940 and PS51941 )
Sarbecovirus Nsp3 contains three sequentially arranged Macro domains: the X domain and domains Nsp3c-N (or SUD-N) as well as Nsp3c-M (or SUD-M) within the Nsp3c or SUD (SARS-unique domain) region. Both Nsp3c-N and Nsp3c-M domains bind unusual nucleic-acid structures formed by consecutive guanosine nucleotides, in which four strands of nucleic acid form a superhelix (so-called G-quadruplexes).

Betacoronavirus Nsp3c-C domain profile ( PS51942 )
The Nsp3c-C (or DPUP) domain adopts a frataxin fold or double-wing motif, which is an alpha+beta fold that is associated with protein/protein interactions, DNA binding, and metal ion binding. The Nsp3c-C domain binds to single-stranded RNA and recognizes purine bases more strongly than pyrimidine bases.

Coronavirus Nsp3a and Nsp3d Ubl domains profiles ( PS51943 and PS51944 )
Two ubiquitin-like domains, Ubl1 and Ubl2 (Nsp3a and the N-terminal domain of Nsp3d), exist within Nsp3 of all CoVs. The known functional roles of Nsp3a Ubl in CoVs are related to binding single-stranded RNA (ssRNA) and interacting with the nucleocapsid (N) protein. Nsp3d Ubl is immediately adjacent to the N-terminus of the PLpro (or PL2Pro) domain in CoV polyproteins, and it may play a critical role in protease regulation and stability as well as in viral infection.

Betacoronavirus Nsp3e nucleic acid-binding (NAB) domain profile ( PS51945 )
Nsp3e is unique to Betacoronaviruses and consists of a nucleic acid-binding domain (NAB) and the so-called group 2-specific marker (G2M). The Nsp3e NAB was shown to bind G-rich single-stranded RNA (ssRNA) and to possess double-stranded DNA (dsDNA) unwinding capability.

Coronavirus Nsp4 C-terminal (Nsp4C) domain profile ( PS51946 )
Nsp4 plays an important role in coronavirus replication and double-membrane vesicle (DMV) formation, and the transmembrane regions of Nsp4 are involved in association of the coronavirus replication complex with cellular membranes. The Nsp4C domain could be engaged in protein-protein interactions.

Nidovirus RdRp-associated nucleotidyl transferase (NiRAN) domain profile ( PS51947 )
The NiRAN domain has an essential nucleotidylation activity, and its potential functions in nidovirus replication may include RNA ligation, protein-primed RNA synthesis, and the guanylyltransferase function that is necessary for mRNA capping.

Coronavirus Nsp12 RNA-dependent RNA polymerase (RdRp) and Nsp7 and Nsp8 cofactors domains profiles ( PS51948, PS51949, PS51950 and PS52000 )
The Nsp12 RNA-dependent RNA polymerase, which includes an RdRp catalytic domain conserved in all RNA viruses, possesses some minimal activity on its own, but the addition of the Nsp7 and Nsp8 cofactors greatly stimulates polymerase activity.

Coronavirus (CoV) Nsp9 ssRNA-binding domain profile ( PS51951 )
Nsp9 is able to bind single-stranded (ss)DNA or ssRNA, although binding of ssRNA is expected to be the native function. The CoV Nsp9 ssRNA-binding domain is a seemingly obligate homodimer but CoV Nsp9 ssRNA-binding domains have diverse forms of dimerization that promote their biological function. Using diverse dimerization strategies, the Nsp9 ssRNA-binding domain might increase the nucleic acid binding interface and then promote its nucleic acid binding affinity, which might stabilize nascent viral RNAs during replication or transcription, thus providing protection from nucleases.

Coronavirus (CoV) ExoN/MTase coactivator domain profile ( PS51952 )
Nsp10, a critical cofactor for activation of multiple replicative enzymes, is a small protein of ca. 140 amino acid residues that exists exclusively in viruses and not in prokaryotes or eukaryotes. Nsp10 is known to interact with both Nsp14 and Nsp16, acting as a scaffolding protein and stimulating their respective 3'-5' exoribonuclease (ExoN) and 2'-O-methyltransferase (2'-O-MTase) activities.

Nidovirus 3'-5' exoribonuclease (ExoN) domain profile ( PS51953 )
The nidovirus 3'-5' exoribonuclease (ExoN) domain may enhance the fidelity of RNA synthesis by correcting nucleotide incorporation errors made by the RNA-dependent RNA polymerase.

Coronavirus (CoV) guanine-N7-methyltransferase (N7-MTase) domain profile ( PS51954 )
The S-adenosyl methionine (SAM)-dependent guanine-N7-methyltransferase (N7-MTase) domain methylates the 5' guanine of the Gppp-RNA at the N7 position for mRNA capping.

Nidovirus 2'-O-methyltransferase (2'-O-MTase) domain profile ( PS51955 )
The 3'-terminal domain of the most conserved ORF1b in three of the four families of the order Nidovirales (except for the family Arteriviridae) encodes a 2'-O-methyltransferase (2'-O-MTase), known as nonstructural protein (Nsp) 16 in the family Coronaviridae and implicated in methylation of the 5' cap structure of nidoviral mRNAs.

Nidoviral uridylate-specific endoribonuclease (NendoU) domain ( PS51958 )
Among the Nsps found in Nidovirales, nonstructural protein 15 (Nsp15) from coronaviruses and Nsp11 from arteriviruses contain in their C-terminal region a conserved endoribonuclease domain called nidoviral uridylate-specific endoribonuclease (NendoU) with cleavage specificity for single- and double-stranded RNA 5' of uridine nucleotides to produce a 2'-3'-cyclic phosphate end product.

Coronavirus (CoV) Nsp15 N-terminal oligomerization domain ( PS51960 )
Nsp15 is a nidoviral uridylate-specific endoribonuclease (NendoU) that consists of three distinct domains, a small N-terminal, an intermediate-sized middle, and a large C-terminal NendoU domain. CoV Nsp15 forms double-ring hexamers made of dimers of trimers. The hexameric form is thought to be the fully active form of CoV Nsp15 and the hexamer is stabilized by interactions of the N-terminal oligomerization domain.

Arterivirus Nsp11 N-terminal/coronavirus Nsp15 middle (AV-Nsp11N/CoV-Nsp15M) domain ( PS51961 )
The AV-Nsp11N/CoV-Nsp15M domain may serve as an interaction hub with other proteins and RNA.

Coronavirus (CoV) Nsp1 globular and Betacoronavirus (BetaCoV) Nsp1 C-terminal domains profiles ( PS51962 and PS51963 )
Nsp1 is a characteristic feature of AlphaCoVs and BetaCoVs, which exhibits both functional conservation and mechanistic diversity in inhibiting host gene expression and antiviral responses. Although the sequence homologies among CoV Nsp1 proteins are low, the core structures share a relatively conserved globular domain. In addition to the globular domain, BetaCoV Nsp1s contain a C-terminal domain. The Sarbecovirus Nsp1 C-terminal domain binds to the mRNA channel of the 40S ribosome, where it interferes with mRNA binding and inhibits host protein translation. The 5' UTR of Sarbecovirus mRNA removes this inhibition by binding to the Nsp1 globular domain. This inhibition mechanism may be unique to Sarbecoviruses, because the C-terminal region of Nsp1 is shorter in AlphaCoVs and is not highly conserved amongst other BetaCoVs, including MERS-CoV.

Coronavirus (CoV) 3a-like viroporin transmembrane (TM) and cytosolic (CD) domains profiles ( PS51966 and PS51967 )
3a-like accessory proteins are transmembrane proteins of the viroporin family that form ion channels in the host membrane and have been implicated in inducing apoptosis, pathogenicity, and virus release. The induction of cytokine storms in COVID-19 patients might be linked to ORF3a-mediated activation of the inflammasome. 3a-like viroporins contain a transmembrane domain (TM) and a cytosolic domain (CD).

SARS ORF8 accessory protein immunoglobulin (Ig)-like domain profile ( PS51964 )
The SARS-related (SARSr) ORF8 accessory protein is thought to exert important functions in modulating the metabolism and antiviral immunity of the infected host cell. The SARSr ORF8 contains an immunoglobulin (Ig)-like domain.

Coronavirus (CoV) Nsp2 N-terminal, middle and C-terminal domains profiles ( PS51989, PS51990 and PS51991 )
Nsp2 has been implicated in processes ranging from translation repression to endosomal transport, ribosome biogenesis, and actin filament binding. Nsp2 in SARS-CoV-2 and other coronaviruses has been observed to localize to endosomes and replication-transcription complexes (RTC). Nsp2 may be involved in binding nucleic acids and regulating intracellular signaling pathways. It contains an N-terminal, a middle and a C-terminal domain. The Nsp2 N-terminal domain contains ten alpha-helices and fourteen beta-sheets with three zinc fingers (ZnFs), belonging to the C2H2, C4, and C2HC types, respectively. The three zinc fingers are not involved in binding nucleic acids directly but may serve other, as yet unknown, functions. The interaction of Nsp2 with nucleic acid depends mainly on a large positively charged region on the electrostatic surface of the Nsp2 N-terminal domain.

Coronavirus Nsp3 Y3 domain profile ( PS51992 )
The Y1 domain is conserved in all viruses of the order Nidovirales, while Y2 and Y3 are only conserved in all coronaviruses. The three domains from the Y region (Y1 to Y3) are located at the cytosolic side of the ER. The level of their conservation is close to that of the enzymatic domains of Nsp3 and exceeds that of the nonenzymatic ones. From the consistently high conservation of Y1, Y2, and Y3, it has been hypothesized that Y1 to Y3 may form a single functional unit with a conserved enzymatic function. The CoV Nsp3 Y3 domain possesses a well-ordered globular alpha/beta fold.

Betacoronavirus Nsp3e group 2-specific marker (G2M) domain profile ( PS51994 )
Nsp3e is unique to Betacoronaviruses and consists of a nucleic acid-binding domain (NAB) and the so-called group 2-specific marker (G2M) or Betacoronavirus-specific marker (BetaSM).

Coronavirus 3Ecto domain profile ( PS51993 )
The CoV 3Ecto domain is glycosylated and predicted to be located on the lumenal side of the membrane. It has been shown that interaction of the 3Ecto domain with the large lumenal loop of Nsp4 is essential for the endoplasmic reticulum (ER) rearrangements and double-membrane vesicle (DMV) formation occurring in cells infected by SARS-CoV or mouse hepatitis virus (MHV).

Patterns with a high probability of occurrence detected in SARS-CoV-2 proteins

These patterns have poor specificity and generate many false positives. Their matches are reviewed by expert curators before inclusion in UniProtKB/Swiss-Prot.
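
A rough way to see why short patterns match so often by chance is to estimate the expected number of random matches in a sequence. The Python sketch below assumes uniform and independent amino acid frequencies over a 20-letter alphabet, which is a simplification; the function names and the 1000-residue example length are illustrative assumptions, not PROSITE values.

```python
def chance_match_probability(allowed_per_position, alphabet_size=20):
    """Probability that a random window matches a pattern.

    `allowed_per_position` lists how many residues the pattern accepts at
    each position (1 for a fixed residue, 19 for 'anything but one residue').
    Assumes uniform, independent residue frequencies (a simplification).
    """
    probability = 1.0
    for allowed in allowed_per_position:
        probability *= allowed / alphabet_size
    return probability


def expected_chance_matches(allowed_per_position, seq_length, alphabet_size=20):
    """Expected number of chance matches in a sequence of `seq_length` residues."""
    windows = max(seq_length - len(allowed_per_position) + 1, 0)
    return windows * chance_match_probability(allowed_per_position, alphabet_size)


# A fully specified tripeptide: one allowed residue at each of three positions.
print(expected_chance_matches([1, 1, 1], 1000))

# A three-position pattern accepting 1, 19 and 2 residues respectively.
print(expected_chance_matches([1, 19, 2], 1000))
```

Under these assumptions a fixed tripeptide is expected roughly once per 8000 residues, and a looser three-position pattern several times per 1000 residues, which is why such matches need curator review.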

Cell attachment sequence ( PS00016 )
The spike protein of SARS-CoV-2 acquired an RGD motif known to bind integrins. This motif is absent from other coronaviruses.
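
Locating an RGD tripeptide in a protein sequence needs only a plain substring search. The following Python sketch reports 1-based start positions; the function name and the toy sequence are hypothetical, and the real spike sequence would have to be supplied by the reader.

```python
def find_motif(seq, motif="RGD"):
    """Return the 1-based start position of every occurrence of `motif`."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based
        # Resume one residue later so overlapping occurrences are not missed.
        start = seq.find(motif, start + 1)
    return positions


# Hypothetical toy sequence, not a real viral protein.
toy = "MKTRGDAAGRGDL"
print(find_motif(toy))  # [4, 10]
```

A hit found this way only shows that the tripeptide is present; whether it is exposed and functional is a separate question.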

N-glycosylation site ( PS00001 )
The SARS-CoV-2 viral envelope comprises three proteins, of which spike (S) and membrane (M) are the two major glycoproteins and envelope (E) is the non-glycosylated protein. N-glycosylation sites are specific to the consensus sequence Asn-Xaa-Ser/Thr. This signature performs well to detect potential N-glycosylation sites of extracellular viral proteins.
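
The Asn-Xaa-Ser/Thr consensus can be scanned for with a regular expression. The Python sketch below encodes that consensus and additionally excludes proline at the Xaa position, which is an assumption taken from the usual definition of the sequon rather than from the text above. A lookahead is used so that overlapping sites are all reported. The function name and the toy sequence are illustrative.

```python
import re

# Asn, then any residue except Pro (assumed exclusion), then Ser or Thr.
# The zero-width lookahead lets overlapping sequons be reported separately.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(seq):
    """Return (1-based position of the Asn, matched tripeptide) for each site."""
    return [(match.start() + 1, match.group(1)) for match in SEQUON.finditer(seq)]


# Hypothetical toy sequence: two overlapping sites, one N-P-T that is skipped,
# and one isolated site.
toy = "MNNTSANPTQNGSA"
print(find_sequons(toy))  # [(2, 'NNT'), (3, 'NTS'), (11, 'NGS')]
```

Matches are potential sites only; whether an asparagine is actually glycosylated depends on the protein's topology and cellular context.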