0€0€ €‚pfam00067, p450, Cytochrome P450. Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures.¡€0€ª€0€ €CDD¡€ €­{¢€0€0€ €‚Òpfam00068, Phospholip_A2_1, Phospholipase A2. Phospholipase A2 releases fatty acids from the second carbon group of glycerol. Perhaps the best known members are secreted snake venoms, but also found in secreted pancreatic and membrane-associated forms. Structure is all-alpha, with two core disulfide-linked helices and a calcium-binding loop. This alignment represents the major family of PLA2s. A second minor family, defined by the honeybee venom PLA2 Structure 1POC and related sequences from Gila monsters (Heloderma), is not recognized. This minor family conserves the core helix pair but is substantially different elsewhere. The PROSITE pattern PA2_HIS, specific to the first core helix, recognizes both families.¡€0€ª€0€ €CDD¡€ €­|¢€0€0€ €,pfam00069, Pkinase, Protein kinase domain. ¡€0€ª€0€ €CDD¡€ €­}¢€0€0€ €‚pfam00070, Pyr_redox, Pyridine nucleotide-disulphide oxidoreductase. This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases. This domain is actually a small NADH binding domain within a larger FAD binding domain.¡€0€ª€0€ €CDD¡€ €­~¢€0€0€ €‚’pfam00071, Ras, Ras family. Includes sub-families Ras, Rab, Rac, Ral, Ran, Rap Ypt1 and more. Shares P-loop motif with GTP_EFTU, arf and myosin_head. See pfam00009 pfam00025, pfam00063. As regards Rab GTPases, these are important regulators of vesicle formation, motility and fusion. They share a fold in common with all Ras GTPases: this is a six-stranded beta-sheet surrounded by five alpha-helices.¡€0€ª€0€ €CDD¡€ €­¢€0€0€ €Úpfam00072, Response_reg, Response regulator receiver domain. This domain receives the signal from the sensor partner in bacterial two-component systems. It is usually found N-terminal to a DNA binding effector domain.¡€0€ª€0€ €CDD¡€ €­€¢€0€0€ €‚pfam00073, Rhv, picornavirus capsid protein. CAUTION: This alignment is very weak. It can not be generated by clustalw. If a representative set is used for a seed, many so-called members are not recognized. The family should probably be split up into sub-families. Capsid proteins of picornaviruses. Picornaviruses are non-enveloped plus-strand ssRNA animal viruses with icosahedral capsids. They include rhinovirus (common cold) and poliovirus. Common structure is an 8-stranded beta sandwich. Variations (one or two extra strands) occur.¡€0€ª€0€ €CDD¡€ €?墀0€0€ €½pfam00074, RnaseA, Pancreatic ribonuclease. Ribonucleases. Members include pancreatic RNAase A and angiogenins. Structure is an alpha+beta fold -- long curved beta sheet and three helices.¡€0€ª€0€ €CDD¡€ €­¢€0€0€ €‚pfam00075, RNase_H, RNase H. RNase H digests the RNA strand of an RNA/DNA hybrid. Important enzyme in retroviral replication cycle, and often found as a domain associated with reverse transcriptases. Structure is a mixed alpha+beta fold with three a/b/a layers.¡€0€ª€0€ €CDD¡€ €­‚¢€0€0€ €‚îpfam00076, RRM_1, RNA recognition motif. (a.k.a. RRM, RBD, or RNP domain). The RRM motif is probably diagnostic of an RNA binding protein. RRMs are found in a variety of RNA binding proteins, including various hnRNP proteins, proteins implicated in regulation of alternative splicing, and protein components of snRNPs. The motif also appears in a few single stranded DNA binding proteins. The RRM structure consists of four strands and two helices arranged in an alpha/beta sandwich, with a third helix present during RNA binding in some cases The C-terminal beta strand (4th strand) and final helix are hard to align and have been omitted in the SEED alignment The LA proteins have an N terminal rrm which is included in the seed. There is a second region towards the C terminus that has some features characteristic of a rrm but does not appear to have the important structural core of a rrm. The LA proteins are one of the main autoantigens in Systemic lupus erythematosus (SLE), an autoimmune disease.¡€0€ª€0€ €CDD¡€ €?袀0€0€ €‚›pfam00077, RVP, Retroviral aspartyl protease. Single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases such as pepsins, cathepsins, and renins (pfam00026).¡€0€ª€0€ €CDD¡€ €­ƒ¢€0€0€ €‚hpfam00078, RVT_1, Reverse transcriptase (RNA-dependent DNA polymerase). A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses.¡€0€ª€0€ €CDD¡€ €­„¢€0€0€ €Œpfam00079, Serpin, Serpin (serine protease inhibitor). Structure is a multi-domain fold containing a bundle of helices and a beta sandwich.¡€0€ª€0€ €CDD¡€ €­…¢€0€0€ €‚Ópfam00080, Sod_Cu, Copper/zinc superoxide dismutase (SODC). superoxide dismutases (SODs) catalyze the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Structure is an eight-stranded beta sandwich, similar to the immunoglobulin fold.¡€0€ª€0€ €CDD¡€ €­†¢€0€0€ €‚ñpfam00081, Sod_Fe_N, Iron/manganese superoxide dismutases, alpha-hairpin domain. superoxide dismutases (SODs) catalyze the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. In humans, there is a cytoplasmic Cu/Zn SOD, and a mitochondrial Mn/Fe SOD. N-terminal domain is a long alpha antiparallel hairpin. A small fragment of YTRE_LEPBI matches well - sequencing error?.¡€0€ª€0€ €CDD¡€ €?í¢€0€0€ €‚Ppfam00082, Peptidase_S8, Subtilase family. Subtilases are a family of serine proteases. They appear to have independently and convergently evolved an Asp/Ser/His catalytic triad, like that found in the trypsin serine proteases (see pfam00089). Structure is an alpha/beta fold containing a 7-stranded parallel beta sheet, order 2314567.¡€0€ª€0€ €CDD¡€ €­‡¢€0€0€ €5pfam00083, Sugar_tr, Sugar (and other) transporter. ¡€0€ª€0€ €CDD¡€ €­ˆ¢€0€0€ €.pfam00084, Sushi, Sushi repeat (SCR repeat). ¡€0€ª€0€ €CDD¡€ €­‰¢€0€0€ €ópfam00085, Thioredoxin, Thioredoxin. Thioredoxins are small enzymes that participate in redox reactions, via the reversible oxidation of an active centre disulfide bond. Some members with only the active site are not separated from the noise.¡€0€ª€0€ €CDD¡€ €?ñ¢€0€0€ €‚'pfam00086, Thyroglobulin_1, Thyroglobulin type-1 repeat. Thyroglobulin type 1 repeats are thought to be involved in the control of proteolytic degradation. The domain usually contains six conserved cysteines. These form three disulphide bridges. Cysteines 1 pairs with 2, 3 with 4 and 5 with 6.¡€0€ª€0€ €CDD¡€ €­Š¢€0€0€ €pfam00087, Toxin_1, Snake toxin. A family of venomous neurotoxins and cytotoxins. Structure is small, disulfide-rich, nearly all beta sheet.¡€0€ª€0€ €CDD¡€ €­‹¢€0€0€ €.pfam00088, Trefoil, Trefoil (P-type) domain. ¡€0€ª€0€ €CDD¡€ €­Œ¢€0€0€ €pfam00089, Trypsin, Trypsin. ¡€0€ª€0€ €CDD¡€ €­¢€0€0€ €1pfam00090, TSP_1, Thrombospondin type 1 domain. ¡€0€ª€0€ €CDD¡€ €­Ž¢€0€0€ €‚spfam00091, Tubulin, Tubulin/FtsZ family, GTPase domain. This family includes the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. Members of this family are involved in polymer formation. FtsZ is the polymer-forming protein of bacterial cell division. It is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ and tubulin are GTPases. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria and archaea. Tubulin is the major component of microtubules.¡€0€ª€0€ €CDD¡€ €­¢€0€0€ €6pfam00092, VWA, von Willebrand factor type A domain. ¡€0€ª€0€ €CDD¡€ €­¢€0€0€ €qpfam00093, VWC, von Willebrand factor type C domain. The high cutoff was used to prevent overlap with pfam00094.¡€0€ª€0€ €CDD¡€ €?ø¢€0€0€ €Îpfam00094, VWD, von Willebrand factor type D domain. Luciferin monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods.¡€0€ª€0€ €CDD¡€ €­‘¢€0€0€ €Špfam00095, WAP, WAP-type (Whey Acidic Protein) 'four-disulfide core'. WAP belongs to the group of Elafin or elastase-specific inhibitors.¡€0€ª€0€ €CDD¡€ €­’¢€0€0€ €‚™pfam00096, zf-C2H2, Zinc finger, C2H2 type. The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter.¡€0€ª€0€ €CDD¡€ €­“¢€0€0€ €‚|pfam00097, zf-C3HC4, Zinc finger, C3HC4 type (RING finger). The C3HC4 type zinc-finger (RING finger) is a cysteine-rich domain of 40 to 60 residues that coordinates two zinc ions, and has the consensus sequence: C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-C-X2-C-X(4-48)-C-X2-C where X is any amino acid. Many proteins containing a RING finger play a key role in the ubiquitination pathway.¡€0€ª€0€ €CDD¡€ €­”¢€0€0€ €‚wpfam00098, zf-CCHC, Zinc knuckle. The zinc knuckle is a zinc binding motif composed of the the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger.¡€0€ª€0€ €CDD¡€ €­•¢€0€0€ €8pfam00100, Zona_pellucida, Zona pellucida-like domain. ¡€0€ª€0€ €CDD¡€ €­–¢€0€0€ €Kpfam00101, RuBisCO_small, Ribulose bisphosphate carboxylase, small chain. ¡€0€ª€0€ €CDD¡€ €­—¢€0€0€ €9pfam00102, Y_phosphatase, Protein-tyrosine phosphatase. ¡€0€ª€0€ €CDD¡€ €­˜¢€0€0€ €4pfam00103, Hormone_1, Somatotropin hormone family. ¡€0€ª€0€ €CDD¡€ €­™¢€0€0€ €œpfam00104, Hormone_recep, Ligand-binding domain of nuclear hormone receptor. This all helical domain is involved in binding the hormone in these receptors.¡€0€ª€0€ €CDD¡€ €­š¢€0€0€ €ïpfam00105, zf-C4, Zinc finger, C4 type (two domains). In nearly all cases, this is the DNA binding domain of a nuclear hormone receptor. The alignment contains two Zinc finger domains that are too dissimilar to be aligned with each other.¡€0€ª€0€ €CDD¡€ €­›¢€0€0€ €hpfam00106, adh_short, short chain dehydrogenase. This family contains a wide variety of dehydrogenases.¡€0€ª€0€ €CDD¡€ €­œ¢€0€0€ €4pfam00107, ADH_zinc_N, Zinc-binding dehydrogenase. ¡€0€ª€0€ €CDD¡€ €­¢€0€0€ €§pfam00108, Thiolase_N, Thiolase, N-terminal domain. Thiolase is reported to be structurally related to beta-ketoacyl synthase (pfam00109), and also chalcone synthase.¡€0€ª€0€ €CDD¡€ €­ž¢€0€0€ €‚pfam00109, ketoacyl-synt, Beta-ketoacyl synthase, N-terminal domain. The structure of beta-ketoacyl synthase is similar to that of the thiolase family (pfam00108) and also chalcone synthase. The active site of beta-ketoacyl synthase is located between the N and C-terminal domains. The N-terminal domain contains most of the structures involved in dimer formation and also the active site cysteine.¡€0€ª€0€ €CDD¡€ €­Ÿ¢€0€0€ €‚pfam00110, wnt, wnt family. Wnt genes have been identified in vertebrates and invertebrates but not in plants, unicellular eukaryotes or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families.¡€0€ª€0€ €CDD¡€ €­ ¢€0€0€ €=pfam00111, Fer2, 2Fe-2S iron-sulfur cluster binding domain. ¡€0€ª€0€ €CDD¡€ €­¡¢€0€0€ €;pfam00112, Peptidase_C1, Papain family cysteine protease. ¡€0€ª€0€ €CDD¡€ €­¢¢€0€0€ €>pfam00113, Enolase_C, Enolase, C-terminal TIM barrel domain. ¡€0€ª€0€ €CDD¡€ €@ ¢€0€0€ €³pfam00114, Pilin, Pilin (bacterial filament). Proteins with only the short N-terminal methylation site are not separated from the noise. The Prosite pattern detects those better.¡€0€ª€0€ €CDD¡€ €­£¢€0€0€ €Apfam00115, COX1, Cytochrome C and Quinol oxidase polypeptide I. ¡€0€ª€0€ €CDD¡€ €­¤¢€0€0€ €Gpfam00116, COX2, Cytochrome C oxidase subunit II, periplasmic domain. ¡€0€ª€0€ €CDD¡€ €@¢€0€0€ €8pfam00117, GATase, Glutamine amidotransferase class-I. ¡€0€ª€0€ €CDD¡€ €­¥¢€0€0€ €pfam00118, Cpn60_TCP1, TCP-1/cpn60 chaperonin family. This family includes members from the HSP60 chaperone family and the TCP-1 (T-complex protein) family.¡€0€ª€0€ €CDD¡€ €­¦¢€0€0€ €.pfam00119, ATP-synt_A, ATP synthase A chain. ¡€0€ª€0€ €CDD¡€ €­§¢€0€0€ €@pfam00120, Gln-synt_C, Glutamine synthetase, catalytic domain. ¡€0€ª€0€ €CDD¡€ €­¨¢€0€0€ €,pfam00121, TIM, Triosephosphate isomerase. ¡€0€ª€0€ €CDD¡€ €­©¢€0€0€ €(pfam00122, E1-E2_ATPase, E1-E2 ATPase. ¡€0€ª€0€ €CDD¡€ €­ª¢€0€0€ €]pfam00123, Hormone_2, Peptide hormone. This family contains glucagon, GIP, secretin and VIP.¡€0€ª€0€ €CDD¡€ €­«¢€0€0€ €>pfam00124, Photo_RC, Photosynthetic reaction centre protein. ¡€0€ª€0€ €CDD¡€ €­¬¢€0€0€ €1pfam00125, Histone, Core histone H2A/H2B/H3/H4. ¡€0€ª€0€ €CDD¡€ €­­¢€0€0€ €Opfam00126, HTH_1, Bacterial regulatory helix-turn-helix protein, lysR family. ¡€0€ª€0€ €CDD¡€ €­®¢€0€0€ €Npfam00127, Copper-bind, Copper binding proteins, plastocyanin/azurin family. ¡€0€ª€0€ €CDD¡€ €­¯¢€0€0€ €‚hpfam00128, Alpha-amylase, Alpha amylase, catalytic domain. Alpha amylase is classified as family 13 of the glycosyl hydrolases. The structure is an 8 stranded alpha/beta barrel containing the active site, interrupted by a ~70 a.a. calcium-binding domain protruding between beta strand 3 and alpha helix 3, and a carboxyl-terminal Greek key beta-barrel domain.¡€0€ª€0€ €CDD¡€ €­°¢€0€0€ €Npfam00129, MHC_I, Class I Histocompatibility antigen, domains alpha 1 and 2. ¡€0€ª€0€ €CDD¡€ €@¢€0€0€ €pfam00130, C1_1, Phorbol esters/diacylglycerol binding domain (C1 domain). This domain is also known as the Protein kinase C conserved region 1 (C1) domain.¡€0€ª€0€ €CDD¡€ €@¢€0€0€ €*pfam00131, Metallothio, Metallothionein. ¡€0€ª€0€ €CDD¡€ €­±¢€0€0€ €Fpfam00132, Hexapep, Bacterial transferase hexapeptide (six repeats). ¡€0€ª€0€ €CDD¡€ €­²¢€0€0€ €ˆpfam00133, tRNA-synt_1, tRNA synthetases class I (I, L, M and V). Other tRNA synthetase sub-families are too dissimilar to be included.¡€0€ª€0€ €CDD¡€ €­³¢€0€0€ €‚$pfam00134, Cyclin_N, Cyclin, N-terminal domain. Cyclins regulate cyclin dependent kinases (CDKs). Cyclin-0 (CCNO) is a Uracil-DNA glycosylase that is related to other cyclins. Cyclins contain two domains of similar all-alpha fold, of which this family corresponds with the N-terminal domain.¡€0€ª€0€ €CDD¡€ €­´¢€0€0€ €1pfam00135, COesterase, Carboxylesterase family. ¡€0€ª€0€ €CDD¡€ €­µ¢€0€0€ €Îpfam00136, DNA_pol_B, DNA polymerase family B. This region of DNA polymerase B appears to consist of more than one structural domain, possibly including elongation, DNA-binding and dNTP binding activities.¡€0€ª€0€ €CDD¡€ €­¶¢€0€0€ €0pfam00137, ATP-synt_C, ATP synthase subunit C. ¡€0€ª€0€ €CDD¡€ €­·¢€0€0€ €/pfam00139, Lectin_legB, Legume lectin domain. ¡€0€ª€0€ €CDD¡€ €­¸¢€0€0€ €7pfam00140, Sigma70_r1_2, Sigma-70 factor, region 1.2. ¡€0€ª€0€ €CDD¡€ €­¹¢€0€0€ €$pfam00141, peroxidase, Peroxidase. ¡€0€ª€0€ €CDD¡€ €­º¢€0€0€ €Vpfam00142, Fer4_NifH, 4Fe-4S iron sulfur cluster binding proteins, NifH/frxC family. ¡€0€ª€0€ €CDD¡€ €­»¢€0€0€ €6pfam00143, Interferon, Interferon alpha/beta domain. ¡€0€ª€0€ €CDD¡€ €­¼¢€0€0€ €•pfam00144, Beta-lactamase, Beta-lactamase. This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase.¡€0€ª€0€ €CDD¡€ €­½¢€0€0€ €@pfam00145, DNA_methylase, C-5 cytosine-specific DNA methylase. ¡€0€ª€0€ €CDD¡€ €­¾¢€0€0€ €(pfam00146, NADHdh, NADH dehydrogenase. ¡€0€ª€0€ €CDD¡€ €­¿¢€0€0€ €Xpfam00147, Fibrinogen_C, Fibrinogen beta and gamma chains, C-terminal globular domain. ¡€0€ª€0€ €CDD¡€ €@,¢€0€0€ €Ipfam00148, Oxidored_nitro, Nitrogenase component 1 type Oxidoreductase. ¡€0€ª€0€ €CDD¡€ €­À¢€0€0€ €‚‹pfam00149, Metallophos, Calcineurin-like phosphoesterase. This family includes a diverse range of phosphoesterases, including protein phosphoserine phosphatases, nucleotidases, sphingomyelin phosphodiesterases and 2'-3' cAMP phosphodiesterases as well as nucleases such as bacterial SbcD or yeast MRE11. The most conserved regions in this superfamily centre around the metal chelating residues.¡€0€ª€0€ €CDD¡€ €­Á¢€0€0€ €@pfam00150, Cellulase, Cellulase (glycosyl hydrolase family 5). ¡€0€ª€0€ €CDD¡€ €­Â¢€0€0€ €pfam00151, Lipase, Lipase. ¡€0€ª€0€ €CDD¡€ €@0¢€0€0€ €Apfam00152, tRNA-synt_2, tRNA synthetases class II (D, K and N). ¡€0€ª€0€ €CDD¡€ €­Ã¢€0€0€ €6pfam00153, Mito_carr, Mitochondrial carrier protein. ¡€0€ª€0€ €CDD¡€ €@2¢€0€0€ €‚pfam00154, RecA, recA bacterial DNA recombination protein. RecA is a DNA-dependent ATPase and functions in DNA repair systems. RecA protein catalyzes an ATP-dependent DNA strand-exchange reaction that is the central step in the repair of dsDNA breaks by homologous recombination.¡€0€ª€0€ €CDD¡€ €­Ä¢€0€0€ €¢€0€0€ €ðpfam00166, Cpn10, Chaperonin 10 Kd subunit. This family contains GroES and Gp31-like chaperonins. Gp31 is a functional co-chaperonin that is required for the folding and assembly of Gp23, a major capsid protein, during phage morphogenesis.¡€0€ª€0€ €CDD¡€ €@?¢€0€0€ €‚npfam00167, FGF, Fibroblast growth factor. Fibroblast growth factors are a family of proteins involved in growth and differentiation in a wide range of contexts. They are found in a wide range of organisms, from nematodes to humans. Most share an internal core region of high similarity, conserved residues in which are involved in binding with their receptors. On binding, they cause dimerisation of their tyrosine kinase receptors leading to intracellular signalling. There are currently four known tyrosine kinase receptors for fibroblast growth factors. These receptors can each bind several different members of this family. Members of this family have a beta trefoil structure. Most have N-terminal signal peptides and are secreted. A few lack signal sequences but are secreted anyway; still others also lack the signal peptide but are found on the cell surface and within the extracellular matrix. A third group remain intracellular. They have central roles in development, regulating cell proliferation, migration and differentiation. On the other hand, they are important in tissue repair following injury in adult organisms.¡€0€ª€0€ €CDD¡€ €­Î¢€0€0€ €pfam00168, C2, C2 domain. ¡€0€ª€0€ €CDD¡€ €­Ï¢€0€0€ €=pfam00169, PH, PH domain. PH stands for pleckstrin homology.¡€0€ª€0€ €CDD¡€ €­Ð¢€0€0€ €vpfam00170, bZIP_1, bZIP transcription factor. The Pfam entry includes the basic region and the leucine zipper region.¡€0€ª€0€ €CDD¡€ €@C¢€0€0€ €‚ pfam00171, Aldedh, Aldehyde dehydrogenase family. This family of dehydrogenases act on aldehyde substrates. Members use NADP as a cofactor. The family includes the following members: The prototypical members are the aldehyde dehydrogenases EC:1.2.1.3. Succinate-semialdehyde dehydrogenase EC:1.2.1.16. Lactaldehyde dehydrogenase EC:1.2.1.22. Benzaldehyde dehydrogenase EC:1.2.1.28. Methylmalonate-semialdehyde dehydrogenase EC:1.2.1.27. Glyceraldehyde-3-phosphate dehydrogenase EC:1.2.1.9. Delta-1-pyrroline-5-carboxylate dehydrogenase EC: 1.5.1.12. Acetaldehyde dehydrogenase EC:1.2.1.10. Glutamate-5-semialdehyde dehydrogenase EC:1.2.1.41. This family also includes omega crystallin, an eye lens protein from squid and octopus that has little aldehyde dehydrogenase activity.¡€0€ª€0€ €CDD¡€ €­Ñ¢€0€0€ €Cpfam00172, Zn_clus, Fungal Zn(2)-Cys(6) binuclear cluster domain. ¡€0€ª€0€ €CDD¡€ €`¢€0€0€ €‚Õpfam00173, Cyt-b5, Cytochrome b5-like Heme/Steroid binding domain. This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases.¡€0€ª€0€ €CDD¡€ €­Ò¢€0€0€ €‚pfam00174, Oxidored_molyb, Oxidoreductase molybdopterin binding domain. This domain is found in a variety of oxidoreductases. This domain binds to a molybdopterin cofactor. Xanthine dehydrogenases, that also bind molybdopterin, have essentially no similarity.¡€0€ª€0€ €CDD¡€ €­Ó¢€0€0€ €Žpfam00175, NAD_binding_1, Oxidoreductase NAD-binding domain. Xanthine dehydrogenases, that also bind FAD/NAD, have essentially no similarity.¡€0€ª€0€ €CDD¡€ €­Ô¢€0€0€ €‚Špfam00176, SNF2_N, SNF2 family N-terminal domain. This domain is found in proteins involved in a variety of processes including transcription regulation (e.g., SNF2, STH1, brahma, MOT1), DNA repair (e.g., ERCC6, RAD16, RAD5), DNA recombination (e.g., RAD54), and chromatin unwinding (e.g., ISWI) as well as a variety of other proteins with little functional information (e.g., lodestar, ETL1).¡€0€ª€0€ €CDD¡€ €­Õ¢€0€0€ €‡pfam00177, Ribosomal_S7, Ribosomal protein S7p/S5e. This family contains ribosomal protein S7 from prokaryotes and S5 from eukaryotes.¡€0€ª€0€ €CDD¡€ €­Ö¢€0€0€ €pfam00178, Ets, Ets-domain. ¡€0€ª€0€ €CDD¡€ €­×¢€0€0€ €‚*pfam00179, UQ_con, Ubiquitin-conjugating enzyme. Proteins destined for proteasome-mediated degradation may be ubiquitinated. Ubiquitination follows conjugation of ubiquitin to a conserved cysteine residue of UBC homologs. TSG101 is one of several UBC homologs that lacks this active site cysteine.¡€0€ª€0€ €CDD¡€ €­Ø¢€0€0€ €>pfam00180, Iso_dh, Isocitrate/isopropylmalate dehydrogenase. ¡€0€ª€0€ €CDD¡€ €­Ù¢€0€0€ €Epfam00181, Ribosomal_L2, Ribosomal Proteins L2, RNA binding domain. ¡€0€ª€0€ €CDD¡€ €­Ú¢€0€0€ €/pfam00182, Glyco_hydro_19, Chitinase class I. ¡€0€ª€0€ €CDD¡€ €­Û¢€0€0€ €"pfam00183, HSP90, Hsp90 protein. ¡€0€ª€0€ €CDD¡€ €­Ü¢€0€0€ €fpfam00184, Hormone_5, Neurohypophysial hormones, C-terminal Domain. N-terminal Domain is in hormone5.¡€0€ª€0€ €CDD¡€ €­Ý¢€0€0€ €Vpfam00185, OTCace, Aspartate/ornithine carbamoyltransferase, Asp/Orn binding domain. ¡€0€ª€0€ €CDD¡€ €­Þ¢€0€0€ €-pfam00186, DHFR_1, Dihydrofolate reductase. ¡€0€ª€0€ €CDD¡€ €­ß¢€0€0€ €7pfam00187, Chitin_bind_1, Chitin recognition protein. ¡€0€ª€0€ €CDD¡€ €­à¢€0€0€ €‚ïpfam00188, CAP, Cysteine-rich secretory protein family. This is a large family of cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins (CAP) that are found in a wide range of organisms, including prokaryotes and non-vertebrate eukaryotes, The nine subfamilies of the mammalian CAP 'super'family include: the human glioma pathogenesis-related 1 (GLIPR1), Golgi associated pathogenesis related-1 (GAPR1) proteins, peptidase inhibitor 15 (PI15), peptidase inhibitor 16 (PI16), cysteine-rich secretory proteins (CRISPs), CRISP LCCL domain containing 1 (CRISPLD1), CRISP LCCL domain containing 2 (CRISPLD2), mannose receptor like and the R3H domain containing like proteins. Members are most often secreted and have an extracellular endocrine or paracrine function and are involved in processes including the regulation of extracellular matrix and branching morphogenesis, potentially as either proteases or protease inhibitors; in ion channel regulation in fertility; as tumor suppressor or pro-oncogenic genes in tissues including the prostate; and in cell-cell adhesion during fertilisation. The overall protein structural conservation within the CAP 'super'family results in fundamentally similar functions for the CAP domain in all members, yet the diversity outside of this core region dramatically alters the target specificity and, thus, the biological consequences. The Ca++-chelating function would fit with the various signalling processes (e.g. the CRISP proteins) that members of this family are involved in, and also the sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how the cysteine-rich venom protein helothermine blocks the Ca++ transporting ryanodine receptors.¡€0€ª€0€ €CDD¡€ €­á¢€0€0€ €üpfam00189, Ribosomal_S3_C, Ribosomal protein S3, C-terminal domain. This family contains a central domain pfam00013, hence the amino and carboxyl terminal domains are stored separately. This is a minimal carboxyl-terminal domain. Some are much longer.¡€0€ª€0€ €CDD¡€ €­â¢€0€0€ €‚:pfam00190, Cupin_1, Cupin. This family represents the conserved barrel domain of the 'cupin' superfamily ('cupa' is the Latin term for a small barrel). This family contains 11S and 7S plant seed storage proteins, and germins. Plant seed storage proteins provide the major nitrogen source for the developing plant.¡€0€ª€0€ €CDD¡€ €­ã¢€0€0€ €zpfam00191, Annexin, Annexin. This family of annexins also includes giardin that has been shown to function as an annexin.¡€0€ª€0€ €CDD¡€ €­ä¢€0€0€ €.pfam00193, Xlink, Extracellular link domain. ¡€0€ª€0€ €CDD¡€ €­å¢€0€0€ €@pfam00194, Carb_anhydrase, Eukaryotic-type carbonic anhydrase. ¡€0€ª€0€ €CDD¡€ €­æ¢€0€0€ €‚-pfam00195, Chal_sti_synt_N, Chalcone and stilbene synthases, N-terminal domain. The C-terminal domain of Chalcone synthase is reported to be structurally similar to domains in thiolase and beta-ketoacyl synthase. The differences in activity are accounted for by differences in this N-terminal domain.¡€0€ª€0€ €CDD¡€ €­ç¢€0€0€ €>pfam00196, GerE, Bacterial regulatory proteins, luxR family. ¡€0€ª€0€ €CDD¡€ €­è¢€0€0€ €;pfam00197, Kunitz_legume, Trypsin and protease inhibitor. ¡€0€ª€0€ €CDD¡€ €­é¢€0€0€ €¾pfam00198, 2-oxoacid_dh, 2-oxoacid dehydrogenases acyltransferase (catalytic domain). These proteins contain one to three copies of a lipoyl binding domain followed by the catalytic domain.¡€0€ª€0€ €CDD¡€ €­ê¢€0€0€ € pfam00199, Catalase, Catalase. ¡€0€ª€0€ €CDD¡€ €­ë¢€0€0€ €&pfam00200, Disintegrin, Disintegrin. ¡€0€ª€0€ €CDD¡€ €­ì¢€0€0€ €Bpfam00201, UDPGT, UDP-glucoronosyl and UDP-glucosyl transferase. ¡€0€ª€0€ €CDD¡€ €@`¢€0€0€ €5pfam00202, Aminotran_3, Aminotransferase class-III. ¡€0€ª€0€ €CDD¡€ €­í¢€0€0€ €2pfam00203, Ribosomal_S19, Ribosomal protein S19. ¡€0€ª€0€ €CDD¡€ €­î¢€0€0€ €¼pfam00204, DNA_gyraseB, DNA gyrase B. This family represents the second domain of DNA gyrase B which has a ribosomal S5 domain 2-like fold. This family is structurally related to PF01119.¡€0€ª€0€ €CDD¡€ €­ï¢€0€0€ €Špfam00205, TPP_enzyme_M, Thiamine pyrophosphate enzyme, central domain. The central domain of TPP enzymes contains a 2-fold Rossman fold.¡€0€ª€0€ €CDD¡€ €­ð¢€0€0€ €pfam00206, Lyase_1, Lyase. ¡€0€ª€0€ €CDD¡€ €@e¢€0€0€ €~pfam00207, A2M, Alpha-2-macroglobulin family. This family includes the C-terminal region of the alpha-2-macroglobulin family.¡€0€ª€0€ €CDD¡€ €­ñ¢€0€0€ €Qpfam00208, ELFV_dehydrog, Glutamate/Leucine/Phenylalanine/Valine dehydrogenase. ¡€0€ª€0€ €CDD¡€ €­ò¢€0€0€ €opfam00209, SNF, Sodium:neurotransmitter symporter family. These are twelve xTM-containing region transporters.¡€0€ª€0€ €CDD¡€ €­ó¢€0€0€ €¤pfam00210, Ferritin, Ferritin-like domain. This family contains ferritins and other ferritin-like proteins such as members of the DPS family and bacterioferritins.¡€0€ª€0€ €CDD¡€ €­ô¢€0€0€ €Mpfam00211, Guanylate_cyc, Adenylate and Guanylate cyclase catalytic domain. ¡€0€ª€0€ €CDD¡€ €­õ¢€0€0€ €-pfam00212, ANP, Atrial natriuretic peptide. ¡€0€ª€0€ €CDD¡€ €­ö¢€0€0€ €Çpfam00213, OSCP, ATP synthase delta (OSCP) subunit. The ATP D subunit from E. coli is the same as the OSCP subunit which is this family. The ATP D subunit from metazoa are found in family pfam00401.¡€0€ª€0€ €CDD¡€ €­÷¢€0€0€ €=pfam00214, Calc_CGRP_IAPP, Calcitonin / CGRP / IAPP family. ¡€0€ª€0€ €CDD¡€ €­ø¢€0€0€ €‚[pfam00215, OMPdecase, Orotidine 5'-phosphate decarboxylase / HUMPS family. This family includes Orotidine 5'-phosphate decarboxylase enzymes EC:4.1.1.23 that are involved in the final step of pyrimidine biosynthesis. The family also includes enzymes such as hexulose-6-phosphate synthase. This family appears to be distantly related to pfam00834.¡€0€ª€0€ €CDD¡€ €­ù¢€0€0€ €¢€0€0€ €1pfam00304, Gamma-thionin, Gamma-thionin family. ¡€0€ª€0€ €CDD¡€ €@À¢€0€0€ €(pfam00305, Lipoxygenase, Lipoxygenase. ¡€0€ª€0€ €CDD¡€ €®?¢€0€0€ €Mpfam00306, ATP-synt_ab_C, ATP synthase alpha/beta chain, C terminal domain. ¡€0€ª€0€ €CDD¡€ €®@¢€0€0€ €‚¼pfam00307, CH, Calponin homology (CH) domain. The CH domain is found in both cytoskeletal proteins and signal transduction proteins. The CH domain is involved in actin binding in some members of the family. However in calponins there is evidence that the CH domain is not involved in its actin binding activity. Most member proteins have from two to four copies of the CH domain, however some proteins such as calponin have only a single copy.¡€0€ª€0€ €CDD¡€ €®A¢€0€0€ €.pfam00308, Bac_DnaA, Bacterial dnaA protein. ¡€0€ª€0€ €CDD¡€ €@Ä¢€0€0€ €‚Jpfam00309, Sigma54_AID, Sigma-54 factor, Activator interacting domain (AID). The sigma-54 holoenzyme is an enhancer dependent form of the RNA polymerase. The AID is necessary for activator interaction. In addition, the AID also inhibits transcription initiation in the sigma-54 holoenzyme prior to interaction with the activator.¡€0€ª€0€ €CDD¡€ €®B¢€0€0€ €pfam00356, LacI, Bacterial regulatory proteins, lacI family. ¡€0€ª€0€ €CDD¡€ €®g¢€0€0€ €Œpfam00357, Integrin_alpha, Integrin alpha cytoplasmic region. This family contains the short intracellular region of integrin alpha chains.¡€0€ª€0€ €CDD¡€ €@ó¢€0€0€ €_pfam00358, PTS_EIIA_1, phosphoenolpyruvate-dependent sugar phosphotransferase system, EIIA 1. ¡€0€ª€0€ €CDD¡€ €®h¢€0€0€ €_pfam00359, PTS_EIIA_2, Phosphoenolpyruvate-dependent sugar phosphotransferase system, EIIA 2. ¡€0€ª€0€ €CDD¡€ €®i¢€0€0€ €‚Ppfam00360, PHY, Phytochrome region. Phytochromes are red/far-red photochromic biliprotein photoreceptors which regulate plant development. They are widely represented in both photosynthetic and non-photosynthetic bacteria and are known in a variety of fungi. Although sequence similarities are low, this domain is structurally related to pfam01590, which is generally located immediately N-terminal to this domain. Compared with pfam01590, this domain carries an additional tongue-like hairpin loop between the fifth beta-sheet and the sixth alpha-helix which functions to seal the chromophore pocket and stabilize the photoactivated far-red-absorbing state (Pfr). The tongue carries a conserved PRxSF motif, from which an arginine finger points into the chromophore pocket close to ring D forming a salt bridge with a conserved aspartate residue.¡€0€ª€0€ €CDD¡€ €®j¢€0€0€ €‚ppfam00361, Proton_antipo_M, Proton-conducting membrane transporter. This is a family of membrane transporters that inlcudes some 7 of potentially 14-16 TM regions. In many instances the family forms part of complex I that catalyzes the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane, and in this context is a combination predominantly of subunits 2, 4, 5, 14, L, M and N. In many bacterial species these proteins are probable stand-alone transporters not coupled with oxidoreduction. The family in total represents homologs across the phyla.¡€0€ª€0€ €CDD¡€ €®k¢€0€0€ €üpfam00362, Integrin_beta, Integrin beta chain VWA domain. Integrins have been found in animals and their homologs have also been found in cyanobacteria, probably due to horizontal gene transfer. This domain corresponds to the integrin beta VWA domain.¡€0€ª€0€ €CDD¡€ €®l¢€0€0€ €pfam00363, Casein, Casein. ¡€0€ª€0€ €CDD¡€ €®m¢€0€0€ €‚pfam00364, Biotin_lipoyl, Biotin-requiring enzyme. This family covers two Prosite entries, the conserved lysine residue binds biotin in one group and lipoic acid in the other. Note that the HMM does not currently recognize the Glycine cleavage system H proteins.¡€0€ª€0€ €CDD¡€ €®n¢€0€0€ €&pfam00365, PFK, Phosphofructokinase. ¡€0€ª€0€ €CDD¡€ €®o¢€0€0€ €2pfam00366, Ribosomal_S17, Ribosomal protein S17. ¡€0€ª€0€ €CDD¡€ €®p¢€0€0€ €7pfam00367, PTS_EIIB, phosphotransferase system, EIIB. ¡€0€ª€0€ €CDD¡€ €®q¢€0€0€ €‚§pfam00368, HMG-CoA_red, Hydroxymethylglutaryl-coenzyme A reductase. The HMG-CoA reductases catalyze the conversion of HMG-CoA to mevalonate, which is the rate-limiting step in the synthesis of isoprenoids like cholesterol. Probably because of the critical role of this enzyme in cholesterol homeostasis, mammalian HMG-CoA reductase is heavily regulated at the transcriptional, translational, and post-translational levels.¡€0€ª€0€ €CDD¡€ €®r¢€0€0€ €¯pfam00370, FGGY_N, FGGY family of carbohydrate kinases, N-terminal domain. This domain adopts a ribonuclease H-like fold and is structurally related to the C-terminal domain.¡€0€ª€0€ €CDD¡€ €®s¢€0€0€ €Žpfam00372, Hemocyanin_M, Hemocyanin, copper containing domain. This family includes arthropod hemocyanins and insect larval storage proteins.¡€0€ª€0€ €CDD¡€ €®t¢€0€0€ €ipfam00373, FERM_M, FERM central domain. This domain is the central structural domain of the FERM domain.¡€0€ª€0€ €CDD¡€ €®u¢€0€0€ €8pfam00374, NiFeSe_Hases, Nickel-dependent hydrogenase. ¡€0€ª€0€ €CDD¡€ €®v¢€0€0€ €8pfam00375, SDF, Sodium:dicarboxylate symporter family. ¡€0€ª€0€ €CDD¡€ €®w¢€0€0€ €2pfam00376, MerR, MerR family regulatory protein. ¡€0€ª€0€ €CDD¡€ €®x¢€0€0€ €‚ìpfam00377, Prion, Prion/Doppel alpha-helical domain. The prion protein is thought to be the infectious agent that causes transmissible spongiform encephalopathies, such as scrapie and BSE. It is thought that the prion protein can exist in two different forms: one is the normal cellular protein, and the other is the infectious form which can change the normal prion protein into the infectious form. It has been found that the prion alpha-helical domain is also found in the Doppel protein.¡€0€ª€0€ €CDD¡€ €®y¢€0€0€ €ìpfam00378, ECH_1, Enoyl-CoA hydratase/isomerase. This family contains a diverse set of enzymes including: enoyl-CoA hydratase, napthoate synthase, carnitate racemase, 3-hydroxybutyryl-CoA dehydratase and dodecanoyl-CoA delta-isomerase.¡€0€ª€0€ €CDD¡€ €®z¢€0€0€ €‚Çpfam00379, Chitin_bind_4, Insect cuticle protein. Many insect cuticular proteins include a 35-36 amino acid motif known as the R&R consensus. The extensive conservation of this region led to the suggestion that it functions to bind chitin. Provocatively, it has no sequence similarity to the well-known cysteine-containing chitin-binding domain found in chitinases and some peritrophic membrane proteins. Chitin binding has been shown experimentally for this region. Thus arthropods have two distinct classes of chitin binding proteins, those with the chitin-binding domain found in lectins, chitinases and peritrophic membranes (cysCBD) and those with the cuticular protein chitin-binding domain (non-cysCBD).¡€0€ª€0€ €CDD¡€ €®{¢€0€0€ €pfam00380, Ribosomal_S9, Ribosomal protein S9/S16. This family includes small ribosomal subunit S9 from prokaryotes and S16 from eukaryotes.¡€0€ª€0€ €CDD¡€ €®|¢€0€0€ €=pfam00381, PTS-HPr, PTS HPr component phosphorylation site. ¡€0€ª€0€ €CDD¡€ €®}¢€0€0€ €6pfam00382, TFIIB, Transcription factor TFIIB repeat. ¡€0€ª€0€ €CDD¡€ €A ¢€0€0€ €Ypfam00383, dCMP_cyt_deam_1, Cytidine and deoxycytidylate deaminase zinc-binding region. ¡€0€ª€0€ €CDD¡€ €A ¢€0€0€ €9pfam00384, Molybdopterin, Molybdopterin oxidoreductase. ¡€0€ª€0€ €CDD¡€ €®~¢€0€0€ €Fpfam00385, Chromo, Chromo (CHRromatin Organisation MOdifier) domain. ¡€0€ª€0€ €CDD¡€ €®¢€0€0€ €rpfam00386, C1q, C1q domain. C1q is a subunit of the C1 enzyme complex that activates the serum complement system.¡€0€ª€0€ €CDD¡€ €®€¢€0€0€ €pfam00387, PI-PLC-Y, Phosphatidylinositol-specific phospholipase C, Y domain. This associates with pfam00388 to form a single structural unit.¡€0€ª€0€ €CDD¡€ €®¢€0€0€ €pfam00388, PI-PLC-X, Phosphatidylinositol-specific phospholipase C, X domain. This associates with pfam00387 to form a single structural unit.¡€0€ª€0€ €CDD¡€ €®‚¢€0€0€ €ÿpfam00389, 2-Hacid_dh, D-isomer specific 2-hydroxyacid dehydrogenase, catalytic domain. This family represents the largest portion of the catalytic domain of 2-hydroxyacid dehydrogenases as the NAD binding domain is inserted within the structural domain.¡€0€ª€0€ €CDD¡€ €®ƒ¢€0€0€ €4pfam00390, malic, Malic enzyme, N-terminal domain. ¡€0€ª€0€ €CDD¡€ €®„¢€0€0€ €·pfam00391, PEP-utilizers, PEP-utilising enzyme, mobile domain. This domain is a "swivelling" beta/beta/alpha domain which is thought to be mobile in all proteins known to contain it.¡€0€ª€0€ €CDD¡€ €®…¢€0€0€ €‚’pfam00392, GntR, Bacterial regulatory proteins, gntR family. This family of regulatory proteins consists of the N-terminal HTH region of GntR-like bacterial transcription factors. At the C-terminus there is usually an effector-binding/oligomerization domain. The GntR-like proteins include the following sub-families: MocR, YtrR, FadR, AraR, HutC and PlmA, DevA, DasR. Many of these proteins have been shown experimentally to be autoregulatory, enabling the prediction of operator sites and the discovery of cis/trans relationships. The DasR regulator has been shown to be a global regulator of primary metabolism and development in Streptomyces coelicolor.¡€0€ª€0€ €CDD¡€ €®†¢€0€0€ €àpfam00393, 6PGD, 6-phosphogluconate dehydrogenase, C-terminal domain. This family represents the C-terminal all-alpha domain of 6-phosphogluconate dehydrogenase. The domain contains two structural repeats of 5 helices each.¡€0€ª€0€ €CDD¡€ €®‡¢€0€0€ €’pfam00394, Cu-oxidase, Multicopper oxidase. Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain.¡€0€ª€0€ €CDD¡€ €®ˆ¢€0€0€ €*pfam00395, SLH, S-layer homology domain. ¡€0€ª€0€ €CDD¡€ €®‰¢€0€0€ € pfam00396, Granulin, Granulin. ¡€0€ª€0€ €CDD¡€ €®Š¢€0€0€ €“pfam00397, WW, WW domain. The WW domain is a protein module with two highly conserved tryptophans that binds proline-rich peptide motifs in vitro.¡€0€ª€0€ €CDD¡€ €®‹¢€0€0€ €7pfam00398, RrnaAD, Ribosomal RNA adenine dimethylase. ¡€0€ª€0€ €CDD¡€ €®Œ¢€0€0€ €+pfam00399, PIR, Yeast PIR protein repeat. ¡€0€ª€0€ €CDD¡€ €®¢€0€0€ €,pfam00400, WD40, WD domain, G-beta repeat. ¡€0€ª€0€ €CDD¡€ €®Ž¢€0€0€ €‚mpfam00401, ATP-synt_DE, ATP synthase, Delta/Epsilon chain, long alpha-helix domain. Part of the ATP synthase CF(1). These subunits are part of the head unit of the ATP synthase. This subunit is called epsilon in bacteria and delta in mitochondria. In bacteria the delta (D) subunit is equivalent to the mitochondrial Oligomycin sensitive subunit, OSCP (pfam00213).¡€0€ª€0€ €CDD¡€ €®¢€0€0€ €.pfam00402, Calponin, Calponin family repeat. ¡€0€ª€0€ €CDD¡€ €®¢€0€0€ €0pfam00403, HMA, Heavy-metal-associated domain. ¡€0€ª€0€ €CDD¡€ €®‘¢€0€0€ €‚?pfam00404, Dockerin_1, Dockerin type I repeat. The dockerin repeat is the binding partner of the cohesin domain pfam00963. The cohesin-dockerin interaction is the crucial interaction for complex formation in the cellulosome. The dockerin repeats, each bearing homology to the EF-hand calcium-binding loop bind calcium.¡€0€ª€0€ €CDD¡€ €®’¢€0€0€ €&pfam00405, Transferrin, Transferrin. ¡€0€ª€0€ €CDD¡€ €A!¢€0€0€ €#pfam00406, ADK, Adenylate kinase. ¡€0€ª€0€ €CDD¡€ €®“¢€0€0€ €‚mpfam00407, Bet_v_1, Pathogenesis-related protein Bet v I family. This family is named after Bet v 1, the major birch pollen allergen. This protein belongs to family 10 of plant pathogenesis-related proteins (PR-10), cytoplasmic proteins of 15-17 kd that are wide-spread among dicotyledonous plants. In recent years, a number of diverse plant proteins with low sequence similarity to Bet v 1 was identified. A classification by sequence similarity yielded several subfamilies related to PR-10: - Pathogenesis-related proteins PR-10: These proteins were identified as major tree pollen allergens in birch and related species (hazel, alder), as plant food allergens expressed in high levels in fruits, vegetables and seeds (apple, celery, hazelnut), and as pathogenesis-related proteins whose expression is induced by pathogen infection, wounding, or abiotic stress. Hyp-1, an enzyme involved in the synthesis of the bioactive naphthodianthrone hypericin in St. John's wort (Hypericum perforatum) also belongs to this family. Most of these proteins were found in dicotyledonous plants. In addition, related sequences were identified in monocots and conifers. - Cytokinin-specific binding proteins: These legume proteins bind cytokinin plant hormones. - (S)-Norcoclaurine synthases are enzymes catalyzing the condensation of dopamine and 4-hydroxyphenylacetaldehyde to (S)-norcoclaurine, the first committed step in the biosynthesis of benzylisoquinoline alkaloids such as morphine. -Major latex proteins and ripening-related proteins are proteins of unknown biological function that were first discovered in the latex of opium poppy (Papaver somniferum) and later found to be upregulated during ripening of fruits such as strawberry and cucumber. The occurrence of Bet v 1-related proteins is confined to seed plants with the exception of a cytokinin-binding protein from the moss Physcomitrella patens.¡€0€ª€0€ €CDD¡€ €A#¢€0€0€ €Rpfam00408, PGM_PMM_IV, Phosphoglucomutase/phosphomannomutase, C-terminal domain. ¡€0€ª€0€ €CDD¡€ €®”¢€0€0€ €0pfam00410, Ribosomal_S8, Ribosomal protein S8. ¡€0€ª€0€ €CDD¡€ €®•¢€0€0€ €2pfam00411, Ribosomal_S11, Ribosomal protein S11. ¡€0€ª€0€ €CDD¡€ €®–¢€0€0€ €\pfam00412, LIM, LIM domain. This family represents two copies of the LIM structural domain.¡€0€ª€0€ €CDD¡€ €A'¢€0€0€ €pfam00413, Peptidase_M10, Matrixin. The members of this family are enzymes that cleave peptides. These proteases require zinc for catalysis.¡€0€ª€0€ €CDD¡€ €®—¢€0€0€ €7pfam00414, MAP1B_neuraxin, Neuraxin and MAP1B repeat. ¡€0€ª€0€ €CDD¡€ €A)¢€0€0€ €Fpfam00415, RCC1, Regulator of chromosome condensation (RCC1) repeat. ¡€0€ª€0€ €CDD¡€ €®˜¢€0€0€ €Špfam00416, Ribosomal_S13, Ribosomal protein S13/S18. This family includes ribosomal protein S13 from prokaryotes and S18 from eukaryotes.¡€0€ª€0€ €CDD¡€ €®™¢€0€0€ €‚¿pfam00418, Tubulin-binding, Tau and MAP protein, tubulin-binding repeat. This family includes the vertebrate proteins MAP2, MAP4 and Tau, as well as other animal homologs. MAP4 is present in many tissues but is usually absent from neurons; MAP2 and Tau are mainly neuronal. Members of this family have the ability to bind to and stabilize microtubules. As a result, they are involved in neuronal migration, supporting dendrite elongation, and regulating microtubules during mitotic metaphase. Note that Tau is involved in neurofibrillary tangle formation in Alzheimer's disease and some other dementias. This family features a C-terminal microtubule binding repeat that contains a conserved KXGS motif.¡€0€ª€0€ €CDD¡€ €®š¢€0€0€ €(pfam00419, Fimbrial, Fimbrial protein. ¡€0€ª€0€ €CDD¡€ €®›¢€0€0€ €Ppfam00420, Oxidored_q2, NADH-ubiquinone/plastoquinone oxidoreductase chain 4L. ¡€0€ª€0€ €CDD¡€ €®œ¢€0€0€ €*pfam00421, PSII, Photosystem II protein. ¡€0€ª€0€ €CDD¡€ €®¢€0€0€ €.pfam00423, HN, Haemagglutinin-neuraminidase. ¡€0€ª€0€ €CDD¡€ €A0¢€0€0€ €Hpfam00424, REV, REV protein (anti-repression trans-activator protein). ¡€0€ª€0€ €CDD¡€ €A1¢€0€0€ €òpfam00425, Chorismate_bind, chorismate binding enzyme. This family includes the catalytic regions of the chorismate binding enzymes anthranilate synthase, isochorismate synthase, aminodeoxychorismate synthase and para-aminobenzoate synthase.¡€0€ª€0€ €CDD¡€ €®ž¢€0€0€ €Fpfam00426, VP4_haemagglut, Outer Capsid protein VP4 (Hemagglutinin). ¡€0€ª€0€ €CDD¡€ €®Ÿ¢€0€0€ €?pfam00427, PBS_linker_poly, Phycobilisome Linker polypeptide. ¡€0€ª€0€ €CDD¡€ €® ¢€0€0€ €|pfam00428, Ribosomal_60s, 60s Acidic ribosomal protein. This family includes archaebacterial L12, eukaryotic P0, P1 and P2.¡€0€ª€0€ €CDD¡€ €®¡¢€0€0€ €:pfam00429, TLV_coat, ENV polyprotein (coat polyprotein). ¡€0€ª€0€ €CDD¡€ €®¢¢€0€0€ €‚Ñpfam00430, ATP-synt_B, ATP synthase B/B' CF(0). Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006.¡€0€ª€0€ €CDD¡€ €A6¢€0€0€ €pfam00431, CUB, CUB domain. ¡€0€ª€0€ €CDD¡€ €A7¢€0€0€ €Hpfam00432, Prenyltrans, Prenyltransferase and squalene oxidase repeat. ¡€0€ª€0€ €CDD¡€ €®£¢€0€0€ €9pfam00433, Pkinase_C, Protein kinase C terminal domain. ¡€0€ª€0€ €CDD¡€ €®¤¢€0€0€ €#pfam00434, VP7, Glycoprotein VP7. ¡€0€ª€0€ €CDD¡€ €A:¢€0€0€ €‚õpfam00435, Spectrin, Spectrin repeat. Spectrin repeat-domains are found in several proteins involved in cytoskeletal structure. These include spectrin, alpha-actinin and dystrophin. The sequence repeat used in this family is taken from the structural repeat in reference. The spectrin domain- repeat forms a three helix bundle. The second helix is interrupted by proline in some sequences. The repeats are defined by a characteristic tryptophan (W) residue at position 17 in helix A and a leucine (L) at 2 residues from the carboxyl end of helix C. Although the domain occurs in ultiple repeats along sequences, the domains are actually stable on their own - ie they act, biophysically, like domains rather than repeats that along function when aggregated.¡€0€ª€0€ €CDD¡€ €®¥¢€0€0€ €Òpfam00436, SSB, Single-strand binding protein family. This family includes single stranded binding proteins and also the primosomal replication protein N (PriB). PriB forms a complex with PriA, PriC and ssDNA.¡€0€ª€0€ €CDD¡€ €®¦¢€0€0€ €‚cpfam00437, T2SSE, Type II/IV secretion system protein. This family contains both type II and type IV pathway secretion proteins from bacteria. VirB11 ATPase is a subunit of the Agrobacterium tumefaciens transfer DNA (T-DNA) transfer system, a type IV secretion pathway required for delivery of T-DNA and effector proteins to plant cells during infection.¡€0€ª€0€ €CDD¡€ €®§¢€0€0€ €¤pfam00438, S-AdoMet_synt_N, S-adenosylmethionine synthetase, N-terminal domain. The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold.¡€0€ª€0€ €CDD¡€ €®¨¢€0€0€ €Èpfam00439, Bromodomain, Bromodomain. Bromodomains are 110 amino acid long domains, that are found in many chromatin associated proteins. Bromodomains can interact specifically with acetylated lysine.¡€0€ª€0€ €CDD¡€ €®©¢€0€0€ €@pfam00440, TetR_N, Bacterial regulatory proteins, tetR family. ¡€0€ª€0€ €CDD¡€ €®ª¢€0€0€ €£pfam00441, Acyl-CoA_dh_1, Acyl-CoA dehydrogenase, C-terminal domain. C-terminal domain of Acyl-CoA dehydrogenase is an all-alpha, four helical up-and-down bundle.¡€0€ª€0€ €CDD¡€ €®«¢€0€0€ €8pfam00443, UCH, Ubiquitin carboxyl-terminal hydrolase. ¡€0€ª€0€ €CDD¡€ €®¬¢€0€0€ €2pfam00444, Ribosomal_L36, Ribosomal protein L36. ¡€0€ª€0€ €CDD¡€ €®­¢€0€0€ €5pfam00445, Ribonuclease_T2, Ribonuclease T2 family. ¡€0€ª€0€ €CDD¡€ €®®¢€0€0€ €2pfam00446, GnRH, Gonadotropin-releasing hormone. ¡€0€ª€0€ €CDD¡€ €AE¢€0€0€ €0pfam00447, HSF_DNA-bind, HSF-type DNA-binding. ¡€0€ª€0€ €CDD¡€ €®¯¢€0€0€ €…pfam00448, SRP54, SRP54-type protein, GTPase domain. This family includes relatives of the G-domain of the SRP54 family of proteins.¡€0€ª€0€ €CDD¡€ €®°¢€0€0€ €âpfam00449, Urease_alpha, Urease alpha-subunit, N-terminal domain. The N-terminal domain is a composite domain and plays a major trimer stabilizing role by contacting the catalytic domain of the symmetry related alpha-subunit.¡€0€ª€0€ €CDD¡€ €®±¢€0€0€ €4pfam00450, Peptidase_S10, Serine carboxypeptidase. ¡€0€ª€0€ €CDD¡€ €®²¢€0€0€ €¡pfam00451, Toxin_2, Scorpion short toxin, BmKK2. Members of this family, which are found in various scorpion toxins, confer potassium channel blocking activity.¡€0€ª€0€ €CDD¡€ €AJ¢€0€0€ €?pfam00452, Bcl-2, Apoptosis regulator proteins, Bcl-2 family. ¡€0€ª€0€ €CDD¡€ €®³¢€0€0€ €2pfam00453, Ribosomal_L20, Ribosomal protein L20. ¡€0€ª€0€ €CDD¡€ €®´¢€0€0€ €¡pfam00454, PI3_PI4_kinase, Phosphatidylinositol 3- and 4-kinase. Some members of this family probably do not have lipid kinase activity and are protein kinases.¡€0€ª€0€ €CDD¡€ €®µ¢€0€0€ €‚cpfam00455, DeoRC, DeoR C terminal sensor domain. The sensor domains of the DeoR are catalytically inactive versions of the ISOCOT fold, but retain the substrate binding site. DeorC senses diverse sugar derivatives such as deoxyribose nucleoside (DeoR), tagatose phosphate (LacR), galactosamine (AgaR), myo-inositol (Bacillus IolR) and L-ascorbate (UlaR).¡€0€ª€0€ €CDD¡€ €AN¢€0€0€ €‚fpfam00456, Transketolase_N, Transketolase, thiamine diphosphate binding domain. This family includes transketolase enzymes EC:2.2.1.1. and also partially matches to 2-oxoisovalerate dehydrogenase beta subunit EC:1.2.4.4. Both these enzymes utilize thiamine pyrophosphate as a cofactor, suggesting there may be common aspects in their mechanism of catalysis.¡€0€ª€0€ €CDD¡€ €AO¢€0€0€ €;pfam00457, Glyco_hydro_11, Glycosyl hydrolases family 11. ¡€0€ª€0€ €CDD¡€ €®¶¢€0€0€ €'pfam00458, WHEP-TRS, WHEP-TRS domain. ¡€0€ª€0€ €CDD¡€ €®·¢€0€0€ €9pfam00459, Inositol_P, Inositol monophosphatase family. ¡€0€ª€0€ €CDD¡€ €AR¢€0€0€ €9pfam00460, Flg_bb_rod, Flagella basal body rod protein. ¡€0€ª€0€ €CDD¡€ €AS¢€0€0€ €(pfam00462, Glutaredoxin, Glutaredoxin. ¡€0€ª€0€ €CDD¡€ €®¸¢€0€0€ €*pfam00463, ICL, Isocitrate lyase family. ¡€0€ª€0€ €CDD¡€ €AU¢€0€0€ €3pfam00464, SHMT, Serine hydroxymethyltransferase. ¡€0€ª€0€ €CDD¡€ €AV¢€0€0€ €;pfam00465, Fe-ADH, Iron-containing alcohol dehydrogenase. ¡€0€ª€0€ €CDD¡€ €®¹¢€0€0€ €2pfam00466, Ribosomal_L10, Ribosomal protein L10. ¡€0€ª€0€ €CDD¡€ €®º¢€0€0€ €¯pfam00467, KOW, KOW motif. This family has been extended to coincide with ref. The KOW (Kyprides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.¡€0€ª€0€ €CDD¡€ €AY¢€0€0€ €2pfam00468, Ribosomal_L34, Ribosomal protein L34. ¡€0€ª€0€ €CDD¡€ €®»¢€0€0€ €‚"pfam00469, F-protein, Negative factor, (F-Protein) or Nef. Nef protein accelerates virulent progression of AIDS by its interaction with cellular proteins involved in signal transduction and host cell activation. Nef has been shown to bind specifically to a subset of the Src kinase family.¡€0€ª€0€ €CDD¡€ €®¼¢€0€0€ €2pfam00471, Ribosomal_L33, Ribosomal protein L33. ¡€0€ª€0€ €CDD¡€ €®½¢€0€0€ €‚cpfam00472, RF-1, RF-1 domain. This domain is found in peptide chain release factors such as RF-1 and RF-2, and a number of smaller proteins of unknown function. This domain contains the peptidyl-tRNA hydrolase activity. The domain contains a highly conserved motif GGQ, where the glutamine is thought to coordinate the water that mediates the hydrolysis.¡€0€ª€0€ €CDD¡€ €®¾¢€0€0€ €8pfam00473, CRF, Corticotropin-releasing factor family. ¡€0€ª€0€ €CDD¡€ €®¿¢€0€0€ €1pfam00474, SSF, Sodium:solute symporter family. ¡€0€ª€0€ €CDD¡€ €«×¢€0€0€ €;pfam00475, IGPD, Imidazoleglycerol-phosphate dehydratase. ¡€0€ª€0€ €CDD¡€ €®À¢€0€0€ €0pfam00476, DNA_pol_A, DNA polymerase family A. ¡€0€ª€0€ €CDD¡€ €®Á¢€0€0€ €9pfam00477, LEA_5, Small hydrophilic plant seed protein. ¡€0€ª€0€ €CDD¡€ €®Â¢€0€0€ €‚hpfam00478, IMPDH, IMP dehydrogenase / GMP reductase domain. This family is involved in biosynthesis of guanosine nucleotide. Members of this family contain a TIM barrel structure. In the inosine monophosphate dehydrogenases 2 CBS domains pfam00571 are inserted in the TIM barrel. This family is a member of the common phosphate binding site TIM barrel family.¡€0€ª€0€ €CDD¡€ €®Ã¢€0€0€ €Kpfam00479, G6PD_N, Glucose-6-phosphate dehydrogenase, NAD binding domain. ¡€0€ª€0€ €CDD¡€ €®Ä¢€0€0€ €pfam00480, ROK, ROK family. ¡€0€ª€0€ €CDD¡€ €Ac¢€0€0€ €‚pfam00481, PP2C, Protein phosphatase 2C. Protein phosphatase 2C is a Mn++ or Mg++ dependent protein serine/threonine phosphatase.¡€0€ª€0€ €CDD¡€ €®Å¢€0€0€ €‚pfam00482, T2SSF, Type II secretion system (T2SS), protein F. The original family covered both the regions found by the current model. The splitting of the family has allowed the related FlaJ_arch (archaeal FlaJ family) to be merged with it. Proteins with this domain in form a platform for the machiney of the Type II secretion system, as well as the Type 4 pili and the archaeal flagella. This domain seems to show some similarity to PF00664 but this may just be due to similarities in the TM helices (personal obs: C Yeats).¡€0€ª€0€ €CDD¡€ €®Æ¢€0€0€ €‘pfam00483, NTP_transferase, Nucleotidyl transferase. This family includes a wide range of enzymes which transfer nucleotides onto phosphosugars.¡€0€ª€0€ €CDD¡€ €®Ç¢€0€0€ €pfam00484, Pro_CA, Carbonic anhydrase. This family includes carbonic anhydrases as well as a family of non-functional homologs related to YbcF.¡€0€ª€0€ €CDD¡€ €®È¢€0€0€ €‚pfam00485, PRK, Phosphoribulokinase / Uridine kinase family. This family matches three types of P-loop containing kinases: phosphoribulokinases, uridine kinases and bacterial pantothenate kinases(CoaA). Arabidopsis and other organisms have a dual uridine kinase/uracil phosphoribosyltransferase protein where the N-terminal region consists of a UK domain and the C-terminal region of a UPRT domain.¡€0€ª€0€ €CDD¡€ €Ah¢€0€0€ €Ipfam00486, Trans_reg_C, Transcriptional regulatory protein, C terminal. ¡€0€ª€0€ €CDD¡€ €®É¢€0€0€ €2pfam00487, FA_desaturase, Fatty acid desaturase. ¡€0€ª€0€ €CDD¡€ €®Ê¢€0€0€ €‚Üpfam00488, MutS_V, MutS domain V. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam01624, pfam05188, pfam05192 and pfam05190. The mutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with domain V of Thermus aquaticus MutS, which contains a Walker A motif, and is structurally similar to the ATPase domain of ABC transporters.¡€0€ª€0€ €CDD¡€ €®Ë¢€0€0€ €1pfam00489, IL6, Interleukin-6/G-CSF/MGF family. ¡€0€ª€0€ €CDD¡€ €Al¢€0€0€ €9pfam00490, ALAD, Delta-aminolevulinic acid dehydratase. ¡€0€ª€0€ €CDD¡€ €®Ì¢€0€0€ €'pfam00491, Arginase, Arginase family. ¡€0€ª€0€ €CDD¡€ €®Í¢€0€0€ €"pfam00493, MCM, MCM2/3/5 family. ¡€0€ª€0€ €CDD¡€ €®Î¢€0€0€ €1pfam00494, SQS_PSY, Squalene/phytoene synthase. ¡€0€ª€0€ €CDD¡€ €®Ï¢€0€0€ €Ìpfam00496, SBP_bac_5, Bacterial extracellular solute-binding proteins, family 5 Middle. The borders of this family are based on the PDBSum definitions of the domain edges for Salmonella typhimurium oppA.¡€0€ª€0€ €CDD¡€ €Aq¢€0€0€ €Rpfam00497, SBP_bac_3, Bacterial extracellular solute-binding proteins, family 3. ¡€0€ª€0€ €CDD¡€ €®Ð¢€0€0€ €dpfam00498, FHA, FHA domain. The FHA (Forkhead-associated) domain is a phosphopeptide binding motif.¡€0€ª€0€ €CDD¡€ €®Ñ¢€0€0€ €Opfam00499, Oxidored_q3, NADH-ubiquinone/plastoquinone oxidoreductase chain 6. ¡€0€ª€0€ €CDD¡€ €®Ò¢€0€0€ €0pfam00500, Late_protein_L1, L1 (late) protein. ¡€0€ª€0€ €CDD¡€ €Au¢€0€0€ €-pfam00501, AMP-binding, AMP-binding enzyme. ¡€0€ª€0€ €CDD¡€ €®Ó¢€0€0€ €2pfam00502, Phycobilisome, Phycobilisome protein. ¡€0€ª€0€ €CDD¡€ €®Ô¢€0€0€ €‚Àpfam00503, G-alpha, G-protein alpha subunit. G proteins couple receptors of extracellular signals to intracellular signaling pathways. The G protein alpha subunit binds guanyl nucleotide and is a weak GTPase. A set of residues that are unique to G-alpha as compared to its ancestor the Arf-like family form a ring of residues centered on the nucleotide binding site. A Ggamma is found fused to an inactive Galpha in the Dictyostelium protein gbqA.¡€0€ª€0€ €CDD¡€ €®Õ¢€0€0€ €=pfam00504, Chloroa_b-bind, Chlorophyll A-B binding protein. ¡€0€ª€0€ €CDD¡€ €®Ö¢€0€0€ €4pfam00505, HMG_box, HMG (high mobility group) box. ¡€0€ª€0€ €CDD¡€ €Az¢€0€0€ €3pfam00506, Flu_NP, Influenza virus nucleoprotein. ¡€0€ª€0€ €CDD¡€ €A{¢€0€0€ €Ppfam00507, Oxidored_q4, NADH-ubiquinone/plastoquinone oxidoreductase, chain 3. ¡€0€ª€0€ €CDD¡€ €®×¢€0€0€ €6pfam00508, PPV_E2_N, E2 (early) protein, N terminal. ¡€0€ª€0€ €CDD¡€ €A}¢€0€0€ €‚Xpfam00509, Hemagglutinin, Haemagglutinin. Haemagglutinin from influenza virus causes membrane fusion of the viral membrane with the host membrane. Fusion occurs after the host cell internalises the virus by endocytosis. The drop of pH causes release of a hydrophobic fusion peptide and a large conformational change leading to membrane fusion.¡€0€ª€0€ €CDD¡€ €A~¢€0€0€ €4pfam00510, COX3, Cytochrome c oxidase subunit III. ¡€0€ª€0€ €CDD¡€ €A¢€0€0€ €6pfam00511, PPV_E2_C, E2 (early) protein, C terminal. ¡€0€ª€0€ €CDD¡€ €A€¢€0€0€ €ypfam00512, HisKA, His Kinase A (phospho-acceptor) domain. Dimerisation and phospho-acceptor domain of histidine kinases.¡€0€ª€0€ €CDD¡€ €®Ø¢€0€0€ €.pfam00513, Late_protein_L2, Late Protein L2. ¡€0€ª€0€ €CDD¡€ €A‚¢€0€0€ €‚ pfam00514, Arm, Armadillo/beta-catenin-like repeat. Approx. 40 amino acid repeat. Tandem repeats form super-helix of helices that is proposed to mediate interaction of beta-catenin with its ligands. CAUTION: This family does not contain all known armadillo repeats.¡€0€ª€0€ €CDD¡€ €Aƒ¢€0€0€ €-pfam00515, TPR_1, Tetratricopeptide repeat. ¡€0€ª€0€ €CDD¡€ €®Ù¢€0€0€ €›pfam00516, GP120, Envelope glycoprotein GP120. The entry of HIV requires interaction of viral GP120 with CD4 and a chemokine receptor on the cell surface.¡€0€ª€0€ €CDD¡€ €A…¢€0€0€ €‚pfam00517, GP41, Retroviral envelope protein. This family includes envelope protein from a variety of retroviruses. It includes the GP41 subunit of the envelope protein complex from human and simian immunodeficiency viruses (HIV and SIV) which mediate membrane fusion during viral entry. The family also includes bovine immunodeficiency virus, feline immunodeficiency virus and Equine infectious anaemia (EIAV). The family also includes the Gp36 protein from mouse mammary tumor virus (MMTV) and human endogenous retroviruses (HERVs).¡€0€ª€0€ €CDD¡€ €®Ú¢€0€0€ €$pfam00518, E6, Early Protein (E6). ¡€0€ª€0€ €CDD¡€ €®Û¢€0€0€ €Ãpfam00519, PPV_E1_C, Papillomavirus helicase. This protein is a DNA helicase that is required for initiation of viral DNA replication. This protein forms a complex with the E2 protein pfam00508.¡€0€ª€0€ €CDD¡€ €Aˆ¢€0€0€ €‚‚pfam00520, Ion_trans, Ion transport protein. This family contains sodium, potassium and calcium ion channels. This family is 6 transmembrane helices in which the last two helices flank a loop which determines ion selectivity. In some sub-families (e.g. Na channels) the domain is repeated four times, whereas in others (e.g. K channels) the protein forms as a tetramer in the membrane.¡€0€ª€0€ €CDD¡€ €®Ü¢€0€0€ €Cpfam00521, DNA_topoisoIV, DNA gyrase/topoisomerase IV, subunit A. ¡€0€ª€0€ €CDD¡€ €®Ý¢€0€0€ €"pfam00522, VPR, VPR/VPX protein. ¡€0€ª€0€ €CDD¡€ €A‹¢€0€0€ €0pfam00523, Fusion_gly, Fusion glycoprotein F0. ¡€0€ª€0€ €CDD¡€ €®Þ¢€0€0€ €5pfam00524, PPV_E1_N, E1 Protein, N terminal domain. ¡€0€ª€0€ €CDD¡€ €A¢€0€0€ €>pfam00525, Crystallin, Alpha crystallin A chain, N terminal. ¡€0€ª€0€ €CDD¡€ €®ß¢€0€0€ €;pfam00526, Dicty_CTDC, Dictyostelium (slime mold) repeat. ¡€0€ª€0€ €CDD¡€ €®à¢€0€0€ €+pfam00527, E7, E7 protein, Early protein. ¡€0€ª€0€ €CDD¡€ €A¢€0€0€ €‚Bpfam00528, BPD_transp_1, Binding-protein-dependent transport system inner membrane component. The alignments cover the most conserved region of the proteins, which is thought to be located in a cytoplasmic loop between two transmembrane domains. The members of this family have a variable number of transmembrane helices.¡€0€ª€0€ €CDD¡€ €®á¢€0€0€ €‚upfam00529, HlyD, HlyD membrane-fusion protein of T1SS. HlyD is a component of the prototypical alpha-haemolysin (HlyA) bacterial type I secretion system, along with the other components HlyB and TolC. HlyD and HlyB are inner-membrane proteins and specific components of the transport apparatus of alpha-haemolysin. HlyD is anchored in the cytoplasmic membrane by a single transmembrane domain and has a large periplasmic domain within the carboxy-terminal 100 amino acids, HlyB and HlyD form a stable complex that binds the recombinant protein bearing a C-terminal HlyA signal sequence and ATP in the cytoplasm. HlyD, HlyB and TolC combine to form the three-component ABC transporter complex that forms a trans-membrane channel or pore through which HlyA can be transferred directly to the extracellular medium. Cutinase has been shown to be transported effectively through this pore.¡€0€ª€0€ €CDD¡€ €®â¢€0€0€ €ãpfam00530, SRCR, Scavenger receptor cysteine-rich domain. These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions.¡€0€ª€0€ €CDD¡€ €®ã¢€0€0€ €!pfam00531, Death, Death domain. ¡€0€ª€0€ €CDD¡€ €®ä¢€0€0€ €‚&pfam00532, Peripla_BP_1, Periplasmic binding proteins and sugar binding domain of LacI family. This family includes the periplasmic binding proteins, and the LacI family transcriptional regulators. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The LacI family of proteins consist of transcriptional regulators related to the lac repressor. In this case, generally the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain (pfam00356).¡€0€ª€0€ €CDD¡€ €®å¢€0€0€ €‚]pfam00533, BRCT, BRCA1 C Terminus (BRCT) domain. The BRCT domain is found predominantly in proteins involved in cell cycle checkpoint functions responsive to DNA damage. The BRCT domain of XRCC1 forms a homodimer in the crystal structure. This suggests that pairs of BRCT domains associate as homo- or heterodimers. BRCT domains are often found as tandem-repeat pairs. Structures of the BRCA1 BRCT domains revealed a basis for a widely utilized head-to-tail BRCT-BRCT oligomerization mode. This conserved tandem BRCT architecture facilitates formation of the canonical BRCT phospho-peptide interaction cleft at a groove between the BRCT domains. Disease associated missense and nonsense mutations in the BRCA1 BRCT domains disrupt peptide binding by directly occluding this peptide binding groove, or by disrupting key conserved BRCT core folding determinants.¡€0€ª€0€ €CDD¡€ €®æ¢€0€0€ €‚°pfam00534, Glycos_transf_1, Glycosyl transferases group 1. Mutations in this domain of PIGA lead to disease (Paroxysmal Nocturnal haemoglobinuria). Members of this family transfer activated sugars to a variety of substrates, including glycogen, Fructose-6-phosphate and lipopolysaccharides. Members of this family transfer UDP, ADP, GDP or CMP linked sugars. The eukaryotic glycogen synthases may be distant members of this family.¡€0€ª€0€ €CDD¡€ €A—¢€0€0€ €ÿpfam00535, Glycos_transf_2, Glycosyl transferase family 2. Diverse family, transferring sugar from UDP-glucose, UDP-N-acetyl- galactosamine, GDP-mannose or CDP-abequose, to a range of substrates including cellulose, dolichol phosphate and teichoic acids.¡€0€ª€0€ €CDD¡€ €A˜¢€0€0€ €‚zpfam00536, SAM_1, SAM domain (Sterile alpha motif). It has been suggested that SAM is an evolutionarily conserved protein binding domain that is involved in the regulation of numerous developmental processes in diverse eukaryotes. The SAM domain can potentially function as a protein interaction module through its ability to homo- and heterooligomerise with other SAM domains.¡€0€ª€0€ €CDD¡€ €®ç¢€0€0€ €‚Âpfam00537, Toxin_3, Scorpion toxin-like domain. This family contains both neurotoxins and plant defensins. The mustard trypsin inhibitor, MTI-2, is plant defensin. It is a potent inhibitor of trypsin with no activity towards chymotrypsin. MTI-2 is toxic for Lepidopteran insects, but has low activity against aphids. Brazzein is plant defensin-like protein. It is pH-stable, heat-stable and intensely sweet protein. The scorpion toxin (a neurotoxin) binds to sodium channels and inhibits the activation mechanisms of the channels, thereby blocking neuronal transmission. Scorpion toxins bind to sodium channels and inhibit the activation mechanisms of the channels, thereby blocking neuronal transmission.¡€0€ª€0€ €CDD¡€ €Aš¢€0€0€ €ëpfam00538, Linker_histone, linker histone H1 and H5 family. Linker histone H1 is an essential component of chromatin structure. H1 links nucleosomes into higher order structures Histone H1 is replaced by histone H5 in some cell types.¡€0€ª€0€ €CDD¡€ €®è¢€0€0€ €ëpfam00539, Tat, Transactivating regulatory protein (Tat). The retroviral Tat protein binds to the Tar RNA. This activates transcriptional initiation and elongation from the LTR promoter. Binding is mediated by an arginine rich region.¡€0€ª€0€ €CDD¡€ €®é¢€0€0€ €²pfam00540, Gag_p17, gag gene protein p17 (matrix protein). The matrix protein forms an icosahedral shell associated with the inner membrane of the mature immunodeficiency virus.¡€0€ª€0€ €CDD¡€ €ÐW¢€0€0€ €‚Apfam00541, Adeno_knob, Adenoviral fibre protein (knob domain). Specific attachment of adenovirus is achieved through interactions between host-cell receptors and the adenovirus fibre protein and is mediated by the globular carboxy-terminal domain of the adenovirus fibre protein, termed the carboxy-terminal knob domain.¡€0€ª€0€ €CDD¡€ €®ê¢€0€0€ €Gpfam00542, Ribosomal_L12, Ribosomal protein L7/L12 C-terminal domain. ¡€0€ª€0€ €CDD¡€ €®ë¢€0€0€ €hpfam00543, P-II, Nitrogen regulatory protein P-II. P-II modulates the activity of glutamine synthetase.¡€0€ª€0€ €CDD¡€ €AŸ¢€0€0€ €¶pfam00544, Pec_lyase_C, Pectate lyase. This enzyme forms a right handed beta helix structure. Pectate lyase is an enzyme involved in the maceration and soft rotting of plant tissue.¡€0€ª€0€ €CDD¡€ €®ì¢€0€0€ €\pfam00545, Ribonuclease, ribonuclease. This enzyme hydrolyses RNA and oligoribonucleotides.¡€0€ª€0€ €CDD¡€ €®í¢€0€0€ €—pfam00547, Urease_gamma, Urease, gamma subunit. Urease is a nickel-binding enzyme that catalyzes the hydrolysis of urea to carbon dioxide and ammonia.¡€0€ª€0€ €CDD¡€ €®î¢€0€0€ €®pfam00548, Peptidase_C3, 3C cysteine protease (picornain 3C). Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease.¡€0€ª€0€ €CDD¡€ €A£¢€0€0€ €×pfam00549, Ligase_CoA, CoA-ligase. This family includes the CoA ligases Succinyl-CoA synthetase alpha and beta chains, malate CoA ligase and ATP-citrate lyase. Some members of the family utilize ATP others use GTP.¡€0€ª€0€ €CDD¡€ €®ï¢€0€0€ €‚pfam00550, PP-binding, Phosphopantetheine attachment site. A 4'-phosphopantetheine prosthetic group is attached through a serine. This prosthetic group acts as a a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. This domain forms a four helix bundle. This family includes members not included in Prosite. The inclusion of these members is supported by sequence analysis and functional evidence. The related domain of the anguibactin system regulator AngR has the attachment serine replaced by an alanine.¡€0€ª€0€ €CDD¡€ €®ð¢€0€0€ €‚Þpfam00551, Formyl_trans_N, Formyl transferase. This family includes the following members. Glycinamide ribonucleotide transformylase catalyzes the third step in de novo purine biosynthesis, the transfer of a formyl group to 5'-phosphoribosylglycinamide. Formyltetrahydrofolate deformylase produces formate from formyl- tetrahydrofolate. Methionyl-tRNA formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA. Inclusion of the following members is supported by PSI-blast. HOXX_BRAJA (P31907) contains a related domain of unknown function. PRTH_PORGI (P46071) contains a related domain of unknown function. Y09P_MYCTU (Q50721) contains a related domain of unknown function.¡€0€ª€0€ €CDD¡€ €A¦¢€0€0€ €‚wpfam00552, IN_DBD_C, Integrase DNA binding domain. Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc binding domain. The central domain is the catalytic domain pfam00665. This domain is the carboxyl terminal domain that is a non-specific DNA binding domain.¡€0€ª€0€ €CDD¡€ €®ñ¢€0€0€ €“pfam00553, CBM_2, Cellulose binding domain. Two tryptophan residues are involved in cellulose binding. Cellulose binding domain found in bacteria.¡€0€ª€0€ €CDD¡€ €®ò¢€0€0€ €‚€pfam00554, RHD_DNA_bind, Rel homology DNA-binding domain. Proteins containing the Rel homology domain (RHD) are eukaryotic transcription factors. The RHD is composed of two structural domains. This is the N-terminal DNA-binding domain that is similar to that found in P53. The C-terminal domain has an immunoglobulin-like fold (See pfam16179) that functions as a dimerisation domain.¡€0€ª€0€ €CDD¡€ €A©¢€0€0€ €‚Æpfam00555, Endotoxin_M, delta endotoxin. This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins they are activated by proteolytic cleavage. The N terminus is cleaved in all of the proteins and a C terminal extension is cleaved in some members. Once activated the endotoxin binds to the gut epithelium and causes cell lysis leading to death. This activated region of the delta endotoxin is composed of three structural domains. The N-terminal helical domain is involved in membrane insertion and pore formation. The second and third domains are involved in receptor binding.¡€0€ª€0€ €CDD¡€ €®ó¢€0€0€ €5pfam00556, LHC, Antenna complex alpha/beta subunit. ¡€0€ª€0€ €CDD¡€ €®ô¢€0€0€ €×pfam00557, Peptidase_M24, Metallopeptidase family M24. This family contains metallopeptidases. It also contains non-peptidase homologs such as the N terminal domain of Spt16 which is a histone H3-H4 binding module.¡€0€ª€0€ €CDD¡€ €®õ¢€0€0€ €‚%pfam00558, Vpu, Vpu protein. The Vpu protein contains an N-terminal transmembrane spanning region and a C-terminal cytoplasmic region. The HIV-1 Vpu protein stimulates virus production by enhancing the release of viral particles from infected cells. The VPU protein binds specifically to CD4.¡€0€ª€0€ €CDD¡€ €¬(¢€0€0€ €‚\pfam00559, Vif, Retroviral Vif (Viral infectivity) protein. Human immunodeficiency virus type 1 (HIV-1) Vif is required for productive infection of T lymphocytes and macrophages. Virions produced in the absence of Vif have abnormal core morphology and those produced in primary T cells carry immature core proteins and low levels of mature capsid.¡€0€ª€0€ €CDD¡€ €A­¢€0€0€ €‚àpfam00560, LRR_1, Leucine Rich Repeat. CAUTION: This Pfam may not find all Leucine Rich Repeats in a protein. Leucine Rich Repeats are short sequence motifs present in a number of proteins with diverse functions and cellular locations. These repeats are usually involved in protein-protein interactions. Each Leucine Rich Repeat is composed of a beta-alpha unit. These units form elongated non-globular structures. Leucine Rich Repeats are often flanked by cysteine rich domains.¡€0€ª€0€ €CDD¡€ €®ö¢€0€0€ €upfam00561, Abhydrolase_1, alpha/beta hydrolase fold. This catalytic domain is found in a very wide range of enzymes.¡€0€ª€0€ €CDD¡€ €®÷¢€0€0€ €‚ñpfam00562, RNA_pol_Rpb2_6, RNA polymerase Rpb2, domain 6. RNA polymerases catalyze the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial. and chloroplast polymerases). This domain represents the hybrid binding domain and the wall domain. The hybrid binding domain binds the nascent RNA strand / template DNA strand in the Pol II transcription elongation complex. This domain contains the important structural motifs, switch 3 and the flap loop and binds an active site metal ion. This domain is also involved in binding to Rpb1 and Rpb3. Many of the bacterial members contain large insertions within this domain, as region known as dispensable region 2 (DRII).¡€0€ª€0€ €CDD¡€ €®ø¢€0€0€ €‚gpfam00563, EAL, EAL domain. This domain is found in diverse bacterial signaling proteins. It is called EAL after its conserved residues. The EAL domain is a good candidate for a diguanylate phosphodiesterase function. The domain contains many conserved acidic residues that could participate in metal binding and might form the phosphodiesterase active site.¡€0€ª€0€ €CDD¡€ €®ù¢€0€0€ €pfam00564, PB1, PB1 domain. ¡€0€ª€0€ €CDD¡€ €A²¢€0€0€ €‚™pfam00565, SNase, Staphylococcal nuclease homolog. Present in all three domains of cellular life. Four copies in the transcriptional coactivator p100: these, however, appear to lack the active site residues of Staphylococcal nuclease. Positions 14 (Asp-21), 34 (Arg-35), 39 (Asp-40), 42 (Glu-43) and 110 (Arg-87) [SNase numbering in parentheses] are thought to be involved in substrate-binding and catalysis.¡€0€ª€0€ €CDD¡€ €®ú¢€0€0€ €ûpfam00566, RabGAP-TBC, Rab-GTPase-TBC domain. Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases.¡€0€ª€0€ €CDD¡€ €®û¢€0€0€ €!pfam00567, TUDOR, Tudor domain. ¡€0€ª€0€ €CDD¡€ €®ü¢€0€0€ €‚¯pfam00568, WH1, WH1 domain. WASp Homology domain 1 (WH1) domain. WASP is the protein that is defective in Wiskott-Aldrich syndrome (WAS). The majority of point mutations occur within the amino- terminal WH1 domain. The metabotropic glutamate receptors mGluR1alpha and mGluR5 bind a protein called homer, which is a WH1 domain homolog. A subset of WH1 domains has been termed a "EVH1" domain and appear to bind a polyproline motif.¡€0€ª€0€ €CDD¡€ €3k¢€0€0€ €‚pfam00569, ZZ, Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300. ZZ in dystrophin binds calmodulin. Putative zinc finger; binding not yet shown. Four to six cysteine residues in its sequence are responsible for coordinating zinc ions, to reinforce the structure.¡€0€ª€0€ €CDD¡€ €A¶¢€0€0€ €‚pfam00570, HRDC, HRDC domain. The HRDC (Helicase and RNase D C-terminal) domain has a putative role in nucleic acid binding. Mutations in the HRDC domain cause human disease. It is interesting to note that the RecQ helicase in Deinococcus radiodurans has three tandem HRDC domains.¡€0€ª€0€ €CDD¡€ €®ý¢€0€0€ €‚×pfam00571, CBS, CBS domain. CBS domains are small intracellular modules that pair together to form a stable globular domain. This family represents a single CBS domain. Pairs of these domains have been termed a Bateman domain. CBS domains have been shown to bind ligands with an adenosyl group such as AMP, ATP and S-AdoMet. CBS domains are found attached to a wide range of other protein domains suggesting that CBS domains may play a regulatory role making proteins sensitive to adenosyl carrying ligands. The region containing the CBS domains in Cystathionine-beta synthase is involved in regulation by S-AdoMet. CBS domain pairs from AMPK bind AMP or ATP. The CBS domains from IMPDH and the chloride channel CLC2 bind ATP.¡€0€ª€0€ €CDD¡€ €®þ¢€0€0€ €2pfam00572, Ribosomal_L13, Ribosomal protein L13. ¡€0€ª€0€ €CDD¡€ €®ÿ¢€0€0€ €Äpfam00573, Ribosomal_L4, Ribosomal protein L4/L1 family. This family includes Ribosomal L4/L1 from eukaryotes and archaebacteria and L4 from eubacteria. L4 from yeast has been shown to bind rRNA.¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €‚pfam00574, CLP_protease, Clp protease. The Clp protease has an active site catalytic triad. In E. coli Clp protease, ser-111, his-136 and asp-185 form the catalytic triad. Some members have lost active site residues and are therefore inactive, some contain one or two large insertions.¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €Ûpfam00575, S1, S1 RNA binding domain. The S1 domain occurs in a wide range of RNA associated proteins. It is structurally similar to cold shock protein which binds nucleic acids. The S1 domain has an OB-fold structure.¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €‚†pfam00576, Transthyretin, HIUase/Transthyretin family. This family includes transthyretin that is a thyroid hormone-binding protein that transports thyroxine from the bloodstream to the brain. However, most of the sequences listed in this family do not bind thyroid hormones. They are actually enzymes of the purine catabolism that catalyze the conversion of 5-hydroxyisourate (HIU) to OHCU. HIU hydrolysis is the original function of the family and is conserved from bacteria to mammals; transthyretins arose by gene duplications in the vertebrate lineage. HIUases are distinguished in the alignment from the conserved C-terminal YRGS sequence.¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €‚:pfam00577, Usher, Outer membrane usher protein. In Gram-negative bacteria the biogenesis of fimbriae (or pili) requires a two- component assembly and transport system which is composed of a periplasmic chaperone and an outer membrane protein which has been termed a molecular 'usher'. The usher protein is rather large (from 86 to 100 Kd) and seems to be mainly composed of membrane-spanning beta-sheets, a structure reminiscent of porins. Although the degree of sequence similarity of these proteins is not very high they share a number of characteristics. One of these is the presence of two pairs of cysteines, the first one located in the N-terminal part and the second at the C-terminal extremity that are probably involved in disulphide bonds. The best conserved region is located in the central part of these proteins.¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €špfam00578, AhpC-TSA, AhpC/TSA family. This family contains proteins related to alkyl hydroperoxide reductase (AhpC) and thiol specific antioxidant (TSA).¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €>pfam00579, tRNA-synt_1b, tRNA synthetases class I (W and Y). ¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €‚kpfam00580, UvrD-helicase, UvrD/REP helicase N-terminal domain. The Rep family helicases are composed of four structural domains. The Rep family function as dimers. REP helicases catalyze ATP dependent unwinding of double stranded DNA to single stranded DNA. Some members have large insertions near to the carboxy-terminus relative to other members of the family.¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €‚pfam00581, Rhodanese, Rhodanese-like domain. Rhodanese has an internal duplication. This Pfam represents a single copy of this duplicated domain. The domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases.¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €‚pfam00582, Usp, Universal stress protein family. The universal stress protein UspA is a small cytoplasmic bacterial protein whose expression is enhanced when the cell is exposed to stress agents. UspA enhances the rate of cell survival during prolonged exposure to such conditions, and may provide a general "stress endurance" activity. The crystal structure of Haemophilus influenzae UspA reveals an alpha/beta fold similar to that of the Methanococcus jannaschii MJ0577 protein, which binds ATP, though UspA lacks ATP-binding activity.¡€0€ª€0€ €CDD¡€ €¯ ¢€0€0€ €œpfam00583, Acetyltransf_1, Acetyltransferase (GNAT) family. This family contains proteins with N-acetyltransferase functions such as Elp3-related proteins.¡€0€ª€0€ €CDD¡€ €¯ ¢€0€0€ €‚@pfam00584, SecE, SecE/Sec61-gamma subunits of protein translocation complex. SecE is part of the SecYEG complex in bacteria which translocates proteins from the cytoplasm. In eukaryotes the complex, made from Sec61-gamma and Sec61-alpha translocates protein from the cytoplasm to the ER. Archaea have a similar complex.¡€0€ª€0€ €CDD¡€ €¯ ¢€0€0€ €‚%pfam00585, Thr_dehydrat_C, C-terminal regulatory domain of Threonine dehydratase. Threonine dehydratases pfam00291 all contain a carboxy terminal region. This region may have a regulatory role. Some members contain two copies of this region. This family is homologous to the pfam01842 domain.¡€0€ª€0€ €CDD¡€ €AÆ¢€0€0€ €‚bpfam00586, AIRS, AIR synthase related protein, N-terminal domain. This family includes Hydrogen expression/formation protein HypE, AIR synthases EC:6.3.3.1, FGAM synthase EC:6.3.5.3 and selenide, water dikinase EC:2.7.9.3. The N-terminal domain of AIR synthase forms the dimer interface of the protein, and is suggested as a putative ATP binding domain.¡€0€ª€0€ €CDD¡€ €¯ ¢€0€0€ €pfam00587, tRNA-synt_2b, tRNA synthetase class II core domain (G, H, P, S and T). tRNA-synt_2b is a family of largely threonyl-tRNA members.¡€0€ª€0€ €CDD¡€ €AÈ¢€0€0€ €fpfam00588, SpoU_methylase, SpoU rRNA Methylase family. This family of proteins probably use S-AdoMet.¡€0€ª€0€ €CDD¡€ €¯ ¢€0€0€ €‚bpfam00589, Phage_integrase, Phage integrase family. Members of this family cleave DNA substrates by a series of staggered cuts, during which the protein becomes covalently linked to the DNA through a catalytic tyrosine residue at the carboxy end of the alignment. The catalytic site residues in CRE recombinase are Arg-173, His-289, Arg-292 and Tyr-324.¡€0€ª€0€ €CDD¡€ €AÊ¢€0€0€ €‚pfam00590, TP_methylase, Tetrapyrrole (Corrin/Porphyrin) Methylases. This family uses S-AdoMet in the methylation of diverse substrates. This family includes a related group of bacterial proteins of unknown function. This family includes the methylase Dipthine synthase.¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €åpfam00591, Glycos_transf_3, Glycosyl transferase family, a/b domain. This family includes anthranilate phosphoribosyltransferase (TrpD), thymidine phosphorylase. All these proteins can transfer a phosphorylated ribose substrate.¡€0€ª€0€ €CDD¡€ €AÌ¢€0€0€ €ypfam00593, TonB_dep_Rec, TonB dependent receptor. This model now only covers the conserved part of the barrel structure.¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €‚¢€0€0€ €Épfam00664, ABC_membrane, ABC transporter transmembrane region. This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions.¡€0€ª€0€ €CDD¡€ €¯?¢€0€0€ €‚†pfam00665, rve, Integrase core domain. Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc binding domain pfam02022. This domain is the central catalytic domain. The carboxyl terminal domain that is a non-specific DNA binding domain pfam00552. The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyzes the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site.¡€0€ª€0€ €CDD¡€ €¯@¢€0€0€ €Àpfam00666, Cathelicidins, Cathelicidin. A novel protein family, showing a conserved proregion and a variable carboxyl-terminal antimicrobial domain. This region shows similarity to cystatins.¡€0€ª€0€ €CDD¡€ €¯A¢€0€0€ €³pfam00667, FAD_binding_1, FAD binding domain. This domain is found in sulfite reductase, NADPH cytochrome P450 reductase, Nitric oxide synthase and methionine synthase reductase.¡€0€ª€0€ €CDD¡€ €¯B¢€0€0€ €‚Ápfam00668, Condensation, Condensation domain. This domain is found in many multi-domain enzymes which synthesize peptide antibiotics. This domain catalyzes a condensation reaction to form peptide bonds in non- ribosomal peptide biosynthesis. It is usually found to the carboxy side of a phosphopantetheine binding domain (pfam00550). It has been shown that mutations in the HHXXXDG motif abolish activity suggesting this is part of the active site.¡€0€ª€0€ €CDD¡€ €¯C¢€0€0€ €‚pfam00669, Flagellin_N, Bacterial flagellin N-terminal helical region. Flagellins polymerize to form bacterial flagella. This family includes flagellins and hook associated protein 3. Structurally this family forms an extended helix that interacts with pfam00700.¡€0€ª€0€ €CDD¡€ €B¢€0€0€ €Tpfam00670, AdoHcyase_NAD, S-adenosyl-L-homocysteine hydrolase, NAD binding domain. ¡€0€ª€0€ €CDD¡€ €B¢€0€0€ €pfam00672, HAMP, HAMP domain. ¡€0€ª€0€ €CDD¡€ €¯D¢€0€0€ €lpfam00673, Ribosomal_L5_C, ribosomal L5P family C-terminus. This region is found associated with pfam00281.¡€0€ª€0€ €CDD¡€ €¯E¢€0€0€ €ëpfam00674, DUP, DUP family. This family consists of several yeast proteins of unknown functions. Swiss-prot annotates these as belonging to the DUP family. Several members of this family contain an internal duplication of this region.¡€0€ª€0€ €CDD¡€ €¯F¢€0€0€ €>pfam00675, Peptidase_M16, Insulinase (Peptidase family M16). ¡€0€ª€0€ €CDD¡€ €¯G¢€0€0€ €Ópfam00676, E1_dh, Dehydrogenase E1 component. This family uses thiamine pyrophosphate as a cofactor. This family includes pyruvate dehydrogenase, 2-oxoglutarate dehydrogenase and 2-oxoisovalerate dehydrogenase.¡€0€ª€0€ €CDD¡€ €¯H¢€0€0€ €³pfam00677, Lum_binding, Lumazine binding domain. This domain binds to derivatives of lumazine in some proteins. Some proteins have lost the residues involved in binding lumazine.¡€0€ª€0€ €CDD¡€ €¯I¢€0€0€ €Ýpfam00679, EFG_C, Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of Elongation factor G, elongation factor 2 and some tetracycline resistance proteins and adopt a ferredoxin-like fold.¡€0€ª€0€ €CDD¡€ €¯J¢€0€0€ €2pfam00680, RdRP_1, RNA dependent RNA polymerase. ¡€0€ª€0€ €CDD¡€ €B¢€0€0€ €‡pfam00681, Plectin, Plectin repeat. This family includes repeats from plectin, desmoplakin, envoplakin and bullous pemphigoid antigen.¡€0€ª€0€ €CDD¡€ €¯K¢€0€0€ €–pfam00682, HMGL-like, HMGL-like. This family contains a diverse set of enzymes. These include various aldolases and a region of pyruvate carboxylase.¡€0€ª€0€ €CDD¡€ €B ¢€0€0€ €Àpfam00683, TB, TB domain. This domain is also known as the 8 cysteine domain. This family includes the hybrid domains. This cysteine rich repeat is found in TGF binding protein and fibrillin.¡€0€ª€0€ €CDD¡€ €¯L¢€0€0€ €‚ãpfam00684, DnaJ_CXXCXGXG, DnaJ central domain. The central cysteine-rich (CR) domain of DnaJ proteins contains four repeats of the motif CXXCXGXG where X is any amino acid. The isolated cysteine rich domain folds in zinc dependent fashion. Each set of two repeats binds one unit of zinc. Although this domain has been implicated in substrate binding, no evidence of specific interaction between the isolated DNAJ cysteine rich domain and various hydrophobic peptides has been found.¡€0€ª€0€ €CDD¡€ €¯M¢€0€0€ €6pfam00685, Sulfotransfer_1, Sulfotransferase domain. ¡€0€ª€0€ €CDD¡€ €¯N¢€0€0€ €+pfam00686, CBM_20, Starch binding domain. ¡€0€ª€0€ €CDD¡€ €¯O¢€0€0€ €tpfam00687, Ribosomal_L1, Ribosomal protein L1p/L10e family. This family includes prokaryotic L1 and eukaryotic L10.¡€0€ª€0€ €CDD¡€ €¯P¢€0€0€ €Åpfam00688, TGFb_propeptide, TGF-beta propeptide. This propeptide is known as latency associated peptide (LAP) in TGF-beta. LAP is a homodimer which is disulfide linked to TGF-beta binding protein.¡€0€ª€0€ €CDD¡€ €¯Q¢€0€0€ €Åpfam00689, Cation_ATPase_C, Cation transporting ATPase, C-terminus. Members of this families are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport. This family represents 5 transmembrane helices.¡€0€ª€0€ €CDD¡€ €¯R¢€0€0€ €”pfam00690, Cation_ATPase_N, Cation transporter/ATPase, N-terminus. Members of this families are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport.¡€0€ª€0€ €CDD¡€ €¯S¢€0€0€ €ƒpfam00691, OmpA, OmpA family. The Pfam entry also includes MotB and related proteins which are not included in the Prosite family.¡€0€ª€0€ €CDD¡€ €B)¢€0€0€ €Ppfam00692, dUTPase, dUTPase. dUTPase hydrolyses dUTP to dUMP and pyrophosphate.¡€0€ª€0€ €CDD¡€ €B*¢€0€0€ €:pfam00693, Herpes_TK, Thymidine kinase from herpesvirus. ¡€0€ª€0€ €CDD¡€ €B+¢€0€0€ €²pfam00694, Aconitase_C, Aconitase C-terminal domain. Members of this family usually also match to pfam00330. This domain undergoes conformational change in the enzyme mechanism.¡€0€ª€0€ €CDD¡€ €¯T¢€0€0€ €;pfam00695, vMSA, Major surface antigen from hepadnavirus. ¡€0€ª€0€ €CDD¡€ €B-¢€0€0€ €‚cpfam00696, AA_kinase, Amino acid kinase family. This family includes kinases that phosphorylate a variety of amino acid substrates, as well as uridylate kinase and carbamate kinase. This family includes: Aspartokinase EC:2.7.2.4. Acetylglutamate kinase EC:2.7.2.8. Glutamate 5-kinase EC:2.7.2.11. Uridylate kinase EC:2.7.4.-. Carbamate kinase EC:2.7.2.2.¡€0€ª€0€ €CDD¡€ €¯U¢€0€0€ €Dpfam00697, PRAI, N-(5'phosphoribosyl)anthranilate (PRA) isomerase. ¡€0€ª€0€ €CDD¡€ €¯V¢€0€0€ €4pfam00698, Acyl_transf_1, Acyl transferase domain. ¡€0€ª€0€ €CDD¡€ €B0¢€0€0€ €\pfam00699, Urease_beta, Urease beta subunit. This subunit is known as alpha in Heliobacter.¡€0€ª€0€ €CDD¡€ €¯W¢€0€0€ €‚‚pfam00700, Flagellin_C, Bacterial flagellin C-terminal helical region. Flagellins polymerize to form bacterial flagella. There is some similarity between this family and pfam00669, particularly the motif NRFXSXIXXL. It has been suggested that these two regions associate and this is shown to be correct as structurally this family forms an extended helix that interacts with pfam00700.¡€0€ª€0€ €CDD¡€ €B2¢€0€0€ €apfam00701, DHDPS, Dihydrodipicolinate synthetase family. This family has a TIM barrel structure.¡€0€ª€0€ €CDD¡€ €B3¢€0€0€ €‚Æpfam00702, Hydrolase, haloacid dehalogenase-like hydrolase. This family is structurally different from the alpha/beta hydrolase family (pfam00561). This family includes L-2-haloacid dehalogenase, epoxide hydrolases and phosphatases. The structure of the family consists of two domains. One is an inserted four helix bundle, which is the least well conserved region of the alignment, between residues 16 and 96 of Pseudomonas sp. (S)-2-haloacid dehalogenase 1. The rest of the fold is composed of the core alpha/beta domain. Those members with the characteristic DxD triad at the N-terminus are probably phosphatidylglycerolphosphate (PGP) phosphatases involved in cardiolipin biosynthesis in the mitochondria.¡€0€ª€0€ €CDD¡€ €¯X¢€0€0€ €•pfam00703, Glyco_hydro_2, Glycosyl hydrolases family 2. This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities.¡€0€ª€0€ €CDD¡€ €¯Y¢€0€0€ €;pfam00704, Glyco_hydro_18, Glycosyl hydrolases family 18. ¡€0€ª€0€ €CDD¡€ €¯Z¢€0€0€ €êpfam00705, PCNA_N, Proliferating cell nuclear antigen, N-terminal domain. N-terminal and C-terminal domains of PCNA are topologically identical. Three PCNA molecules are tightly associated to form a closed ring encircling duplex DNA.¡€0€ª€0€ €CDD¡€ €B7¢€0€0€ €)pfam00706, Toxin_4, Anenome neurotoxin. ¡€0€ª€0€ €CDD¡€ €¬¶¢€0€0€ €Jpfam00707, IF3_C, Translation initiation factor IF-3, C-terminal domain. ¡€0€ª€0€ €CDD¡€ €¯[¢€0€0€ €.pfam00708, Acylphosphatase, Acylphosphatase. ¡€0€ª€0€ €CDD¡€ €¯\¢€0€0€ €:pfam00709, Adenylsucc_synt, Adenylosuccinate synthetase. ¡€0€ª€0€ €CDD¡€ €¯]¢€0€0€ €apfam00710, Asparaginase, Asparaginase, N-terminal. This is the N-terminal domain of this enzyme.¡€0€ª€0€ €CDD¡€ €¯^¢€0€0€ €¦pfam00711, Defensin_beta, Beta defensin. The beta defensins are antimicrobial peptides implicated in the resistance of epithelial surfaces to microbial colonisation.¡€0€ª€0€ €CDD¡€ €¯_¢€0€0€ €øpfam00712, DNA_pol3_beta, DNA polymerase III beta subunit, N-terminal domain. A dimer of the beta subunit of DNA polymerase beta forms a ring which encircles duplex DNA. Each monomer contains three domains of identical topology and DNA clamp fold.¡€0€ª€0€ €CDD¡€ €B=¢€0€0€ €pfam00713, Hirudin, Hirudin. ¡€0€ª€0€ €CDD¡€ €¯`¢€0€0€ €)pfam00714, IFN-gamma, Interferon gamma. ¡€0€ª€0€ €CDD¡€ €¯a¢€0€0€ € pfam00715, IL2, Interleukin 2. ¡€0€ª€0€ €CDD¡€ €¯b¢€0€0€ €=pfam00716, Peptidase_S21, Assemblin (Peptidase family S21). ¡€0€ª€0€ €CDD¡€ €BA¢€0€0€ €/pfam00717, Peptidase_S24, Peptidase S24-like. ¡€0€ª€0€ €CDD¡€ €¯c¢€0€0€ €5pfam00718, Polyoma_coat, Polyomavirus coat protein. ¡€0€ª€0€ €CDD¡€ €¯d¢€0€0€ €8pfam00719, Pyrophosphatase, Inorganic pyrophosphatase. ¡€0€ª€0€ €CDD¡€ €¯e¢€0€0€ €,pfam00720, SSI, Subtilisin inhibitor-like. ¡€0€ª€0€ €CDD¡€ €¯f¢€0€0€ €¥pfam00721, TMV_coat, Virus coat protein (TMV like). This family contains coat proteins from tobamoviruses, hordeiviruses, Tobraviruses, Furoviruses and Potyviruses.¡€0€ª€0€ €CDD¡€ €¯g¢€0€0€ €;pfam00722, Glyco_hydro_16, Glycosyl hydrolases family 16. ¡€0€ª€0€ €CDD¡€ €¯h¢€0€0€ €‹pfam00723, Glyco_hydro_15, Glycosyl hydrolases family 15. In higher organisms this family is represented by phosphorylase kinase subunits.¡€0€ª€0€ €CDD¡€ €¯i¢€0€0€ €Lpfam00724, Oxidored_FMN, NADH:flavin oxidoreductase / NADH oxidase family. ¡€0€ª€0€ €CDD¡€ €BI¢€0€0€ €¤pfam00725, 3HCDH, 3-hydroxyacyl-CoA dehydrogenase, C-terminal domain. This family also includes lambda crystallin. Some proteins include two copies of this domain.¡€0€ª€0€ €CDD¡€ €¯j¢€0€0€ €"pfam00726, IL10, Interleukin 10. ¡€0€ª€0€ €CDD¡€ €BK¢€0€0€ € pfam00727, IL4, Interleukin 4. ¡€0€ª€0€ €CDD¡€ €¯k¢€0€0€ €npfam00728, Glyco_hydro_20, Glycosyl hydrolase family 20, catalytic domain. This domain has a TIM barrel fold.¡€0€ª€0€ €CDD¡€ €¯l¢€0€0€ €7pfam00729, Viral_coat, Viral coat protein (S domain). ¡€0€ª€0€ €CDD¡€ €BN¢€0€0€ €‚·pfam00730, HhH-GPD, HhH-GPD superfamily base excision DNA repair protein. This family contains a diverse range of structurally related DNA repair proteins. The superfamily is called the HhH-GPD family after its hallmark Helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate. This includes endonuclease III, EC:4.2.99.18 and MutY an A/G-specific adenine glycosylase, both have a C terminal 4Fe-4S cluster. The family also includes 8-oxoguanine DNA glycosylases. The methyl-CPG binding protein MBD4 also contains a related domain that is a thymine DNA glycosylase. The family also includes DNA-3-methyladenine glycosylase II EC:3.2.2.21 and other members of the AlkA family.¡€0€ª€0€ €CDD¡€ €BO¢€0€0€ €‚pfam00731, AIRC, AIR carboxylase. Members of this family catalyze the decarboxylation of 1-(5-phosphoribosyl)-5-amino-4-imidazole-carboxylate (AIR). This family catalyze the sixth step of de novo purine biosynthesis. Some members of this family contain two copies of this domain.¡€0€ª€0€ €CDD¡€ €¯m¢€0€0€ €\pfam00732, GMC_oxred_N, GMC oxidoreductase. This family of proteins bind FAD as a cofactor.¡€0€ª€0€ €CDD¡€ €¯n¢€0€0€ €°pfam00733, Asn_synthase, Asparagine synthase. This family is always found associated with pfam00310. Members of this family catalyze the conversion of aspartate to asparagine.¡€0€ª€0€ €CDD¡€ €¯o¢€0€0€ €4pfam00734, CBM_1, Fungal cellulose binding domain. ¡€0€ª€0€ €CDD¡€ €¯p¢€0€0€ €‚ïpfam00735, Septin, Septin. Members of this family include CDC3, CDC10, CDC11 and CDC12/Septin. Members of this family bind GTP. As regards the septins, these are polypeptides of 30-65kDa with three characteristic GTPase motifs (G-1, G-3 and G-4) that are similar to those of the Ras family. The G-4 motif is strictly conserved with a unique septin consensus of AKAD. Most septins are thought to have at least one coiled-coil region, which in some cases is necessary for intermolecular interactions that allow septins to polymerize to form rod-shaped complexes. In turn, these are arranged into tandem arrays to form filaments. They are multifunctional proteins, with roles in cytokinesis, sporulation, germ cell development, exocytosis and apoptosis.¡€0€ª€0€ €CDD¡€ €¯q¢€0€0€ €—pfam00736, EF1_GNE, EF-1 guanine nucleotide exchange domain. This family is the guanine nucleotide exchange domain of EF-1 beta and EF-1 delta chains.¡€0€ª€0€ €CDD¡€ €¯r¢€0€0€ €upfam00737, PsbH, Photosystem II 10 kDa phosphoprotein. This protein is phosphorylated in a light dependent reaction.¡€0€ª€0€ €CDD¡€ €¯s¢€0€0€ €pfam00738, Polyhedrin, Polyhedrin. These proteins are found in occlusion bodies in various viruses. The polyhedrin protein protects the virus.¡€0€ª€0€ €CDD¡€ €¯t¢€0€0€ €}pfam00739, X, Trans-activation protein X. This protein is found in hepadnaviruses where it is indispensable for replication.¡€0€ª€0€ €CDD¡€ €¬×¢€0€0€ €‚/pfam00740, Parvo_coat, Parvovirus coat protein VP2. This protein, together with VP1 forms a capsomer. Both of these proteins are formed from the same transcript using alternative splicing. As a result, VP1 and VP2 differ only in the N-terminal region of VP1. VP2 is involved in packaging the viral DNA.¡€0€ª€0€ €CDD¡€ €BX¢€0€0€ €.pfam00741, Gas_vesicle, Gas vesicle protein. ¡€0€ª€0€ €CDD¡€ €¯u¢€0€0€ €5pfam00742, Homoserine_dh, Homoserine dehydrogenase. ¡€0€ª€0€ €CDD¡€ €¯v¢€0€0€ €£pfam00743, FMO-like, Flavin-binding monooxygenase-like. This family includes FMO proteins, cyclohexanone mono-oxygenase and a number of different mono-oxygenases.¡€0€ª€0€ €CDD¡€ €¬Û¢€0€0€ €Jpfam00745, GlutR_dimer, Glutamyl-tRNAGlu reductase, dimerisation domain. ¡€0€ª€0€ €CDD¡€ €¯w¢€0€0€ €3pfam00746, Gram_pos_anchor, Gram positive anchor. ¡€0€ª€0€ €CDD¡€ €¯x¢€0€0€ €vpfam00747, Viral_DNA_bp, ssDNA binding protein. This protein is found in herpesviruses and is needed for replication.¡€0€ª€0€ €CDD¡€ €B]¢€0€0€ €ppfam00748, Calpain_inhib, Calpain inhibitor. This region is found multiple times in calpain inhibitor proteins.¡€0€ª€0€ €CDD¡€ €¯y¢€0€0€ €‚:pfam00749, tRNA-synt_1c, tRNA synthetases class I (E and Q), catalytic domain. Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only glutamyl and glutaminyl tRNA synthetases. In some organisms, a single glutamyl-tRNA synthetase aminoacylates both tRNA(Glu) and tRNA(Gln).¡€0€ª€0€ €CDD¡€ €B_¢€0€0€ €°pfam00750, tRNA-synt_1d, tRNA synthetases class I (R). Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only arginyl tRNA synthetase.¡€0€ª€0€ €CDD¡€ €B`¢€0€0€ €‚Epfam00751, DM, DM DNA binding domain. The DM domain is named after dsx and mab-3. dsx contains a single amino-terminal DM domain, whereas mab-3 contains two amino-terminal domains. The DM domain has a pattern of conserved zinc chelating residues C2H2C4. The dsx DM domain has been shown to dimerize and bind palindromic DNA.¡€0€ª€0€ €CDD¡€ €¯z¢€0€0€ €*pfam00752, XPG_N, XPG N-terminal domain. ¡€0€ª€0€ €CDD¡€ €¯{¢€0€0€ €=pfam00753, Lactamase_B, Metallo-beta-lactamase superfamily. ¡€0€ª€0€ €CDD¡€ €¯|¢€0€0€ €lpfam00754, F5_F8_type_C, F5/8 type C domain. This domain is also known as the discoidin (DS) domain family.¡€0€ª€0€ €CDD¡€ €¯}¢€0€0€ €Bpfam00755, Carn_acyltransf, Choline/Carnitine o-acyltransferase. ¡€0€ª€0€ €CDD¡€ €¯~¢€0€0€ €Æpfam00756, Esterase, Putative esterase. This family contains Esterase D. However it is not clear if all members of the family have the same function. This family is related to the pfam00135 family.¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €9pfam00757, Furin-like, Furin-like cysteine rich region. ¡€0€ª€0€ €CDD¡€ €¯€¢€0€0€ €4pfam00758, EPO_TPO, Erythropoietin/thrombopoietin. ¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €8pfam00759, Glyco_hydro_9, Glycosyl hydrolase family 9. ¡€0€ª€0€ €CDD¡€ €¯‚¢€0€0€ €3pfam00760, Cucumo_coat, Cucumovirus coat protein. ¡€0€ª€0€ €CDD¡€ €Bi¢€0€0€ €6pfam00761, Polyoma_coat2, Polyomavirus coat protein. ¡€0€ª€0€ €CDD¡€ €Ñ¢€0€0€ €,pfam00762, Ferrochelatase, Ferrochelatase. ¡€0€ª€0€ €CDD¡€ €¯ƒ¢€0€0€ €Zpfam00763, THF_DHG_CYH, Tetrahydrofolate dehydrogenase/cyclohydrolase, catalytic domain. ¡€0€ª€0€ €CDD¡€ €¯„¢€0€0€ €\pfam00764, Arginosuc_synth, Arginosuccinate synthase. This family contains a PP-loop motif.¡€0€ª€0€ €CDD¡€ €Bl¢€0€0€ €1pfam00765, Autoind_synth, Autoinducer synthase. ¡€0€ª€0€ €CDD¡€ €¯…¢€0€0€ €‚[pfam00766, ETF_alpha, Electron transfer flavoprotein FAD-binding domain. This domain found at the C-terminus of electron transfer flavoprotein alpha chain and binds to FAD. The fold consists of a five-stranded parallel beta sheet as the core of the domain, flanked by alternating helices. A small part of this domain is donated by the beta chain.¡€0€ª€0€ €CDD¡€ €¯†¢€0€0€ €/pfam00767, Poty_coat, Potyvirus coat protein. ¡€0€ª€0€ €CDD¡€ €Bo¢€0€0€ €@pfam00768, Peptidase_S11, D-alanyl-D-alanine carboxypeptidase. ¡€0€ª€0€ €CDD¡€ €Bp¢€0€0€ €¸pfam00769, ERM, Ezrin/radixin/moesin family. This family of proteins contain a band 4.1 domain (pfam00373), at their amino terminus. This family represents the rest of these proteins.¡€0€ª€0€ €CDD¡€ €¯‡¢€0€0€ €¢pfam00770, Peptidase_C5, Adenovirus endoprotease. This family of adenovirus thiol endoproteases specifically cleave Gly-Ala peptides in viral precursor peptides.¡€0€ª€0€ €CDD¡€ €Br¢€0€0€ €#pfam00771, FHIPEP, FHIPEP family. ¡€0€ª€0€ €CDD¡€ €¯ˆ¢€0€0€ €‚=pfam00772, DnaB, DnaB-like helicase N terminal domain. The hexameric helicase DnaB unwinds the DNA duplex at the Escherichia coli chromosome replication fork. Although the mechanism by which DnaB both couples ATP hydrolysis to translocation along DNA and denatures the duplex is unknown, a change in the quaternary structure of the protein involving dimerisation of the N-terminal domain has been observed and may occur during the enzymatic cycle. This N-terminal domain is required both for interaction with other proteins in the primosome and for DnaB helicase activity.¡€0€ª€0€ €CDD¡€ €¯‰¢€0€0€ €Tpfam00773, RNB, RNB domain. This domain is the catalytic domain of ribonuclease II.¡€0€ª€0€ €CDD¡€ €¯Š¢€0€0€ €(pfam00775, Dioxygenase_C, Dioxygenase. ¡€0€ª€0€ €CDD¡€ €¯‹¢€0€0€ €pfam00777, Glyco_transf_29, Glycosyltransferase family 29 (sialyltransferase). Members of this family belong to glycosyltransferase family 29.¡€0€ª€0€ €CDD¡€ €¯Œ¢€0€0€ €‚?pfam00778, DIX, DIX domain. The DIX domain is present in Dishevelled and axin. This domain is involved in homo- and hetero-oligomerization. It is involved in the homo- oligomerization of mouse axin. The axin DIX domain also interacts with the dishevelled DIX domain. The DIX domain has also been called the DAX domain.¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €‚pfam00779, BTK, BTK motif. Zinc-binding motif containing conserved cysteines and a histidine. Always found C-terminal to PH domains. The crystal structure shows this motif packs against the PH domain. The PH+Btk module pair has been called the Tec homology (TH) region.¡€0€ª€0€ €CDD¡€ €¯Ž¢€0€0€ €{pfam00780, CNH, CNH domain. Domain found in NIK1-like kinase, mouse citron and yeast ROM1, ROM2. Unpublished observations.¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €‚ãpfam00781, DAGK_cat, Diacylglycerol kinase catalytic domain. Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologs. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family.¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €âpfam00782, DSPc, Dual specificity phosphatase, catalytic domain. Ser/Thr and Tyr protein phosphatases. The enzyme's tertiary fold is highly similar to that of tyrosine-specific phosphatases, except for a "recognition" region.¡€0€ª€0€ €CDD¡€ €¯‘¢€0€0€ €‡pfam00784, MyTH4, MyTH4 domain. Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins.¡€0€ª€0€ €CDD¡€ €¯’¢€0€0€ €Ÿpfam00786, PBD, P21-Rho-binding domain. Small domains that bind Cdc42p- and/or Rho-like small GTPases. Also known as the Cdc42/Rac interactive binding (CRIB).¡€0€ª€0€ €CDD¡€ €¯“¢€0€0€ €@pfam00787, PX, PX domain. PX domains bind to phosphoinositides.¡€0€ª€0€ €CDD¡€ €¯”¢€0€0€ €‚Npfam00788, RA, Ras association (RalGDS/AF-6) domain. RasGTP effectors (in cases of AF6, canoe and RalGDS); putative RasGTP effectors in other cases. Recent evidence (not yet in MEDLINE) shows that some RA domains do NOT bind RasGTP. Predicted structure similar to that determined, and that of the RasGTP-binding domain of Raf kinase.¡€0€ª€0€ €CDD¡€ €¯•¢€0€0€ €pfam00789, UBX, UBX domain. This domain is present in ubiquitin-regulatory proteins and is a general Cdc48-interacting module.¡€0€ª€0€ €CDD¡€ €B¢€0€0€ €Dpfam00790, VHS, VHS domain. Domain present in VPS-27, Hrs and STAM.¡€0€ª€0€ €CDD¡€ €¯–¢€0€0€ €npfam00791, ZU5, ZU5 domain. Domain present in ZO-1 and Unc5-like netrin receptors Domain of unknown function.¡€0€ª€0€ €CDD¡€ €¯—¢€0€0€ €“pfam00792, PI3K_C2, Phosphoinositide 3-kinase C2. Phosphoinositide 3-kinase region postulated to contain a C2 domain. Outlier of pfam00168 family.¡€0€ª€0€ €CDD¡€ €¯˜¢€0€0€ €‚¾pfam00793, DAHP_synth_1, DAHP synthetase I family. Members of this family catalyze the first step in aromatic amino acid biosynthesis from chorismate. E-coli has three related synthetases, which are inhibited by different aromatic amino acids. This family also includes KDSA which has very similar catalytic activity but is involved in the first step of liposaccharide biosynthesis. The enzyme is also part of the shikimate pathway, EC:2.5.1.54.¡€0€ª€0€ €CDD¡€ €B„¢€0€0€ €‚pfam00794, PI3K_rbd, PI3-kinase family, ras-binding domain. Certain members of the PI3K family possess Ras-binding domains in their N-termini. These regions show some similarity (although not highly significant similarity) to Ras-binding pfam00788 domains (unpublished observation).¡€0€ª€0€ €CDD¡€ €¯™¢€0€0€ €‚ƒpfam00795, CN_hydrolase, Carbon-nitrogen hydrolase. This family contains hydrolases that break carbon-nitrogen bonds. The family includes: Nitrilase EC:3.5.5.1, Aliphatic amidase EC:3.5.1.4, Biotidinase EC:3.5.1.12, Beta-ureidopropionase EC:3.5.1.6. Nitrilase-related proteins generally have a conserved E-K-C catalytic triad, and are multimeric alpha-beta-beta-alpha sandwich proteins.¡€0€ª€0€ €CDD¡€ €¯š¢€0€0€ €?pfam00796, PSI_8, Photosystem I reaction centre subunit VIII. ¡€0€ª€0€ €CDD¡€ €B‡¢€0€0€ €‚Àpfam00797, Acetyltransf_2, N-acetyltransferase. Arylamine N-acetyltransferase (NAT) is a cytosolic enzyme of approximately 30kDa. It facilitates the transfer of an acetyl group from Acetyl Coenzyme A on to a wide range of arylamine, N-hydroxyarylamines and hydrazines. Acetylation of these compounds generally results in inactivation. NAT is found in many species from Mycobacteria (M. tuberculosis, M. smegmatis etc) to man. It was the first enzyme to be observed to have polymorphic activity amongst human individuals. NAT is responsible for the inactivation of Isoniazid (a drug used to treat Tuberculosis) in humans. The NAT protein has also been shown to be involved in the breakdown of folic acid.¡€0€ª€0€ €CDD¡€ €¯›¢€0€0€ €6pfam00798, Arena_glycoprot, Arenavirus glycoprotein. ¡€0€ª€0€ €CDD¡€ €B‰¢€0€0€ €‚kpfam00799, Gemini_AL1, Geminivirus Rep catalytic domain. The AL1 proteins encodes the replication initiator protein (Rep) of geminiviruses, which is a replicon-specific initiator enzyme and is an essential component of the replisome. For geminivirus Rep protein, this N-terminal region is crucial for origin recognition and DNA cleavage and nucleotidyl transfer.¡€0€ª€0€ €CDD¡€ €BŠ¢€0€0€ €¬pfam00800, PDT, Prephenate dehydratase. This protein is involved in Phenylalanine biosynthesis. This protein catalyzes the decarboxylation of prephenate to phenylpyruvate.¡€0€ª€0€ €CDD¡€ €¯œ¢€0€0€ €§pfam00801, PKD, PKD domain. This domain was first identified in the Polycystic kidney disease protein PKD1. This domain has been predicted to contain an Ig-like fold.¡€0€ª€0€ €CDD¡€ €BŒ¢€0€0€ €‚Œpfam00802, Glycoprotein_G, Pneumovirus attachment glycoprotein G. This family includes attachment proteins from respiratory synctial virus. Glycoprotein G has not been shown to have any neuraminidase or hemagglutinin activity. The amino terminus is thought to be cytoplasmic, and the carboxyl terminus extracellular. The extracellular region contains four completely conserved cysteine residues.¡€0€ª€0€ €CDD¡€ €4¢€0€0€ €‚èpfam00803, 3A, 3A/RNA2 movement protein family. This family includes movement proteins from various viruses. The 3A protein is found in bromoviruses and Cucumoviruses. The genome of these viruses contain 3 RNA segments. The third segment (RNA 3) contains two proteins, the coat protein and the 3A protein. The function of the 3A protein is uncertain but has been shown to be involved in cell-to- cell movement of the virus. The family also includes movement proteins from Dianthoviruses.¡€0€ª€0€ €CDD¡€ €B¢€0€0€ €‚²pfam00804, Syntaxin, Syntaxin. Syntaxins are the prototype family of SNARE proteins. They usually consist of three main regions - a C-terminal transmembrane region, a central SNARE domain which is characteristic of and conserved in all syntaxins (pfam05739), and an N-terminal domain that is featured in this entry. This domain varies between syntaxin isoforms; in syntaxin 1A it is found as three alpha-helices with a left-handed twist. It may fold back on the SNARE domain to allow the molecule to adopt a 'closed' configuration that prevents formation of the core fusion complex - it thus has an auto-inhibitory role. The function of syntaxins is determined by their localization. They are involved in neuronal exocytosis, ER-Golgi transport and Golgi-endosome transport, for example. They also interact with other proteins as well as those involved in SNARE complexes. These include vesicle coat proteins, Rab GTPases, and tethering factors.¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €‚cpfam00805, Pentapeptide, Pentapeptide repeats (8 copies). These repeats are found in many cyanobacterial proteins. The repeats were first identified in hglK. The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid.¡€0€ª€0€ €CDD¡€ €¯ž¢€0€0€ €‚pfam00806, PUF, Pumilio-family RNA binding repeat. Puf repeats (aka PUM-HD, Pumilio homology domain) are necessary and sufficient for sequence specific RNA binding in fly Pumilio and worm FBF-1 and FBF-2. Both proteins function as translational repressors in early embryonic development by binding sequences in the 3' UTR of target mRNAs (e.g. the nanos response element (NRE) in fly Hunchback mRNA, or the point mutation element (PME) in worm fem-3 mRNA). Other proteins that contain Puf domains are also plausible RNA binding proteins. Puf domains usually occur as a tandem repeat of 8 domains. The Pfam model does not necessarily recognize all 8 repeats in all sequences; some sequences appear to have 5 or 6 repeats on initial analysis, but further analysis suggests the presence of additional divergent repeats. Structures of PUF repeat proteins show they consist of a two helix structure.¡€0€ª€0€ €CDD¡€ €Ñ"¢€0€0€ €‚pfam00807, Apidaecin, Apidaecin. These antibacterial peptides are found in bees. These heat-stable, non-helical peptides are active against a wide range of plant-associated bacteria and some human pathogens. The Pfam alignment includes the propeptide and apidaecin sequence.¡€0€ª€0€ €CDD¡€ €¯Ÿ¢€0€0€ €Ãpfam00808, CBFD_NFYB_HMF, Histone-like transcription factor (CBF/NF-Y) and archaeal histone. This family includes archaebacterial histones and histone like transcription factors from eukaryotes.¡€0€ª€0€ €CDD¡€ €B‘¢€0€0€ €‚ëpfam00809, Pterin_bind, Pterin binding enzyme. This family includes a variety of pterin binding enzymes that all adopt a TIM barrel fold. The family includes dihydropteroate synthase EC:2.5.1.15 as well as a group methyltransferase enzymes including methyltetrahydrofolate, corrinoid iron-sulfur protein methyltransferase (MeTr) that catalyzes a key step in the Wood-Ljungdahl pathway of carbon dioxide fixation. It transfers the N5-methyl group from methyltetrahydrofolate (CH3-H4folate) to a cob(I)amide centre in another protein, the corrinoid iron-sulfur protein. MeTr is a member of a family of proteins that includes methionine synthase and methanogenic enzymes that activate the methyl group of methyltetra-hydromethano(or -sarcino)pterin.¡€0€ª€0€ €CDD¡€ €¯ ¢€0€0€ €Bpfam00810, ER_lumen_recept, ER lumen protein retaining receptor. ¡€0€ª€0€ €CDD¡€ €¯¡¢€0€0€ €"pfam00811, Ependymin, Ependymin. ¡€0€ª€0€ €CDD¡€ €¯¢¢€0€0€ €pfam00812, Ephrin, Ephrin. ¡€0€ª€0€ €CDD¡€ €¯£¢€0€0€ €pfam00813, FliP, FliP family. ¡€0€ª€0€ €CDD¡€ €¯¤¢€0€0€ €‚Àpfam00814, Peptidase_M22, Glycoprotease family. The Peptidase M22 proteins are part of the HSP70-actin superfamily. The region represented here is an insert into the fold and is not found in the rest of the family (beyond the Peptidase M22 family). Included in this family are the Rhizobial NodU proteins and the HypF regulator. This region also contains the histidine dyad believed to coordinate the metal ion and hence provide catalytic activity. Interestingly the histidines are not well conserved, and there is a lack of experimental evidence to support peptidase activity as a general property of this family. There also appear to be instances of this domain outside of the HSP70-actin superfamily.¡€0€ª€0€ €CDD¡€ €¯¥¢€0€0€ €5pfam00815, Histidinol_dh, Histidinol dehydrogenase. ¡€0€ª€0€ €CDD¡€ €¯¦¢€0€0€ €.pfam00816, Histone_HNS, H-NS histone family. ¡€0€ª€0€ €CDD¡€ €¯§¢€0€0€ €Upfam00817, IMS, impB/mucB/samB family. These proteins are involved in UV protection.¡€0€ª€0€ €CDD¡€ €¯¨¢€0€0€ €;pfam00818, Ice_nucleation, Ice nucleation protein repeat. ¡€0€ª€0€ €CDD¡€ €¯©¢€0€0€ €‚pfam00819, Myotoxins, Myotoxin, crotamine. Crotamine is a family of cationic peptides expressed by the venom gland of, for example, Crotalus durissus terrificus. It acts as a cell-penetrating peptide (CPP) and as a potent voltage-gated potassium channel (Kv) inhibitor.¡€0€ª€0€ €CDD¡€ €­#¢€0€0€ €›pfam00820, Lipoprotein_1, Borrelia lipoprotein. This family of lipoproteins is found in Borrelia spirochetes. The function of these proteins is uncertain.¡€0€ª€0€ €CDD¡€ €¯ª¢€0€0€ €ˆpfam00821, PEPCK, Phosphoenolpyruvate carboxykinase. catalyzes the formation of phosphoenolpyruvate by decarboxylation of oxaloacetate.¡€0€ª€0€ €CDD¡€ €¯«¢€0€0€ €;pfam00822, PMP22_Claudin, PMP-22/EMP/MP20/Claudin family. ¡€0€ª€0€ €CDD¡€ €Ñ-¢€0€0€ €‚‹pfam00823, PPE, PPE family. This family named after a PPE motif near to the amino terminus of the domain. The PPE family of proteins all contain an amino-terminal region of about 180 amino acids. The carboxyl terminus of this family are variable, and on the basis of this region fall into at least three groups. The MPTR subgroup has tandem copies of a motif NXGXGNXG. The second subgroup contains a conserved motif at about position 350. The third group are only related in the amino terminal region. The function of these proteins is uncertain but it has been suggested that they may be related to antigenic variation of Mycobacterium tuberculosis.¡€0€ª€0€ €CDD¡€ €¯¬¢€0€0€ €,pfam00825, Ribonuclease_P, Ribonuclease P. ¡€0€ª€0€ €CDD¡€ €BŸ¢€0€0€ €+pfam00827, Ribosomal_L15e, Ribosomal L15. ¡€0€ª€0€ €CDD¡€ €¯­¢€0€0€ €‚pfam00828, Ribosomal_L27A, Ribosomal proteins 50S-L15, 50S-L18e, 60S-L27A. This family includes higher eukaryotic ribosomal 60S L27A, archaeal 50S L18e, prokaryotic 50S L15, fungal mitochondrial L10, plant L27A, mitochondrial L15 and chloroplast L18-3 proteins.¡€0€ª€0€ €CDD¡€ €¯®¢€0€0€ €?pfam00829, Ribosomal_L21p, Ribosomal prokaryotic L21 protein. ¡€0€ª€0€ €CDD¡€ €¯¯¢€0€0€ €‚pfam00830, Ribosomal_L28, Ribosomal L28 family. The ribosomal 28 family includes L28 proteins from bacteria and chloroplasts. The L24 protein from yeast also contains a region of similarity to prokaryotic L28 proteins. L24 from yeast is also found in the large ribosomal subunit.¡€0€ª€0€ €CDD¡€ €¯°¢€0€0€ €2pfam00831, Ribosomal_L29, Ribosomal L29 protein. ¡€0€ª€0€ €CDD¡€ €¯±¢€0€0€ €2pfam00832, Ribosomal_L39, Ribosomal L39 protein. ¡€0€ª€0€ €CDD¡€ €¯²¢€0€0€ €+pfam00833, Ribosomal_S17e, Ribosomal S17. ¡€0€ª€0€ €CDD¡€ €¯³¢€0€0€ €žpfam00834, Ribul_P_3_epim, Ribulose-phosphate 3 epimerase family. This enzyme catalyzes the conversion of D-ribulose 5-phosphate into D-xylulose 5-phosphate.¡€0€ª€0€ €CDD¡€ €B§¢€0€0€ €ñpfam00835, SNAP-25, SNAP-25 family. SNAP-25 (synaptosome-associated protein 25 kDa) proteins are components of SNARE complexes. Members of this family contain a cluster of cysteine residues that can be palmitoylated for membrane attachment.¡€0€ª€0€ €CDD¡€ €¯´¢€0€0€ €‚pfam00836, Stathmin, Stathmin family. The Stathmin family of proteins play an important role in the regulation of the microtubule cytoskeleton. They regulate microtubule dynamics by promoting depolymerization of microtubules and/or preventing polymerization of tubulin heterodimers.¡€0€ª€0€ €CDD¡€ €¯µ¢€0€0€ €†pfam00837, T4_deiodinase, Iodothyronine deiodinase. Iodothyronine deiodinase converts thyroxine (T4) to 3,5,3'-triiodothyronine (T3).¡€0€ª€0€ €CDD¡€ €Bª¢€0€0€ €pfam00843, Arena_nucleocap, Arenavirus nucleocapsid protein. ¡€0€ª€0€ €CDD¡€ €B°¢€0€0€ €‚‡pfam00844, Gemini_coat, Geminivirus coat protein/nuclear export factor BR1 family. It has been shown that the 104 N-terminal amino acids of the maize streak virus coat protein bind DNA non- specifically. This family also includes various geminivirus movement proteins that are nuclear export factors or shuttles. One member BR1 facilitates the export of both ds and ss DNA form the nucleus.¡€0€ª€0€ €CDD¡€ €¯º¢€0€0€ €Ïpfam00845, Gemini_BL1, Geminivirus BL1 movement protein. Geminiviruses encode two movement proteins that are essential for systemic infection of their host but dispensable for replication and encapsidation.¡€0€ª€0€ €CDD¡€ €¯»¢€0€0€ €>pfam00846, Hanta_nucleocap, Hantavirus nucleocapsid protein. ¡€0€ª€0€ €CDD¡€ €B²¢€0€0€ €}pfam00847, AP2, AP2 domain. This 60 amino acid residue domain can bind to DNA and is found in transcription factor proteins.¡€0€ª€0€ €CDD¡€ €¯¼¢€0€0€ €øpfam00848, Ring_hydroxyl_A, Ring hydroxylating alpha subunit (catalytic domain). This family is the catalytic domain of aromatic-ring- hydroxylating dioxygenase systems. The active site contains a non-heme ferrous ion coordinated by three ligands.¡€0€ª€0€ €CDD¡€ €¯½¢€0€0€ €‚npfam00849, PseudoU_synth_2, RNA pseudouridylate synthase. Members of this family are involved in modifying bases in RNA molecules. They carry out the conversion of uracil bases to pseudouridine. This family includes RluD, a pseudouridylate synthase that converts specific uracils to pseudouridine in 23S rRNA. RluA from E. coli converts bases in both rRNA and tRNA.¡€0€ª€0€ €CDD¡€ €¯¾¢€0€0€ €‚,pfam00850, Hist_deacetyl, Histone deacetylase domain. Histones can be reversibly acetylated on several lysine residues. Regulation of transcription is caused in part by this mechanism. Histone deacetylases catalyze the removal of the acetyl group. Histone deacetylases are related to other proteins.¡€0€ª€0€ €CDD¡€ €¯¿¢€0€0€ €spfam00851, Peptidase_C6, Helper component proteinase. This protein is found in genome polyproteins of potyviruses.¡€0€ª€0€ €CDD¡€ €B·¢€0€0€ €‚wpfam00852, Glyco_transf_10, Glycosyltransferase family 10 (fucosyltransferase) C-term. This is the C-terminal domain of a family of fucosyltransferases. This enzyme transfers fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10. The C-terminal domain is the likely binding-region for ADP (manuscript in publication).¡€0€ª€0€ €CDD¡€ €¯À¢€0€0€ €pfam00853, Runt, Runt domain. ¡€0€ª€0€ €CDD¡€ €¯Á¢€0€0€ €†pfam00854, PTR2, POT family. The POT (proton-dependent oligopeptide transport) family all appear to be proton dependent transporters.¡€0€ª€0€ €CDD¡€ €Bº¢€0€0€ €‚pfam00855, PWWP, PWWP domain. The PWWP domain is named after a conserved Pro-Trp-Trp-Pro motif. The domain binds to Histone-4 methylated at lysine-20, H4K20me, suggesting that it is methyl-lysine recognition motif. Removal of two conserved aromatic residues in a hydrophobic cavity created by this domain within the full-length protein, Pdp1, abolishes the interaction o f the protein with H4K20me3. In fission yeast, Set9 is the sole enzyme that catalyzes all three states of H4K20me, and Set9-mediated H4K20me is required for efficient recruitment of checkpoint protein Crb2 to sites of DNA damage. The methylation of H4K20 is involved in a diverse array of cellular processes, such as organising higher-order chromatin, maintaining genome stability, and regulating cell-cycle progression.¡€0€ª€0€ €CDD¡€ €¯Â¢€0€0€ €‚pfam00856, SET, SET domain. SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure.¡€0€ª€0€ €CDD¡€ €¯Ã¢€0€0€ €Wpfam00857, Isochorismatase, Isochorismatase family. This family are hydrolase enzymes.¡€0€ª€0€ €CDD¡€ €¯Ä¢€0€0€ €5pfam00858, ASC, Amiloride-sensitive sodium channel. ¡€0€ª€0€ €CDD¡€ €¯Å¢€0€0€ €Fpfam00859, CTF_NFI, CTF/NF-I family transcription modulation region. ¡€0€ª€0€ €CDD¡€ €¯Æ¢€0€0€ €‚6pfam00860, Xan_ur_permease, Permease family. This family includes permeases for diverse substrates such as xanthine, uracil, and vitamin C. However many members of this family are functionally uncharacterized and may transport other substrates. Members of this family have ten predicted transmembrane helices.¡€0€ª€0€ €CDD¡€ €BÀ¢€0€0€ €‚ pfam00861, Ribosomal_L18p, Ribosomal L18 of archaea, bacteria, mitoch. and chloroplast. This family includes the large subunit ribosomal proteins from bacteria, archaea, the mitochondria and the chloroplast. It does not include the 60S L18 or L5 proteins from Metazoa.¡€0€ª€0€ €CDD¡€ €¯Ç¢€0€0€ €‚2pfam00862, Sucrose_synth, Sucrose synthase. Sucrose synthases catalyze the synthesis of sucrose from UDP-glucose and fructose. This family includes the bulk of the sucrose synthase protein. However the carboxyl terminal region of the sucrose synthases belongs to the glycosyl transferase family pfam00534.¡€0€ª€0€ €CDD¡€ €¯È¢€0€0€ €ypfam00863, Peptidase_C4, Peptidase family C4. This peptidase is present in the nuclear inclusion protein of potyviruses.¡€0€ª€0€ €CDD¡€ €Bâ€0€0€ €,pfam00864, P2X_receptor, ATP P2X receptor. ¡€0€ª€0€ €CDD¡€ €¯É¢€0€0€ €&pfam00865, Osteopontin, Osteopontin. ¡€0€ª€0€ €CDD¡€ €¯Ê¢€0€0€ €†pfam00866, Ring_hydroxyl_B, Ring hydroxylating beta subunit. This subunit has a similar structure to NTF-2 and scytalone dehydratase.¡€0€ª€0€ €CDD¡€ €¯Ë¢€0€0€ €!pfam00867, XPG_I, XPG I-region. ¡€0€ª€0€ €CDD¡€ €¯Ì¢€0€0€ €2pfam00868, Transglut_N, Transglutaminase family. ¡€0€ª€0€ €CDD¡€ €¯Í¢€0€0€ €Xpfam00869, Flavi_glycoprot, Flavivirus glycoprotein, central and dimerisation domains. ¡€0€ª€0€ €CDD¡€ €BÉ¢€0€0€ €åpfam00870, P53, P53 DNA-binding domain. This family contains one anomalous member, viz: Zea mays (Q6JAD8). This sequence is identical to human P53 and would appear to be a a human contaminant within the Zea mays sampling effort.¡€0€ª€0€ €CDD¡€ €¯Î¢€0€0€ €ƒpfam00871, Acetate_kinase, Acetokinase family. This family includes acetate kinase, butyrate kinase and 2-methylpropanoate kinase.¡€0€ª€0€ €CDD¡€ €BË¢€0€0€ €:pfam00872, Transposase_mut, Transposase, Mutator family. ¡€0€ª€0€ €CDD¡€ €¯Ï¢€0€0€ €‚pfam00873, ACR_tran, AcrB/AcrD/AcrF family. Members of this family are integral membrane proteins. Some are involved in drug resistance. AcrB cooperates with a membrane fusion protein, AcrA, and an outer membrane channel TolC. The structure shows the AcrB forms a homotrimer.¡€0€ª€0€ €CDD¡€ €¯Ð¢€0€0€ €‚"pfam00874, PRD, PRD domain. The PRD domain (for PTS Regulation Domain), is the phosphorylatable regulatory domain found in bacterial transcriptional antiterminator such as BglG, SacY and LicT, as well as in activators such as MtlR and LevR. The PRD is phosphorylated on one or two conserved histidine residues. PRD-containing proteins are involved in the regulation of catabolic operons in Gram+ and Gram- bacteria and are often characterized by a short N-terminal effector domain that binds to either RNA (CAT-RBD for antiterminators pfam03123) or DNA (for activators), and a duplicated PRD module which is phosphorylated by the sugar phosphotransferase system (PTS) in response to the availability of carbon source. The phosphorylations modify the conformation and stability of the dimeric proteins and thereby the RNA- or DNA-binding activity of the effector domain. The structure of the LicT PRD domains has been solved in both the active (Structure 1h99) and inactive state (Structure 1tlv), revealing massive structural rearrangements upon activation.¡€0€ª€0€ €CDD¡€ €B΢€0€0€ €Zpfam00875, DNA_photolyase, DNA photolyase. This domain binds a light harvesting cofactor.¡€0€ª€0€ €CDD¡€ €¯Ñ¢€0€0€ €‚ pfam00876, Innexin, Innexin. This family includes the Drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins.¡€0€ª€0€ €CDD¡€ €¯Ò¢€0€0€ €spfam00877, NLPC_P60, NlpC/P60 family. The function of this domain is unknown. It is found in several lipoproteins.¡€0€ª€0€ €CDD¡€ €¯Ó¢€0€0€ €pfam00878, CIMR, Cation-independent mannose-6-phosphate receptor repeat. The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat.¡€0€ª€0€ €CDD¡€ €¯Ô¢€0€0€ €2pfam00879, Defensin_propep, Defensin propeptide. ¡€0€ª€0€ €CDD¡€ €¯Õ¢€0€0€ €%pfam00880, Nebulin, Nebulin repeat. ¡€0€ª€0€ €CDD¡€ €¯Ö¢€0€0€ €Äpfam00881, Nitroreductase, Nitroreductase family. The nitroreductase family comprises a group of FMN- or FAD-dependent and NAD(P)H-dependent enzymes able to metabolize nitrosubstituted compounds.¡€0€ª€0€ €CDD¡€ €¯×¢€0€0€ €9pfam00882, Zn_dep_PLPC, Zinc dependent phospholipase C. ¡€0€ª€0€ €CDD¡€ €¯Ø¢€0€0€ €Ôpfam00883, Peptidase_M17, Cytosol aminopeptidase family, catalytic domain. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase.¡€0€ª€0€ €CDD¡€ €¯Ù¢€0€0€ €"pfam00884, Sulfatase, Sulfatase. ¡€0€ª€0€ €CDD¡€ €¯Ú¢€0€0€ €‚¦pfam00885, DMRL_synthase, 6,7-dimethyl-8-ribityllumazine synthase. This family includes the beta chain of 6,7-dimethyl-8- ribityllumazine synthase EC:2.5.1.9, an enzyme involved in riboflavin biosynthesis. The family also includes a subfamily of distant archaebacterial proteins that may also have the same function. The family contains a number of different subsets including a family of proteins comprising archaeal lumazine and riboflavin synthases, type I lumazine synthases, and the eubacterial type II lumazine synthases. It has been established that lumazine synthase catalyzes the penultimate step in the biosynthesis of riboflavin in plants and microorganisms. The type I lumazine synthases area active in pentameric or icosahedral quaternary assemblies, whereas the type II are decameric. Brucella, a bacterial genus that causes brucellosis, and other Rhizobiales have an atypical riboflavin metabolic pathway. Brucella spp code for both a type-I and a type-II lumazine synthase, and it has been shown that at least one of these two has to be present in order for Brucella to be viable, showing that in the case of Brucella flavin metabolism is implicated in bacterial virulence.¡€0€ª€0€ €CDD¡€ €¯Û¢€0€0€ €2pfam00886, Ribosomal_S16, Ribosomal protein S16. ¡€0€ª€0€ €CDD¡€ €¯Ü¢€0€0€ €,pfam00887, ACBP, Acyl CoA binding protein. ¡€0€ª€0€ €CDD¡€ €¯Ý¢€0€0€ €#pfam00888, Cullin, Cullin family. ¡€0€ª€0€ €CDD¡€ €¯Þ¢€0€0€ €)pfam00889, EF_TS, Elongation factor TS. ¡€0€ª€0€ €CDD¡€ €¯ß¢€0€0€ €ûpfam00890, FAD_binding_2, FAD binding domain. This family includes members that bind FAD. This family includes the flavoprotein subunits from succinate and fumarate dehydrogenase, aspartate oxidase and the alpha subunit of adenylylsulphate reductase.¡€0€ª€0€ €CDD¡€ €¯à¢€0€0€ €“pfam00891, Methyltransf_2, O-methyltransferase. This family includes a range of O-methyltransferases. These enzymes utilize S-adenosyl methionine.¡€0€ª€0€ €CDD¡€ €¯á¢€0€0€ €‚Rpfam00892, EamA, EamA-like transporter family. This family includes many hypothetical membrane proteins of unknown function. Many of the proteins contain two copies of the aligned region. The family used to be known as DUF6. Members of this family usually carry 5+5 transmembrane domains, and this domain attempts to model five of these.¡€0€ª€0€ €CDD¡€ €¯â¢€0€0€ €‚pfam00893, Multi_Drug_Res, Small Multidrug Resistance protein. This family is the Small Multidrug Resistance (SMR) family. Several members have been shown to export a range of toxins, including ethidium bromide and quaternary ammonium compounds, through coupling with proton influx.¡€0€ª€0€ €CDD¡€ €Bᢀ0€0€ €1pfam00894, Luteo_coat, Luteovirus coat protein. ¡€0€ª€0€ €CDD¡€ €B⢀0€0€ €0pfam00895, ATP-synt_8, ATP synthase protein 8. ¡€0€ª€0€ €CDD¡€ €B㢀0€0€ €Àpfam00897, Orbi_VP7, Orbivirus inner capsid protein VP7. In BTV, 260 trimers of VP7 are found in the core. The major proteins of the core are VP7 and VP3. VP7 forms an outer layer around VP3.¡€0€ª€0€ €CDD¡€ €B䢀0€0€ €ªpfam00898, Orbi_VP2, Orbivirus outer capsid protein VP2. VP2 acts as an anchor for VP1 and VP3. VP2 contains a non-specific DNA and RNA binding domain in the N-terminus.¡€0€ª€0€ €CDD¡€ €¯ã¢€0€0€ €»pfam00899, ThiF, ThiF family. This domain is found in ubiquitin activating E1 family and members of the bacterial ThiF/MoeB/HesA family. It is repeated in Ubiquitin-activating enzyme E1.¡€0€ª€0€ €CDD¡€ €B梀0€0€ €1pfam00900, Ribosomal_S4e, Ribosomal family S4e. ¡€0€ª€0€ €CDD¡€ €¯ä¢€0€0€ €¦pfam00901, Orbi_VP5, Orbivirus outer capsid protein VP5. cryoelectron microscopy indicates that VP5 is a trimer implying that there are 360 copies of VP5 per virion.¡€0€ª€0€ €CDD¡€ €B袀0€0€ €‚pfam00902, TatC, Sec-independent protein translocase protein (TatC). The bacterial Tat system has a remarkable ability to transport folded proteins even enzyme complexes across the cytoplasmic membrane. It is structurally and mechanistically similar to the Delta pH-driven thylakoidal protein import pathway. A functional Tat system or Delta pH-dependent pathway requires three integral membrane proteins: TatA/Tha4, TatB/Hcf106 and TatC/cpTatC. The TatC protein is essential for the function of both pathways. It might be involved in twin-arginine signal peptide recognition, protein translocation and proton translocation. Sequence analysis predicts that TatC contains six transmembrane helices (TMHs), and experimental data confirmed that N- and C-termini of TatC or cpTatC are exposed to the cytoplasmic or stromal face of the membrane. The cytoplasmic N-terminus and the first cytoplasmic loop region of the Escherichia coli TatC protein are essential for protein export. At least two TatC molecules co-exist within each Tat translocon.¡€0€ª€0€ €CDD¡€ €¯å¢€0€0€ €Ypfam00903, Glyoxalase, Glyoxalase/Bleomycin resistance protein/Dioxygenase superfamily. ¡€0€ª€0€ €CDD¡€ €¯æ¢€0€0€ €+pfam00904, Involucrin, Involucrin repeat. ¡€0€ª€0€ €CDD¡€ €¯ç¢€0€0€ €pfam00905, Transpeptidase, Penicillin binding protein transpeptidase domain. The active site serine is conserved in all members of this family.¡€0€ª€0€ €CDD¡€ €¯è¢€0€0€ €‚pfam00906, Hepatitis_core, Hepatitis core antigen. The core antigen of hepatitis viruses possesses a carboxyl terminus rich in arginine. On this basis it was predicted that the core antigen would bind DNA. There is some experimental evidence to support this.¡€0€ª€0€ €CDD¡€ €Bí¢€0€0€ €‚tpfam00907, T-box, T-box. The T-box encodes a 180 amino acid domain that binds to DNA. Genes encoding T-box proteins are found in a wide range of animals, but not in other kingdoms such as plants. Family members are all thought to bind to the DNA consensus sequence TCACACCT. they are found exclusively in the nucleus, and perform DNA-binding and transcriptional activation/repression roles. They are generally required for development of the specific tissues they are expressed in, and mutations in T-box genes are implicated in human conditions such as DiGeorge syndrome and X-linked cleft palate, which feature malformations.¡€0€ª€0€ €CDD¡€ €¯é¢€0€0€ €ßpfam00908, dTDP_sugar_isom, dTDP-4-dehydrorhamnose 3,5-epimerase. This family catalyze the isomerisation of dTDP-4-dehydro-6-deoxy -D-glucose with dTDP-4-dehydro-6-deoxy-L-mannose. The EC number of this enzyme is 5.1.3.13.¡€0€ª€0€ €CDD¡€ €¯ê¢€0€0€ €:pfam00909, Ammonium_transp, Ammonium Transporter Family. ¡€0€ª€0€ €CDD¡€ €¯ë¢€0€0€ €ápfam00910, RNA_helicase, RNA helicase. This family includes RNA helicases thought to be involved in duplex unwinding during viral RNA replication. Members of this family are found in a variety of single stranded RNA viruses.¡€0€ª€0€ €CDD¡€ €Bð¢€0€0€ €‚ pfam00912, Transgly, Transglycosylase. The penicillin-binding proteins are bifunctional proteins consisting of transglycosylase and transpeptidase in the N- and C-terminus respectively. The transglycosylase domain catalyzes the polymerization of murein glycan chains.¡€0€ª€0€ €CDD¡€ €¯ì¢€0€0€ €‚(pfam00913, Trypan_glycop, Trypanosome variant surface glycoprotein (A-type). The trypanosome parasite expresses these proteins to evade the immune response. This family includes a variety of surface proteins such as Trypanosoma brucei VSGs such as expression site associated gene (ESAG) 6 and 7.¡€0€ª€0€ €CDD¡€ €¯í¢€0€0€ €3pfam00915, Calici_coat, Calicivirus coat protein. ¡€0€ª€0€ €CDD¡€ €¯î¢€0€0€ €‚Ypfam00916, Sulfate_transp, Sulfate permease family. This family of integral membrane proteins are known as the Sulfate Permease (SulP) family. SulP is a large family found in all domains of life. Although sulfate is a commonly transported ion there are many other activities in this family. See the TCDB description for a comprehensive summary.¡€0€ª€0€ €CDD¡€ €Bô¢€0€0€ € pfam00917, MATH, MATH domain. This motif has been called the Meprin And TRAF-Homology (MATH) domain. This domain is hugely expanded in the nematode C. elegans.¡€0€ª€0€ €CDD¡€ €Bõ¢€0€0€ €5pfam00918, Gastrin, Gastrin/cholecystokinin family. ¡€0€ª€0€ €CDD¡€ €¯ï¢€0€0€ €‚.pfam00919, UPF0004, Uncharacterized protein family UPF0004. This family is the N terminal half of the Prosite family. The C-terminal half has been shown to be related to MiaB proteins. This domain is a nearly always found in conjunction with pfam04055 and pfam01938 although its function is uncertain.¡€0€ª€0€ €CDD¡€ €¯ð¢€0€0€ €*pfam00920, ILVD_EDD, Dehydratase family. ¡€0€ª€0€ €CDD¡€ €¯ñ¢€0€0€ €›pfam00921, Lipoprotein_2, Borrelia lipoprotein. This family of lipoproteins is found in Borrelia spirochetes. The function of these proteins is uncertain.¡€0€ª€0€ €CDD¡€ €¯ò¢€0€0€ €:pfam00922, Phosphoprotein, Vesiculovirus phosphoprotein. ¡€0€ª€0€ €CDD¡€ €Bú¢€0€0€ €‚²pfam00923, TAL_FSA, Transaldolase/Fructose-6-phosphate aldolase. Transaldolase (TAL) is an enzyme of the pentose phosphate pathway (PPP) found almost ubiquitously in the three domains of life (Archaea, Bacteria, and Eukarya). TAL shares a high degree of structural similarity and sequence identity with fructose-6-phosphate aldolase (FSA). They both belong to the class I aldolase family. Their protein structures have been revealed.¡€0€ª€0€ €CDD¡€ €¯ó¢€0€0€ €‚bpfam00924, MS_channel, Mechanosensitive ion channel. Two members of this protein family of M. jannaschii have been functionally characterized. Both proteins form mechanosensitive (MS) ion channels upon reconstitution into liposomes and functional examination by the patch-clamp technique. Therefore this family are likely to also be MS channel proteins.¡€0€ª€0€ €CDD¡€ €¯ô¢€0€0€ €pfam00925, GTP_cyclohydro2, GTP cyclohydrolase II. GTP cyclohydrolase II catalyzes the first committed step in the biosynthesis of riboflavin.¡€0€ª€0€ €CDD¡€ €¯õ¢€0€0€ €‚pfam00926, DHBP_synthase, 3,4-dihydroxy-2-butanone 4-phosphate synthase. 3,4-Dihydroxy-2-butanone 4-phosphate is biosynthesized from ribulose 5-phosphate and serves as the biosynthetic precursor for the xylene ring of riboflavin. Sometimes found as a bifunctional enzyme with pfam00925.¡€0€ª€0€ €CDD¡€ €¯ö¢€0€0€ €Mpfam00927, Transglut_C, Transglutaminase family, C-terminal ig like domain. ¡€0€ª€0€ €CDD¡€ €¯÷¢€0€0€ €‚pfam00928, Adap_comp_sub, Adaptor complexes medium subunit family. This family also contains members which are coatomer subunits.¡€0€ª€0€ €CDD¡€ €¯ø¢€0€0€ €¡pfam00929, RNase_T, Exonuclease. This family includes a variety of exonuclease proteins, such as ribonuclease T and the epsilon subunit of DNA polymerase III.;.¡€0€ª€0€ €CDD¡€ €C¢€0€0€ €×pfam00930, DPPIV_N, Dipeptidyl peptidase IV (DPP IV) N-terminal region. This family is an alignment of the region to the N-terminal side of the active site. The Prosite motif does not correspond to this Pfam entry.¡€0€ª€0€ €CDD¡€ €¯ù¢€0€0€ €#pfam00931, NB-ARC, NB-ARC domain. ¡€0€ª€0€ €CDD¡€ €¯ú¢€0€0€ €‚Rpfam00932, LTD, Lamin Tail Domain. The lamin-tail domain (LTD), which has an immunoglobulin (Ig) fold, is found in Nuclear Lamins, Chlo1887 from Chloroflexus, and several bacterial proteins where it occurs with membrane associated hydrolases of the metallo-beta-lactamase,synaptojanin, and calcineurin-like phosphoesterase superfamilies.¡€0€ª€0€ €CDD¡€ €¯û¢€0€0€ €Jpfam00933, Glyco_hydro_3, Glycosyl hydrolase family 3 N terminal domain. ¡€0€ª€0€ €CDD¡€ €¯ü¢€0€0€ €‚pfam00934, PE, PE family. This family named after a PE motif near to the amino terminus of the domain. The PE family of proteins all contain an amino-terminal region of about 110 amino acids. The carboxyl terminus of this family are variable and fall into several classes. The largest class of PE proteins is the highly repetitive PGRS class which have a high glycine content. The function of these proteins is uncertain but it has been suggested that they may be related to antigenic variation of Mycobacterium tuberculosis.¡€0€ª€0€ €CDD¡€ €¯ý¢€0€0€ €2pfam00935, Ribosomal_L44, Ribosomal protein L44. ¡€0€ª€0€ €CDD¡€ €¯þ¢€0€0€ €‚pfam00936, BMC, BMC domain. Bacterial microcompartments are primitive organelles composed entirely of protein subunits. The prototypical bacterial microcompartment is the carboxysome, a protein shell for sequestering carbon fixation reactions. These proteins for hexameric structure.¡€0€ª€0€ €CDD¡€ €¯ÿ¢€0€0€ €?pfam00937, Corona_nucleoca, Coronavirus nucleocapsid protein. ¡€0€ª€0€ €CDD¡€ €C ¢€0€0€ €[pfam00938, Lipoprotein_3, Lipoprotein. This family of lipoproteins is Mycoplasma specific.¡€0€ª€0€ €CDD¡€ €C ¢€0€0€ €½pfam00939, Na_sulph_symp, Sodium:sulfate symporter transmembrane region. There are also some members in this family that do not match the Prosite motif, and belong to the subfamily SODIT1.¡€0€ª€0€ €CDD¡€ €C ¢€0€0€ €dpfam00940, RNA_pol, DNA-dependent RNA polymerase. This is a family of single chain RNA polymerases.¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €Npfam00941, FAD_binding_5, FAD binding domain in molybdopterin dehydrogenase. ¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €-pfam00942, CBM_3, Cellulose binding domain. ¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €ªpfam00943, Alpha_E2_glycop, Alphavirus E2 glycoprotein. E2 forms a heterodimer with E1. The virus spikes are made up of 80 trimers of these heterodimers (sindbis virus).¡€0€ª€0€ €CDD¡€ €C¢€0€0€ €‚$pfam00944, Peptidase_S3, Alphavirus core protein. Also known as coat protein C and capsid protein C. This makes the literature very confusing. Alphaviruses consist of a nucleoprotein core, a lipid membrane which envelopes the core, and glycoprotein spikes protruding from the lipid membrane.¡€0€ª€0€ €CDD¡€ €C¢€0€0€ €‚jpfam00945, Rhabdo_ncap, Rhabdovirus nucleocapsid protein. The Nucleocapsid (N) Protein is said to have a "tight" structure. The carboxyl end of the N-terminal domain possesses an RNA binding domain. Sequence alignments show 2 regions of reasonable conservation, approx. 64-103 and 201-329. A whole functional protein is required for encapsidation to take place.¡€0€ª€0€ €CDD¡€ €C¢€0€0€ €‚Âpfam00946, Mononeg_RNA_pol, Mononegavirales RNA dependent RNA polymerase. Members of the Mononegavirales including the Paramyxoviridae, like other non-segmented negative strand RNA viruses, have an RNA-dependent RNA polymerase composed of two subunits, a large protein L and a phosphoprotein P. This is a protein family of the L protein. The L protein confers the RNA polymerase activity on the complex. The P protein acts as a transcription factor.¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €xpfam00947, Pico_P2A, Picornavirus core protein 2A. This protein is a protease, involved in cleavage of the polyprotein.¡€0€ª€0€ €CDD¡€ €C¢€0€0€ €‚9pfam00948, Flavi_NS1, Flavivirus non-structural Protein NS1. The NS1 protein is well conserved amongst the flaviviruses. It contains 12 cysteines, and undergoes glycosylation in a similar manner to other NS proteins. Mutational analysis has strongly implied a role for NS1 in the early stages of RNA replication.¡€0€ª€0€ €CDD¡€ €C¢€0€0€ €‚Rpfam00949, Peptidase_S7, Peptidase S7, Flavivirus NS3 serine protease. The viral genome is a positive strand RNA that encodes a single polyprotein precursor. Processing of the polyprotein precursor into mature proteins is carried out by the host signal peptidase and by NS3 serine protease, which requires NS2B (pfam01002) as a cofactor.¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €+pfam00950, ABC-3, ABC 3 transport family. ¡€0€ª€0€ €CDD¡€ €C¢€0€0€ €‚Upfam00951, Arteri_Gl, Arterivirus GL envelope glycoprotein. Arteriviruses encode 4 envelope proteins, Gl, Gs, M and N. Gl envelope protein, is encoded in ORF5, and is 30- 45 kDa in size. Gl is heterogenously glycosylated with N-acetyllactosamine in a cell-type-specific manner. The Gl glycoprotein expresses the neutralisation determinants.¡€0€ª€0€ €CDD¡€ €­¢¢€0€0€ €‚pfam00952, Bunya_nucleocap, Bunyavirus nucleocapsid (N) protein. The bunyaviruses are enveloped viruses with a genome consisting of 3 ssRNA segments (called L, M and S). The nucleocapsid protein is encode on the small (S) genomic RNA. The N protein is the major component of the nucleocapsids. This protein is thought to interact with the L protein, virus RNA and/or other N proteins.¡€0€ª€0€ €CDD¡€ €C¢€0€0€ €pfam00970, FAD_binding_6, Oxidoreductase FAD-binding domain. ¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €‚;pfam00971, EIAV_GP90, EIAV coat protein, gp90. Equine infectious anaemia (EIAV). EIAV belongs to the family Retroviridae. EIAV gp90 is hypervariable in the carboxyl-end region and more stable in the amino-end region. This variability is a pathogenicity factor that allows the evasion of the host's immune response.¡€0€ª€0€ €CDD¡€ €Ñ™¢€0€0€ €‚pfam00972, Flavi_NS5, Flavivirus RNA-directed RNA polymerase. Flaviviruses produce a polyprotein from the ssRNA genome. This protein is also known as NS5. This RNA-directed RNA polymerase possesses a number of short regions and motifs homologous to other RNA-directed RNA polymerases.¡€0€ª€0€ €CDD¡€ €C(¢€0€0€ €‚Žpfam00973, Paramyxo_ncap, Paramyxovirus nucleocapsid protein. The nucleocapsid protein is referred to as NP. NP is is the major structural component of the nucleocapsid. The protein is approx. 58 kDa. 2600 NP molecules go to tightly encapsidate the RNA. NP interacts with several other viral encoded proteins, all of which are involved in controlling replication. {NP-NP, NP-P, NP-(PL), and NP-V}.¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €‚kpfam00974, Rhabdo_glycop, Rhabdovirus spike glycoprotein. Frequently abbreviated to G protein. The glycoprotein spike is made up of a trimer of G proteins. Channel formed by glycoprotein spike is thought to function in a similar manner to Influenza virus M2 protein channel, thus allowing a signal to pass across the viral membrane to signal for viral uncoating.¡€0€ª€0€ €CDD¡€ €C)¢€0€0€ €‚wpfam00975, Thioesterase, Thioesterase domain. Peptide synthetases are involved in the non-ribosomal synthesis of peptide antibiotics. Next to the operons encoding these enzymes, in almost all cases, are genes that encode proteins that have similarity to the type II fatty acid thioesterases of vertebrates. There are also modules within the peptide synthetases that also share this similarity. With respect to antibiotic production, thioesterases are required for the addition of the last amino acid to the peptide antibiotic, thereby forming a cyclic antibiotic. Thioesterases (non-integrated) have molecular masses of 25-29 kDa.¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €4pfam00976, ACTH_domain, Corticotropin ACTH domain. ¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €‚Spfam00977, His_biosynth, Histidine biosynthesis protein. Proteins involved in steps 4 and 6 of the histidine biosynthesis pathway are contained in this family. Histidine is formed by several complex and distinct biochemical reactions catalyzed by eight enzymes. The enzymes in this Pfam entry are called His6 and His7 in eukaryotes and HisA and HisF in prokaryotes. The structure of HisA is known to be a TIM barrel fold. In some archaeal HisA proteins the TIM barrel is composed of two tandem repeats of a half barrel. This family belong to the common phosphate binding site TIM barrel family.¡€0€ª€0€ €CDD¡€ €C,¢€0€0€ €‚ pfam00978, RdRP_2, RNA dependent RNA polymerase. This family may represent an RNA dependent RNA polymerase. The family also contains the following proteins: 2A protein from bromoviruses putative RNA dependent RNA polymerase from tobamoviruses Non structural polyprotein from togaviruses.¡€0€ª€0€ €CDD¡€ €Ñž¢€0€0€ €‚Úpfam00979, Reovirus_cap, Reovirus outer capsid protein, Sigma 3. Sigma 3 is the major outer capsid protein of reovirus. Sigma 3 is encoded by genome segment 4. Sigma 3 binds to double stranded RNA and associates with polypeptide u1 and its cleavage product u1C to form the outer shell of the virion. The Sigma 3 protein possesses a zinc-finger motif and an RNA-binding domain in the N and C termini respectively. This protein is also thought to play a role in pathogenesis.¡€0€ª€0€ €CDD¡€ €4™¢€0€0€ €‚Spfam00980, Rota_Capsid_VP6, Rotavirus major capsid protein VP6. Rotaviruses consist of three concentric protein shells. The intermediate (middle) protein layer consists 260 trimers of VP6. VP6 in the most abundant protein in the virion. VP6 is also involved in virion assembly, and possesses the ability to interact with VP2, VP4 and VP7.¡€0€ª€0€ €CDD¡€ €C-¢€0€0€ €‚wpfam00981, Rota_NS53, Rotavirus RNA-binding Protein 53 (NS53). This protein is also known as NSP1. NS53 is encoded by gene 5. It is made in low levels in the infected cells and is a component of early replication. The protein is known to accumulate on the cytoskeleton of the infected cell. NS53 is an RNA binding protein that contains a characteristic cysteine rich region.¡€0€ª€0€ €CDD¡€ €4š¢€0€0€ €‚pfam00982, Glyco_transf_20, Glycosyltransferase family 20. Members of this family belong to glycosyl transferase family 20. OtsA (Trehalose-6-phosphate synthase) is homologous to regions in the subunits of yeast trehalose-6-phosphate synthase/phosphate complex,.¡€0€ª€0€ €CDD¡€ €C.¢€0€0€ €/pfam00983, Tymo_coat, Tymovirus coat protein. ¡€0€ª€0€ €CDD¡€ €C/¢€0€0€ €‚6pfam00984, UDPG_MGDP_dh, UDP-glucose/GDP-mannose dehydrogenase family, central domain. The UDP-glucose/GDP-mannose dehydrogenaseses are a small group of enzymes which possesses the ability to catalyze the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate.¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €?pfam00985, MSA_2, Merozoite Surface Antigen 2 (MSA-2) family. ¡€0€ª€0€ €CDD¡€ €4¢€0€0€ €‚-pfam00986, DNA_gyraseB_C, DNA gyrase B subunit, carboxyl terminus. The amino terminus of eukaryotic and prokaryotic DNA topoisomerase II are similar, but they have a different carboxyl terminus. The amino-terminal portion of the DNA gyrase B protein is thought to catalyze the ATP-dependent super-coiling of DNA. See pfam00204. The carboxyl-terminal end supports the complexation with the DNA gyrase A protein and the ATP-independent relaxation. This family also contains Topoisomerase IV. This is a bacterial enzyme that is closely related to DNA gyrase,.¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €‚Úpfam00988, CPSase_sm_chain, Carbamoyl-phosphate synthase small chain, CPSase domain. The carbamoyl-phosphate synthase domain is in the amino terminus of protein. Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesize carbamoyl phosphate. See pfam00289. The small chain has a GATase domain in the carboxyl terminus. See pfam00117.¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €½pfam00989, PAS, PAS fold. The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs. The PAS fold appears in archaea, eubacteria and eukarya.¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €‚pfam00990, GGDEF, Diguanylate cyclase, GGDEF domain. This domain is found linked to a wide range of non-homologous domains in a variety of bacteria. It has been shown to be homologous to the adenylyl cyclase catalytic domain and has diguanylate cyclase activity. This observation correlates with the functional information available on two GGDEF-containing proteins, namely diguanylate cyclase and phosphodiesterase A of Acetobacter xylinum, both of which regulate the turnover of cyclic diguanosine monophosphate. In the WspR protein of Pseudomonas aeruginosa, the GGDEF domain acts as a diguanylate cyclase, Structure 3bre, when the whole molecule appears to form a tetramer consisting of two symmetrically-related dimers representing a biological unit. The active site is the GGD/EF motif, buried in the structure, and the cyclic dimeric guanosine monophosphate (c-di-GMP) bind to the inhibitory-motif RxxD on the surface. The enzyme thus catalyzes the cyclisation of two guanosine triphosphate (GTP) molecules to one c-di-GMP molecule.¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €‚×pfam00992, Troponin, Troponin. Troponin (Tn) contains three subunits, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). this Pfam contains members of the TnT subunit. Troponin is a complex of three proteins, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). The troponin complex regulates Ca++ induced muscle contraction. This family includes troponin T and troponin I. Troponin I binds to actin and troponin T binds to tropomyosin.¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €Mpfam00993, MHC_II_alpha, Class II histocompatibility antigen, alpha domain. ¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €‚bpfam00994, MoCF_biosynth, Probable molybdopterin binding domain. This domain is found a variety of proteins involved in biosynthesis of molybdopterin cofactor. The domain is presumed to bind molybdopterin. The structure of this domain is known, and it forms an alpha/beta structure. In the known structure of Gephyrin this domain mediates trimerisation.¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €pfam00995, Sec1, Sec1 family. ¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €-pfam00996, GDI, GDP dissociation inhibitor. ¡€0€ª€0€ €CDD¡€ €° ¢€0€0€ €‚Ípfam00997, Casein_kappa, Kappa casein. Kappa-casein is a mammalian milk protein involved in a number of important physiological processes. In the gut, the ingested protein is split into an insoluble peptide (para kappa-casein) and a soluble hydrophilic glycopeptide (caseinomacropeptide). Caseinomacropeptide is responsible for increased efficiency of digestion, prevention of neonate hypersensitivity to ingested proteins, and inhibition of gastric pathogens.¡€0€ª€0€ €CDD¡€ €°!¢€0€0€ €©pfam00998, RdRP_3, Viral RNA dependent RNA polymerase. This family includes viral RNA dependent RNA polymerase enzymes from hepatitis C virus and various plant viruses.¡€0€ª€0€ €CDD¡€ €°"¢€0€0€ €‚|pfam00999, Na_H_Exchanger, Sodium/hydrogen exchanger family. Na/H antiporters are key transporters in maintaining the pH of actively metabolising cells. The molecular mechanisms of antiport are unclear. These antiporters contain 10-12 transmembrane regions (M) at the amino-terminus and a large cytoplasmic region at the carboxyl terminus. The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved. Thus, this is thought to be the region that is involved in the transport of sodium and hydrogen ions. The cytoplasmic region has little similarity throughout the family.¡€0€ª€0€ €CDD¡€ €C;¢€0€0€ €Üpfam01000, RNA_pol_A_bac, RNA polymerase Rpb3/RpoA insert domain. Members of this family include: alpha subunit from eubacteria alpha subunits from chloroplasts Rpb3 subunits from eukaryotes RpoD subunits from archaeal.¡€0€ª€0€ €CDD¡€ €°#¢€0€0€ €õpfam01001, HCV_NS4b, Hepatitis C virus non-structural protein NS4b. No precise function has been assigned to NS4b. However, it is known that NS4b interacts with NS4a and NS3 to form a large replicase complex to direct the viral RNA replication.¡€0€ª€0€ €CDD¡€ €­Ð¢€0€0€ €îpfam01002, Flavi_NS2B, Flavivirus non-structural protein NS2B. Flaviviruses encode a single polyprotein. This is cleaved into three structural and seven non-structural proteins. All, but two, are cleaved by the NS2B-NS3 protease complex.¡€0€ª€0€ €CDD¡€ €C=¢€0€0€ €òpfam01003, Flavi_capsid, Flavivirus capsid protein C. Flaviviruses are small enveloped viruses with virions comprised of 3 proteins called C, M and E. Multiple copies of the C protein form the nucleocapsid, which contains the ssRNA molecule.¡€0€ª€0€ €CDD¡€ €°$¢€0€0€ €‚Öpfam01004, Flavi_M, Flavivirus envelope glycoprotein M. Flaviviruses are small enveloped viruses with virions comprised of 3 proteins called C, M and E. The envelope glycoprotein M is made as a precursor, called prM. The precursor portion of the protein is the signal peptide for the proteins entry into the membrane. prM is cleaved to form M in a late-stage cleavage event. Associated with this cleavage is a change in the infectivity and fusion activity of the virus.¡€0€ª€0€ €CDD¡€ €°%¢€0€0€ €‚’pfam01005, Flavi_NS2A, Flavivirus non-structural protein NS2A. NS2A is a hydrophobic protein about 25 kDa is size. NS2A is cleaved from NS1 by a membrane bound host protease. NS2A has been found to associate with the dsRNA within the vesicle packages. It has also been found that NS2A associates with the known replicase components and so NS2A has been postulated to be part of this replicase complex.¡€0€ª€0€ €CDD¡€ €C?¢€0€0€ €‚)pfam01006, HCV_NS4a, Hepatitis C virus non-structural protein NS4a. NS4a forms an integral part of the NS3 serine protease, as it is required in a number of cases as a cofactor of cleavage. It has also been reported that NS4a interacts with NS4b and NS3 to form a multi-subunit replicase complex.¡€0€ª€0€ €CDD¡€ €°&¢€0€0€ €5pfam01007, IRK, Inward rectifier potassium channel. ¡€0€ª€0€ €CDD¡€ €°'¢€0€0€ €‚špfam01008, IF-2B, Initiation factor 2 subunit family. This family includes initiation factor 2B alpha, beta and delta subunits from eukaryotes, initiation factor 2B subunits 1 and 2 from archaebacteria and some proteins of unknown function from prokaryotes. Initiation factor 2 binds to Met-tRNA, GTP and the small ribosomal subunit. Members of this family have also been characterized as 5-methylthioribose- 1-phosphate isomerases, an enzyme of the methionine salvage pathway. The crystal structure of Ypr118w, a non-essential, low-copy number gene product from Saccharomyces cerevisiae, reveals a dimeric protein with two domains and a putative active site cleft.¡€0€ª€0€ €CDD¡€ €CB¢€0€0€ €‚0pfam01010, Proton_antipo_C, NADH-dehyrogenase subunit F, TMs, (complex I) C-terminus. This sub-family represents a carboxyl terminal extension of pfam00361. It includes subunit 5 from chloroplasts, and bacterial subunit L. This sub-family is part of complex I which catalyzes the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane. This family is largely a few TM regions of the F subunit of NADH-Ubiquinone oxidoreductase from plants. The TMs form part of the anti-porter subunit.¡€0€ª€0€ €CDD¡€ €CC¢€0€0€ €Ípfam01011, PQQ, PQQ enzyme repeat. The family represent a single repeat of a beta propeller. This propeller has been found in several enzymes which utilize pyrrolo-quinoline quinone as a prosthetic group.¡€0€ª€0€ €CDD¡€ €°(¢€0€0€ €´pfam01012, ETF, Electron transfer flavoprotein domain. This family includes the homologous domain shared between the alpha and beta subunits of the electron transfer flavoprotein.¡€0€ª€0€ €CDD¡€ €°)¢€0€0€ €pfam01014, Uricase, Uricase. ¡€0€ª€0€ €CDD¡€ €°*¢€0€0€ €3pfam01015, Ribosomal_S3Ae, Ribosomal S3Ae family. ¡€0€ª€0€ €CDD¡€ €°+¢€0€0€ €2pfam01016, Ribosomal_L27, Ribosomal L27 protein. ¡€0€ª€0€ €CDD¡€ €°,¢€0€0€ €‚Fpfam01017, STAT_alpha, STAT protein, all-alpha domain. STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. STAT proteins also include an SH2 domain pfam00017.¡€0€ª€0€ €CDD¡€ €°-¢€0€0€ €‚ pfam01018, GTP1_OBG, GTP1/OBG. The N-terminal domain of the GTPase OBG has the OBG fold, which is formed by three glycine-rich regions inserted into a small 8-stranded beta-sandwich these regions form six left-handed collagen-like helices packed and H-bonded together.¡€0€ª€0€ €CDD¡€ €°.¢€0€0€ €;pfam01019, G_glu_transpept, Gamma-glutamyltranspeptidase. ¡€0€ª€0€ €CDD¡€ €°/¢€0€0€ €špfam01020, Ribosomal_L40e, Ribosomal L40e family. Bovine L40 has been identified as a secondary RNA binding protein. L40 is fused to a ubiquitin protein.¡€0€ª€0€ €CDD¡€ €°0¢€0€0€ €‚pfam01021, TYA, TYA transposon protein. Ty are yeast transposons. A 5.7kb transcript codes for p3 a fusion protein of TYA and TYB. The TYA protein is analogous to the gag protein of retroviruses. TYA a is cleaved to form 46kd protein which can form mature virion like particles.¡€0€ª€0€ €CDD¡€ €CM¢€0€0€ €Ûpfam01022, HTH_5, Bacterial regulatory protein, arsR family. Members of this family contains a DNA binding 'helix-turn-helix' motif. This family includes other proteins which are not included in the Prosite definition.¡€0€ª€0€ €CDD¡€ €CN¢€0€0€ €„pfam01023, S_100, S-100/ICaBP type calcium binding domain. The S-100 domain is a subfamily of the EF-hand calcium binding proteins.¡€0€ª€0€ €CDD¡€ €°1¢€0€0€ €2pfam01024, Colicin, Colicin pore forming domain. ¡€0€ª€0€ €CDD¡€ €°2¢€0€0€ €pfam01025, GrpE, GrpE. ¡€0€ª€0€ €CDD¡€ €°3¢€0€0€ €Ëpfam01026, TatD_DNase, TatD related DNase. This family of proteins are related to a large superfamily of metalloenzymes. TatD, a member of this family has been shown experimentally to be a DNase enzyme.¡€0€ª€0€ €CDD¡€ €°4¢€0€0€ €‚ pfam01027, Bax1-I, Inhibitor of apoptosis-promoting Bax1. Programmed cell-death involves a set of Bcl-2 family proteins, some of which inhibit apoptosis (Bcl-2 and Bcl-XL) and some of which promote it (Bax and Bak). Human Bax inhibitor, BI-1, is an evolutionarily conserved integral membrane protein containing multiple membrane-spanning segments predominantly localized to intracellular membranes. It has 6-7 membrane-spanning domains. The C termini of the mammalian BI-1 proteins are comprised of basic amino acids resembling some nuclear targeting sequences, but otherwise the predicted proteins lack motifs that suggest a function. As plant BI-1 appears to localize predominantly to the ER, we hypothesized that plant BI-1 could also regulate cell death triggered by ER stress. BI-1 appears to exert its effect through an interaction with calmodulin. The budding yeast member of this family has been found unexpectedly to encode a BH3 domain-containing protein (Ybh3p) that regulates the mitochondrial pathway of apoptosis in a phylogenetically conserved manner. Examination of the crystal structure of a bacterial member of this family shows that these proteins mediate a calcium leak across the membrane that is pH-dependent. Calcium homoeostasis balances passive calcium leak with active calcium uptake. The structure exists in a pore-closed and pore-open conformation, at pHs of 8 and 6 respectively, and the pore can be opened by intracrystalline transition; together these findings suggest that pH controls the conformational transition.¡€0€ª€0€ €CDD¡€ €°5¢€0€0€ €‚pfam01028, Topoisom_I, Eukaryotic DNA topoisomerase I, catalytic core. Topoisomerase I promotes the relaxation of DNA superhelical tension by introducing a transient single-stranded break in duplex DNA and are vital for the processes of replication, transcription, and recombination.¡€0€ª€0€ €CDD¡€ €°6¢€0€0€ €†pfam01029, NusB, NusB family. The NusB protein is involved in the regulation of rRNA biosynthesis by transcriptional antitermination.¡€0€ª€0€ €CDD¡€ €°7¢€0€0€ €‚pfam01030, Recep_L_domain, Receptor L domain. The L domains from these receptors make up the bilobal ligand binding site. Each L domain consists of a single-stranded right hand beta-helix. This Pfam entry is missing the first 50 amino acid residues of the domain.¡€0€ª€0€ €CDD¡€ €°8¢€0€0€ €¡pfam01031, Dynamin_M, Dynamin central region. This region lies between the GTPase domain, see pfam00350, and the pleckstrin homology (PH) domain, see pfam00169.¡€0€ª€0€ €CDD¡€ €°9¢€0€0€ €×pfam01032, FecCD, FecCD transport family. This is a sub-family of bacterial binding protein-dependent transport systems family. This Pfam entry contains the inner components of this multicomponent transport system.¡€0€ª€0€ €CDD¡€ €°:¢€0€0€ €1pfam01033, Somatomedin_B, Somatomedin B domain. ¡€0€ª€0€ €CDD¡€ €°;¢€0€0€ €»pfam01034, Syndecan, Syndecan domain. Syndecans are transmembrane heparin sulfate proteoglycans which are implicated in the binding of extracellular matrix components and growth factors.¡€0€ª€0€ €CDD¡€ €°<¢€0€0€ €zpfam01035, DNA_binding_1, 6-O-methylguanine DNA methyltransferase, DNA binding domain. This domain is a 3 helical bundle.¡€0€ª€0€ €CDD¡€ €°=¢€0€0€ €‚pfam01036, Bac_rhodopsin, Bacteriorhodopsin-like protein. The bacterial opsins are retinal-binding proteins that provide light- dependent ion transport and sensory functions to a family of halophilic bacteria. They are integral membrane proteins believed to contain seven transmembrane (TM) domains, the last of which contains the attachment point for retinal (a conserved lysine). This family also includes distantly related proteins that do not contain the retinal binding lysine and so cannot function as opsins.¡€0€ª€0€ €CDD¡€ €°>¢€0€0€ €‚¬pfam01037, AsnC_trans_reg, Lrp/AsnC ligand binding domain. The l-leucine-responsive regulatory protein (Lrp/AsnC) family is a family of similar bacterial transcription regulatory proteins. The family is named after two E. coli proteins involved in regulating amino acid metabolism. This entry corresponds to the usually C-terminal regulatory ligand binding domain. Structurally this domain has a dimeric alpha/beta barrel fold.¡€0€ª€0€ €CDD¡€ €°?¢€0€0€ €‚ápfam01039, Carboxyl_trans, Carboxyl transferase domain. All of the members in this family are biotin dependent carboxylases. The carboxyl transferase domain carries out the following reaction; transcarboxylation from biotin to an acceptor molecule. There are two recognized types of carboxyl transferase. One of them uses acyl-CoA and the other uses 2-oxoacid as the acceptor molecule of carbon dioxide. All of the members in this family utilize acyl-CoA as the acceptor molecule.¡€0€ª€0€ €CDD¡€ €°@¢€0€0€ €1pfam01040, UbiA, UbiA prenyltransferase family. ¡€0€ª€0€ €CDD¡€ €°A¢€0€0€ €‚ïpfam01041, DegT_DnrJ_EryC1, DegT/DnrJ/EryC1/StrS aminotransferase family. The members of this family are probably all pyridoxal-phosphate-dependent aminotransferase enzymes with a variety of molecular functions. The family includes StsA, StsC and StsS. The aminotransferase activity was demonstrated for purified StsC protein as the L-glutamine:scyllo-inosose aminotransferase EC:2.6.1.50, which catalyzes the first amino transfer in the biosynthesis of the streptidine subunit of streptomycin.¡€0€ª€0€ €CDD¡€ €°B¢€0€0€ €‚ pfam01042, Ribonuc_L-PSP, Endoribonuclease L-PSP. Endoribonuclease active on single-stranded mRNA. Inhibits protein synthesis by cleavage of mRNA. Previously thought to inhibit protein synthesis initiation. This protein may also be involved in the regulation of purine biosynthesis. YjgF (renamed RidA) family members are enamine/imine deaminases. They hydrolyze reactive intermediates released by PLP-dependent enzymes, including threonine dehydratase. YjgF also prevents inhibition of transaminase B (IlvE) in Salmonella.¡€0€ª€0€ €CDD¡€ €Ca¢€0€0€ €‚ƒpfam01043, SecA_PP_bind, SecA preprotein cross-linking domain. The SecA ATPase is involved in the insertion and retraction of preproteins through the plasma membrane. This domain has been found to cross-link to preproteins, thought to indicate a role in preprotein binding. The pre-protein cross-linking domain is comprised of two sub domains that are inserted within the ATPase domain.¡€0€ª€0€ €CDD¡€ €°C¢€0€0€ €'pfam01044, Vinculin, Vinculin family. ¡€0€ª€0€ €CDD¡€ €Cc¢€0€0€ €‚?pfam01047, MarR, MarR family. The Mar proteins are involved in the multiple antibiotic resistance, a non-specific resistance system. The expression of the mar operon is controlled by a repressor, MarR. A large number of compounds induce transcription of the mar operon. This is thought to be due to the compound binding to MarR, and the resulting complex stops MarR binding to the DNA. With the MarR repression lost, transcription of the operon proceeds. The structure of MarR is known and shows MarR as a dimer with each subunit containing a winged-helix DNA binding motif.¡€0€ª€0€ €CDD¡€ €Cd¢€0€0€ €Ñpfam01048, PNP_UDP_1, Phosphorylase superfamily. Members of this family include: purine nucleoside phosphorylase (PNP) Uridine phosphorylase (UdRPase) 5'-methylthioadenosine phosphorylase (MTA phosphorylase).¡€0€ª€0€ €CDD¡€ €°D¢€0€0€ €‚ëpfam01049, Cadherin_C, Cadherin cytoplasmic region. Cadherins are vital in cell-cell adhesion during tissue differentiation. Cadherins are linked to the cytoskeleton by catenins. Catenins bind to the cytoplasmic tail of the cadherin. Cadherins cluster to form foci of homophilic binding units. A key determinant to the strength of the binding that it is mediated by cadherins is the juxtamembrane region of the cadherin. This region induces clustering and also binds to the protein p120ctn.¡€0€ª€0€ €CDD¡€ €°E¢€0€0€ €‚lpfam01050, MannoseP_isomer, Mannose-6-phosphate isomerase. All of the members of this Pfam entry belong to family 2 of the mannose-6-phosphate isomerases. The type II phosphomannose isomerases are bifunctional enzymes. This Pfam entry covers the isomerase domain. The guanosine diphospho-D-mannose pyrophosphorylase domain is in another Pfam entry, see pfam00483.¡€0€ª€0€ €CDD¡€ €°F¢€0€0€ €‚špfam01051, Rep_3, Initiator Replication protein. This protein is an initiator of plasmid replication. RepB possesses nicking-closing (topoisomerase I) like activity. It is also able to perform a strand transfer reaction on ssDNA that contains its target. This family also includes RepA which is an E.coli protein involved in plasmid replication. The RepA protein binds to DNA repeats that flank the repA gene.¡€0€ª€0€ €CDD¡€ €°G¢€0€0€ €õpfam01052, FliMN_C, Type III flagellar switch regulator (C-ring) FliN C-term. This family includes the C-terminal region of flagellar motor switch proteins FliN and FliM. It is associated with family FliM, pfam02154 and family FliN_N pfam16973.¡€0€ª€0€ €CDD¡€ €°H¢€0€0€ €‚Öpfam01053, Cys_Met_Meta_PP, Cys/Met metabolism PLP-dependent enzyme. This family includes enzymes involved in cysteine and methionine metabolism. The following are members: Cystathionine gamma-lyase, Cystathionine gamma-synthase, Cystathionine beta-lyase, Methionine gamma-lyase, OAH/OAS sulfhydrylase, O-succinylhomoserine sulfhydrylase All of these members participate is slightly different reactions. All these enzymes use PLP (pyridoxal-5'-phosphate) as a cofactor.¡€0€ª€0€ €CDD¡€ €°I¢€0€0€ €épfam01054, MMTV_SAg, Mouse mammary tumor virus superantigen. The mouse mammary tumor virus (MMTV) is a milk-transmitted type B retrovirus. The superantigen (SAg) is encoded by the long terminal repeat. The SAgs are also called PR73.¡€0€ª€0€ €CDD¡€ €Ck¢€0€0€ €Épfam01055, Glyco_hydro_31, Glycosyl hydrolases family 31. Glycosyl hydrolases are key enzymes of carbohydrate metabolism. Family 31 comprises of enzymes that are, or similar to, alpha- galactosidases.¡€0€ª€0€ €CDD¡€ €°J¢€0€0€ €‚´pfam01056, Myc_N, Myc amino-terminal region. The myc family belongs to the basic helix-loop-helix leucine zipper class of transcription factors, see pfam00010. Myc forms a heterodimer with Max, and this complex regulates cell growth through direct activation of genes involved in cell replication. Mutations in the C-terminal 20 residues of this domain cause unique changes in the induction of apoptosis, transformation, and G2 arrest.¡€0€ª€0€ €CDD¡€ €°K¢€0€0€ €‚4pfam01057, Parvo_NS1, Parvovirus non-structural protein NS1. This family also contains the NS2 protein. Parvoviruses encode two non-structural proteins, NS1 and NS2. The mRNA for NS2 contains the coding sequence for the first 87 amino acids of NS1, then by an alternative splicing mechanism mRNA from a different reading frame, encoding the last 78 amino acids, makes up the full length of the NS2 mRNA. NS1, is the major non-structural protein. It is essential for DNA replication. It is an 83-kDa nuclear phosphoprotein. It has DNA helicase and ATPase activity.¡€0€ª€0€ €CDD¡€ €°L¢€0€0€ €Hpfam01058, Oxidored_q6, NADH ubiquinone oxidoreductase, 20 Kd subunit. ¡€0€ª€0€ €CDD¡€ €°M¢€0€0€ €Spfam01059, Oxidored_q5_N, NADH-ubiquinone oxidoreductase chain 4, amino terminus. ¡€0€ª€0€ €CDD¡€ €Cp¢€0€0€ €‚ûpfam01060, TTR-52, Transthyretin-like family. TTR-52 was called family 2 in, and has weak similarity to transthyretin (formerly called pre-albumin) which transports thyroid hormones. The specific function of this protein is as a bridging molecule in apoptosis cross-linking dying cells to phagocytes. TTR-52 bridges by cross-linking surface-exposed phosphatidylserine (PtdSer) on apoptotic cells to the CED-1 receptor, a transmembrane receptor, on phagocytes. TTR-52 has an open beta-barrel-like structure.¡€0€ª€0€ €CDD¡€ €°N¢€0€0€ €3pfam01061, ABC2_membrane, ABC-2 type transporter. ¡€0€ª€0€ €CDD¡€ €°O¢€0€0€ €‚3pfam01062, Bestrophin, Bestrophin, RFP-TM, chloride channel. Bestrophin is a 68-kDa basolateral plasma membrane protein expressed in retinal pigment epithelial cells (RPE). It is encoded by the VMD2 gene, which is mutated in Best macular dystrophy, a disease characterized by a depressed light peak in the electrooculogram. VMD2 encodes a 585-amino acid protein with an approximate mass of 68 kDa which has been designated bestrophin. Bestrophin shares homology with the Caenorhabditis elegans RFP gene family, named for the presence of a conserved arginine (R), phenylalanine (F), proline (P), amino acid sequence motif. Bestrophin is a plasma membrane protein, localized to the basolateral surface of RPE cells consistent with a role for bestrophin in the generation or regulation of the EOG light peak. Bestrophin and other RFP family members represent a new class of chloride channels, indicating a direct role for bestrophin in generating the light peak. The VMD2 gene underlying Best disease was shown to represent the first human member of the RFP-TM protein family. More than 97% of the disease-causing mutations are located in the N-terminal RFP-TM domain implying important functional properties. The bestrophins are four-pass transmembrane chloride-channel proteins, and the RFP-TM or bestrophin domain extends from the N-terminus through approximately 350 amino acids and contains all of the TM domains as well as nearly all reported disease causing mutations. Interestingly, the RFP motif is not conserved evolutionarily back beyond Metazoa, neither is it in plant members.¡€0€ª€0€ €CDD¡€ €°P¢€0€0€ €‚€pfam01063, Aminotran_4, Amino-transferase class IV. The D-amino acid transferases (D-AAT) are required by bacteria to catalyze the synthesis of D-glutamic acid and D-alanine, which are essential constituents of bacterial cell wall and are the building block for other D-amino acids. Despite the difference in the structure of the substrates, D-AATs and L-ATTs have strong similarity.¡€0€ª€0€ €CDD¡€ €°Q¢€0€0€ €‚Œpfam01064, Activin_recp, Activin types I and II receptor domain. This Pfam entry consists of both TGF-beta receptor types. This is an alignment of the hydrophilic cysteine-rich ligand-binding domains, Both receptor types, (type I and II) posses a 9 amino acid cysteine box, with the the consensus CCX{4-5}CN. The type I receptors also possess 7 extracellular residues preceding the cysteine box.¡€0€ª€0€ €CDD¡€ €°R¢€0€0€ €‚Ipfam01065, Adeno_hexon, Hexon, adenovirus major coat protein, N-terminal domain. Hexon is the major coat protein from adenovirus type 2. Hexon forms a homo-trimer. The 240 copies of the hexon trimer are organised so that 12 lie on each of the 20 facets. The central 9 hexons in a facet are cemented together by 12 copies of polypeptide IX. The penton complex, formed by the peripentonal hexons and base hexon (holding in place a fibre), lie at each of the 12 vertices. The N and C-terminal domains adopt the same PNGase F-like fold although they are significantly different in length.¡€0€ª€0€ €CDD¡€ €°S¢€0€0€ €‚pfam01066, CDP-OH_P_transf, CDP-alcohol phosphatidyltransferase. All of these members have the ability to catalyze the displacement of CMP from a CDP-alcohol by a second alcohol with formation of a phosphodiester bond and concomitant breaking of a phosphoride anhydride bond.¡€0€ª€0€ €CDD¡€ €°T¢€0€0€ €‚&pfam01067, Calpain_III, Calpain large subunit, domain III. The function of the domain III and I are currently unknown. Domain II is a cysteine protease and domain IV is a calcium binding domain. Calpains are believed to participate in intracellular signaling pathways mediated by calcium ions.¡€0€ª€0€ €CDD¡€ €°U¢€0€0€ €’pfam01068, DNA_ligase_A_M, ATP dependent DNA ligase domain. This domain belongs to a more diverse superfamily, including pfam01331 and pfam01653.¡€0€ª€0€ €CDD¡€ €Cy¢€0€0€ €1pfam01070, FMN_dh, FMN-dependent dehydrogenase. ¡€0€ª€0€ €CDD¡€ €°V¢€0€0€ €‚Ñpfam01071, GARS_A, Phosphoribosylglycinamide synthetase, ATP-grasp (A) domain. Phosphoribosylglycinamide synthetase catalyzes the second step in the de novo biosynthesis of purine. The reaction catalyzed by Phosphoribosylglycinamide synthetase is the ATP- dependent addition of 5-phosphoribosylamine to glycine to form 5'phosphoribosylglycinamide. This domain is related to the ATP-grasp domain of biotin carboxylase/carbamoyl phosphate synthetase (see pfam02786).¡€0€ª€0€ €CDD¡€ €°W¢€0€0€ €‚pfam01073, 3Beta_HSD, 3-beta hydroxysteroid dehydrogenase/isomerase family. The enzyme 3 beta-hydroxysteroid dehydrogenase/5-ene-4-ene isomerase (3 beta-HSD) catalyzes the oxidation and isomerisation of 5-ene-3 beta-hydroxypregnene and 5-ene-hydroxyandrostene steroid precursors into the corresponding 4-ene-ketosteroids necessary for the formation of all classes of steroid hormones.¡€0€ª€0€ €CDD¡€ €C|¢€0€0€ €Œpfam01074, Glyco_hydro_38, Glycosyl hydrolases family 38 N-terminal domain. Glycosyl hydrolases are key enzymes of carbohydrate metabolism.¡€0€ª€0€ €CDD¡€ €°X¢€0€0€ €‚”pfam01075, Glyco_transf_9, Glycosyltransferase family 9 (heptosyltransferase). Members of this family belong to glycosyltransferase family 9. Lipopolysaccharide is a major component of the outer leaflet of the outer membrane in Gram-negative bacteria. It is composed of three domains; lipid A, Core oligosaccharide and the O-antigen. All of these enzymes transfer heptose to the lipopolysaccharide core.¡€0€ª€0€ €CDD¡€ €°Y¢€0€0€ €‚Fpfam01076, Mob_Pre, Plasmid recombination enzyme. With some plasmids, recombination can occur in a site specific manner that is independent of RecA. In such cases, the recombination event requires another protein called Pre. Pre is a plasmid recombination enzyme. This protein is also known as Mob (conjugative mobilisation).¡€0€ª€0€ €CDD¡€ €°Z¢€0€0€ €‚pfam01077, NIR_SIR, Nitrite and sulphite reductase 4Fe-4S domain. Sulphite and nitrite reductases are vital in the biosynthetic assimilation of sulphur and nitrogen, respectfully. They are also both important for the dissimilation of oxidised anions for energy transduction.¡€0€ª€0€ €CDD¡€ €°[¢€0€0€ €‚ïpfam01078, Mg_chelatase, Magnesium chelatase, subunit ChlI. Magnesium-chelatase is a three-component enzyme that catalyzes the insertion of Mg2+ into protoporphyrin IX. This is the first unique step in the synthesis of (bacterio)chlorophyll. Due to this, it is thought that Mg-chelatase has an important role in channelling inter- mediates into the (bacterio)chlorophyll branch in response to conditions suitable for photosynthetic growth. ChlI and BchD have molecular weight between 38-42 kDa.¡€0€ª€0€ €CDD¡€ €°\¢€0€0€ €¤pfam01079, Hint, Hint module. This is an alignment of the Hint module in the Hedgehog proteins. It does not include any Inteins which also possess the Hint module.¡€0€ª€0€ €CDD¡€ €°]¢€0€0€ €ípfam01080, Presenilin, Presenilin. Mutations in presenilin-1 are a major cause of early onset Alzheimer's disease. It has been found that presenilin-1 binds to beta-catenin in-vivo. This family also contains SPE proteins from C.elegans.¡€0€ª€0€ €CDD¡€ €°^¢€0€0€ €Æpfam01081, Aldolase, KDPG and KHG aldolase. This family includes the following members: 4-hydroxy-2-oxoglutarate aldolase (KHG-aldolase) Phospho-2-dehydro-3-deoxygluconate aldolase (KDPG-aldolase).¡€0€ª€0€ €CDD¡€ €C„¢€0€0€ €ºpfam01082, Cu2_monooxygen, Copper type II ascorbate-dependent monooxygenase, N-terminal domain. The N and C-terminal domains of members of this family adopt the same PNGase F-like fold.¡€0€ª€0€ €CDD¡€ €°_¢€0€0€ € pfam01083, Cutinase, Cutinase. ¡€0€ª€0€ €CDD¡€ €°`¢€0€0€ €2pfam01084, Ribosomal_S18, Ribosomal protein S18. ¡€0€ª€0€ €CDD¡€ €°a¢€0€0€ €Ípfam01085, HH_signal, Hedgehog amino-terminal signalling domain. For the carboxyl Hint module, see pfam01079. Hedgehog is a family of secreted signal molecules required for embryonic cell differentiation.¡€0€ª€0€ €CDD¡€ €°b¢€0€0€ €2pfam01086, Clathrin_lg_ch, Clathrin light chain. ¡€0€ª€0€ €CDD¡€ €°c¢€0€0€ €³pfam01087, GalP_UDP_transf, Galactose-1-phosphate uridyl transferase, N-terminal domain. SCOP reports fold duplication with C-terminal domain. Both involved in Zn and Fe binding.¡€0€ª€0€ €CDD¡€ €°d¢€0€0€ €Lpfam01088, Peptidase_C12, Ubiquitin carboxyl-terminal hydrolase, family 1. ¡€0€ª€0€ €CDD¡€ €°e¢€0€0€ €4pfam01090, Ribosomal_S19e, Ribosomal protein S19e. ¡€0€ª€0€ €CDD¡€ €°f¢€0€0€ €Ppfam01091, PTN_MK_C, PTN/MK heparin-binding protein family, C-terminal domain. ¡€0€ª€0€ €CDD¡€ €°g¢€0€0€ €2pfam01092, Ribosomal_S6e, Ribosomal protein S6e. ¡€0€ª€0€ €CDD¡€ €°h¢€0€0€ €"pfam01093, Clusterin, Clusterin. ¡€0€ª€0€ €CDD¡€ €°i¢€0€0€ €ðpfam01094, ANF_receptor, Receptor family ligand binding region. This family includes extracellular ligand binding domains of a wide range of receptors. This family also includes the bacterial amino acid binding proteins of known structure.¡€0€ª€0€ €CDD¡€ €°j¢€0€0€ €,pfam01095, Pectinesterase, Pectinesterase. ¡€0€ª€0€ €CDD¡€ €°k¢€0€0€ €8pfam01096, TFIIS_C, Transcription factor S-II (TFIIS). ¡€0€ª€0€ €CDD¡€ €°l¢€0€0€ €,pfam01097, Defensin_2, Arthropod defensin. ¡€0€ª€0€ €CDD¡€ €C“¢€0€0€ €npfam01098, FTSW_RODA_SPOVE, Cell cycle protein. This entry includes the following members; FtsW, RodA, SpoVE.¡€0€ª€0€ €CDD¡€ €C”¢€0€0€ €Ñpfam01099, Uteroglobin, Uteroglobin family. Uteroglobin is a homodimer of two identical 70 amino acid polypeptides linked by two disulphide bridges. The precise role of uteroglobin has still to be elucidated.¡€0€ª€0€ €CDD¡€ €°m¢€0€0€ €'pfam01101, HMG14_17, HMG14 and HMG17. ¡€0€ª€0€ €CDD¡€ €°n¢€0€0€ €*pfam01102, Glycophorin_A, Glycophorin A. ¡€0€ª€0€ €CDD¡€ €°o¢€0€0€ €‚spfam01103, Bac_surface_Ag, Surface antigen. This entry includes the following surface antigens; D15 antigen from H.influenzae, OMA87 from P.multocida, OMP85 from N.meningitidis and N.gonorrhoeae. The family also includes a number of eukaryotic proteins that are members of the UPF0140 family. There also appears to be a relationship to pfam03865 (personal obs: C Yeats). In eukaryotes, it appears that these proteins are not surface antigens; S. cerevisiae YNL026W (SAM50) is an essential component of the Sorting and Assembly Machinery (SAM) of the mitochondrial outer membrane. The protein was localized to the mitochondria.¡€0€ª€0€ €CDD¡€ €°p¢€0€0€ €Ôpfam01104, Bunya_NS-S, Bunyavirus non-structural protein NS-s. The NS-s protein is encoded by the S RNA. This segment also encodes for the N protein. These two proteins are encoded by overlapping reading frames.¡€0€ª€0€ €CDD¡€ €C™¢€0€0€ €‚dpfam01105, EMP24_GP25L, emp24/gp25L/p24 family/GOLD. Members of this family are implicated in bringing cargo forward from the ER and binding to coat proteins by their cytoplasmic domains. This domain corresponds closely to the beta-strand rich GOLD domain described in. The GOLD domain is always found combined with lipid- or membrane-association domains.¡€0€ª€0€ €CDD¡€ €°q¢€0€0€ €÷pfam01106, NifU, NifU-like domain. This is an alignment of the carboxy-terminal domain. This is the only common region between the NifU protein from nitrogen-fixing bacteria and rhodobacterial species. The biochemical function of NifU is unknown.¡€0€ª€0€ €CDD¡€ €°r¢€0€0€ €‚ïpfam01107, MP, Viral movement protein (MP). This family includes a variety of movement proteins (MP)s. The MP is necessary for the initial cell-to-cell movement during the early stages of a viral infection. This movement is active, and it is known that the MP interacts with the plasmodesmata and possesses the ability to bind to RNA to achieve its role. This family also includes consists of virus movement proteins from the caulimovirus family. It has been suggested in cauliflower mosaic virus that these proteins mediated viral movement by modifying plasmodesmata and forming tubules in the channel that can accommodate the virus particles and references therein. The family contains a conserved DXR motif that is probably functionally important.¡€0€ª€0€ €CDD¡€ €Cœ¢€0€0€ €‚-pfam01108, Tissue_fac, Tissue factor. This family is found in metazoa, and is very similar to the fibronectin type III domain. The family is found in cytokine receptors, interleukin and interferon receptors and coagulation factor III proteins. It occurs multiple times, as does fn3, family pfam00041.¡€0€ª€0€ €CDD¡€ €°s¢€0€0€ €Fpfam01109, GM_CSF, Granulocyte-macrophage colony-stimulating factor. ¡€0€ª€0€ €CDD¡€ €4ö¢€0€0€ €/pfam01110, CNTF, Ciliary neurotrophic factor. ¡€0€ª€0€ €CDD¡€ €°t¢€0€0€ €=pfam01111, CKS, Cyclin-dependent kinase regulatory subunit. ¡€0€ª€0€ €CDD¡€ €°u¢€0€0€ €*pfam01112, Asparaginase_2, Asparaginase. ¡€0€ª€0€ €CDD¡€ €°v¢€0€0€ €‚™pfam01113, DapB_N, Dihydrodipicolinate reductase, N-terminus. Dihydrodipicolinate reductase (DapB) reduces the alpha,beta-unsaturated cyclic imine, dihydro-dipicolinate. This reaction is the second committed step in the biosynthesis of L-lysine and its precursor meso-diaminopimelate, which are critical for both protein and cell wall biosynthesis. The N-terminal domain of DapB binds the dinucleotide NADPH.¡€0€ª€0€ €CDD¡€ €°w¢€0€0€ €{pfam01114, Colipase, Colipase, N-terminal domain. SCOP reports duplication of common fold with Colipase C-terminal domain.¡€0€ª€0€ €CDD¡€ €C¢¢€0€0€ €Bpfam01115, F_actin_cap_B, F-actin capping protein, beta subunit. ¡€0€ª€0€ €CDD¡€ €°x¢€0€0€ €Dpfam01116, F_bP_aldolase, Fructose-bisphosphate aldolase class-II. ¡€0€ª€0€ €CDD¡€ €°y¢€0€0€ €bpfam01117, Aerolysin, Aerolysin toxin. This family represents the pore forming lobe of aerolysin.¡€0€ª€0€ €CDD¡€ €°z¢€0€0€ €Öpfam01118, Semialdhyde_dh, Semialdehyde dehydrogenase, NAD binding domain. This Pfam entry contains the following members: N-acetyl-glutamine semialdehyde dehydrogenase (AgrC) Aspartate-semialdehyde dehydrogenase.¡€0€ª€0€ €CDD¡€ €°{¢€0€0€ €Épfam01119, DNA_mis_repair, DNA mismatch repair protein, C-terminal domain. This family represents the C-terminal domain of the mutL/hexB/PMS1 family. This domain has a ribosomal S5 domain 2-like fold.¡€0€ª€0€ €CDD¡€ €°|¢€0€0€ €/pfam01120, Alpha_L_fucos, Alpha-L-fucosidase. ¡€0€ª€0€ €CDD¡€ €°}¢€0€0€ €Çpfam01121, CoaE, Dephospho-CoA kinase. This family catalyzes the phosphorylation of the 3'-hydroxyl group of dephosphocoenzyme A to form Coenzyme A EC:2.7.1.24. This enzyme uses ATP in its reaction.¡€0€ª€0€ €CDD¡€ €C©¢€0€0€ €Bpfam01122, Cobalamin_bind, Eukaryotic cobalamin-binding protein. ¡€0€ª€0€ €CDD¡€ €°~¢€0€0€ €Qpfam01123, Stap_Strp_toxin, Staphylococcal/Streptococcal toxin, OB-fold domain. ¡€0€ª€0€ €CDD¡€ €C«¢€0€0€ €‚èpfam01124, MAPEG, MAPEG family. This family is has been called MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism). It includes proteins such as Prostaglandin E synthase. This enzyme catalyzes the synthesis of PGE2 from PGH2 (produced by cyclooxygenase from arachidonic acid). Because of structural similarities in the active sites of FLAP, LTC4 synthase and PGE synthase, substrates for each enzyme can compete with one another and modulate synthetic activity.¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €pfam01125, G10, G10 protein. ¡€0€ª€0€ €CDD¡€ €°€¢€0€0€ €,pfam01126, Heme_oxygenase, Heme oxygenase. ¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €Ëpfam01127, Sdh_cyt, Succinate dehydrogenase/Fumarate reductase transmembrane subunit. This family includes a transmembrane protein from both the Succinate dehydrogenase and Fumarate reductase complexes.¡€0€ª€0€ €CDD¡€ €°‚¢€0€0€ €‚pfam01128, IspD, 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase. Members of this family are enzymes which catalyze the formation of 4-diphosphocytidyl-2-C-methyl-D-erythritol from cytidine triphosphate and 2-C-methyl-D-erythritol 4-phosphate (MEP).¡€0€ª€0€ €CDD¡€ €C°¢€0€0€ €6pfam01129, ART, NAD:arginine ADP-ribosyltransferase. ¡€0€ª€0€ €CDD¡€ €C±¢€0€0€ €Òpfam01130, CD36, CD36 family. The CD36 family is thought to be a novel class of scavenger receptors. There is also evidence suggesting a possible role in signal transduction. CD36 is involved in cell adhesion.¡€0€ª€0€ €CDD¡€ €°ƒ¢€0€0€ €‚#pfam01131, Topoisom_bac, DNA topoisomerase. This subfamily of topoisomerase is divided on the basis that these enzymes preferentially relax negatively supercoiled DNA, from a 5' phospho- tyrosine linkage in the enzyme-DNA covalent intermediate and has high affinity for single stranded DNA.¡€0€ª€0€ €CDD¡€ €°„¢€0€0€ €7pfam01132, EFP, Elongation factor P (EF-P) OB domain. ¡€0€ª€0€ €CDD¡€ €°…¢€0€0€ €Òpfam01133, ER, Enhancer of rudimentary. Enhancer of rudimentary is a protein of unknown function that is highly conserved in plants and animals. This protein is found to be an enhancer of the rudimentary gene.¡€0€ª€0€ €CDD¡€ €°†¢€0€0€ €8pfam01134, GIDA, Glucose inhibited division protein A. ¡€0€ª€0€ €CDD¡€ €Ò¢€0€0€ €Rpfam01135, PCMT, Protein-L-isoaspartate(D-aspartate) O-methyltransferase (PCMT). ¡€0€ª€0€ €CDD¡€ €Ò¢€0€0€ €1pfam01136, Peptidase_U32, Peptidase family U32. ¡€0€ª€0€ €CDD¡€ €°‡¢€0€0€ €‚épfam01137, RTC, RNA 3'-terminal phosphate cyclase. RNA cyclases are a family of RNA-modifying enzymes that are conserved in all cellular organisms. They catalyze the ATP-dependent conversion of the 3'-phosphate to the 2',3'-cyclic phosphodiester at the end of RNA, in a reaction involving formation of the covalent AMP-cyclase intermediate. The structure of RTC demonstrates that RTCs are comprised two domain. The larger domain contains an insert domain of approximately 100 amino acids.¡€0€ª€0€ €CDD¡€ €°ˆ¢€0€0€ €‚¼pfam01138, RNase_PH, 3' exoribonuclease family, domain 1. This family includes 3'-5' exoribonucleases. Ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain. PNPase is involved in mRNA degradation in a 3'-5' direction. The exosome is a 3'-5' exoribonuclease complex that is required for 3' processing of the 5.8S rRNA. Three of its five protein components contain a copy of this domain. A hypothetical protein from S. pombe appears to belong to an uncharacterized subfamily. This subfamily is found in both eukaryotes and archaebacteria.¡€0€ª€0€ €CDD¡€ €°‰¢€0€0€ €øpfam01139, RtcB, tRNA-splicing ligase RtcB. This family of RNA ligases (EC:6.5.1.3) join 2',3'-cyclic phosphate and 5'-OH ends. They catalyze the splicing of tRNA and may also participate in tRNA repair and recovery from stress-induced RNA damage.¡€0€ª€0€ €CDD¡€ €°Š¢€0€0€ €ƒpfam01140, Gag_MA, Matrix protein (MA), p15. The matrix protein, p15, is encoded by the gag gene. MA is involved in pathogenicity.¡€0€ª€0€ €CDD¡€ €°‹¢€0€0€ €‚pfam01141, Gag_p12, Gag polyprotein, inner coat protein p12. The retroviral p12 is a virion structural protein. p12 is proline rich. The function carried out by p12 in assembly and replication is unknown. p12 is associated with pathogenicity of the virus.¡€0€ª€0€ €CDD¡€ €C»¢€0€0€ €ëpfam01142, TruD, tRNA pseudouridine synthase D (TruD). TruD is responsible for synthesis of pseudouridine from uracil-13 in transfer RNAs. The structure of TruD reveals an overall V-shaped molecule which contains an RNA-binding cleft.¡€0€ª€0€ €CDD¡€ €°Œ¢€0€0€ €/pfam01144, CoA_trans, Coenzyme A transferase. ¡€0€ª€0€ €CDD¡€ €C½¢€0€0€ €øpfam01145, Band_7, SPFH domain / Band 7 family. This family has been called SPFH, Band 7 or PHB domain. Recent phylogenetic analysis has shown this domain to be a slipin or Stomatin-like integral membrane domain conserved from protozoa to mammals.¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €‚ÿpfam01146, Caveolin, Caveolin. All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localized and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localization. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumor suppression.¡€0€ª€0€ €CDD¡€ €°Ž¢€0€0€ €Ipfam01147, Crust_neurohorm, Crustacean CHH/MIH/GIH neurohormone family. ¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €‚pfam01148, CTP_transf_1, Cytidylyltransferase family. The members of this family are integral membrane protein cytidylyltransferases. The family includes phosphatidate cytidylyltransferase EC:2.7.7.41 as well as Sec59 from yeast. Sec59 is a dolichol kinase EC:2.7.1.108.¡€0€ª€0€ €CDD¡€ €CÁ¢€0€0€ €‚Kpfam01149, Fapy_DNA_glyco, Formamidopyrimidine-DNA glycosylase N-terminal domain. Formamidopyrimidine-DNA glycosylase (Fpg) is a DNA repair enzyme that excises oxidised purines from damaged DNA. This family is the N-terminal domain contains eight beta-strands, forming a beta-sandwich with two alpha-helices parallel to its edges.¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €Bpfam01150, GDA1_CD39, GDA1/CD39 (nucleoside phosphatase) family. ¡€0€ª€0€ €CDD¡€ €Câ€0€0€ €‚pfam01151, ELO, GNS1/SUR4 family. Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1.¡€0€ª€0€ €CDD¡€ €°‘¢€0€0€ €µpfam01152, Bac_globin, Bacterial-like globin. This family of heme binding proteins are found mainly in bacteria. However they can also be found in some protozoa and plants as well.¡€0€ª€0€ €CDD¡€ €°’¢€0€0€ € pfam01153, Glypican, Glypican. ¡€0€ª€0€ €CDD¡€ €°“¢€0€0€ €Rpfam01154, HMG_CoA_synt_N, Hydroxymethylglutaryl-coenzyme A synthase N terminal. ¡€0€ª€0€ €CDD¡€ €°”¢€0€0€ €‚¨pfam01155, HypA, Hydrogenase/urease nickel incorporation, metallochaperone, hypA. HypA is a metallochaperone that binds nickel to bring it safely to its target. The targets for Hypa are the nickel-containing enzymes [Ni,Fe]-hydrogenase and urease. The nickel coordinates with four nitrogens within the protein. The four conserved cysteines towards the C-terminus bind one zinc moiety probably to stabilize the protein fold.¡€0€ª€0€ €CDD¡€ €°•¢€0€0€ €Kpfam01156, IU_nuc_hydro, Inosine-uridine preferring nucleoside hydrolase. ¡€0€ª€0€ €CDD¡€ €°–¢€0€0€ €4pfam01157, Ribosomal_L21e, Ribosomal protein L21e. ¡€0€ª€0€ €CDD¡€ €°—¢€0€0€ €4pfam01158, Ribosomal_L36e, Ribosomal protein L36e. ¡€0€ª€0€ €CDD¡€ €°˜¢€0€0€ €2pfam01159, Ribosomal_L6e, Ribosomal protein L6e. ¡€0€ª€0€ €CDD¡€ €°™¢€0€0€ €Ipfam01160, Opiods_neuropep, Vertebrate endogenous opioids neuropeptide. ¡€0€ª€0€ €CDD¡€ €°š¢€0€0€ €;pfam01161, PBP, Phosphatidylethanolamine-binding protein. ¡€0€ª€0€ €CDD¡€ €°›¢€0€0€ €‚Npfam01163, RIO1, RIO1 family. This is a family of atypical serine kinases which are found in archaea, bacteria and eukaryotes. Activity of Rio1 is vital in Saccharomyces cerevisiae for the processing of ribosomal RNA, as well as for proper cell cycle progression and chromosome maintenance. The structure of RIO1 has been determined.¡€0€ª€0€ €CDD¡€ €°œ¢€0€0€ €2pfam01165, Ribosomal_S21, Ribosomal protein S21. ¡€0€ª€0€ €CDD¡€ €°¢€0€0€ €*pfam01166, TSC22, TSC-22/dip/bun family. ¡€0€ª€0€ €CDD¡€ €°ž¢€0€0€ €pfam01167, Tub, Tub family. ¡€0€ª€0€ €CDD¡€ €°Ÿ¢€0€0€ €Apfam01168, Ala_racemase_N, Alanine racemase, N-terminal domain. ¡€0€ª€0€ €CDD¡€ €° ¢€0€0€ €‚bpfam01169, UPF0016, Uncharacterized protein family UPF0016. This family contains integral membrane proteins of unknown function. Most members of the family contain two copies of a region that contains an EXGD motif. Each of these regions contains three predicted transmembrane regions. It has been suggested that these proteins are calcium transporters.¡€0€ª€0€ €CDD¡€ €°¡¢€0€0€ €·pfam01170, UPF0020, Putative RNA methylase family UPF0020. This domain is probably a methylase. It is associated with the THUMP domain that also occurs with RNA modification domains.¡€0€ª€0€ €CDD¡€ €CÕ¢€0€0€ €cpfam01171, ATP_bind_3, PP-loop family. This family of proteins belongs to the PP-loop superfamily.¡€0€ª€0€ €CDD¡€ €CÖ¢€0€0€ €‚ªpfam01172, SBDS, Shwachman-Bodian-Diamond syndrome (SBDS) protein. This family is highly conserved in species ranging from archaea to vertebrates and plants. The family contains several Shwachman-Bodian-Diamond syndrome (SBDS) proteins from both mouse and humans. Shwachman-Diamond syndrome is an autosomal recessive disorder with clinical features that include pancreatic exocrine insufficiency, haematological dysfunction and skeletal abnormalities. It is characterized by bone marrow failure and leukemia predisposition. Members of this family play a role in RNA metabolism. In yeast these proteins have been shown to be critical for the release and recycling of the nucleolar shuttling factor Tif6 from pre-60S ribosomes, a key step in 60S maturation and translational activation of ribosomes. This data links defective late 60S subunit maturation to an inherited bone marrow failure syndrome associated with leukemia predisposition.¡€0€ª€0€ €CDD¡€ €°¢¢€0€0€ €‚ipfam01174, SNO, SNO glutamine amidotransferase family. This family and its amidotransferase domain was first described in. It is predicted that members of this family are involved in the pyridoxine biosynthetic pathway, based on the proximity and co-regulation of the corresponding genes and physical interaction between the members of pfam01174 and pfam01680.¡€0€ª€0€ €CDD¡€ €°£¢€0€0€ €"pfam01175, Urocanase, Urocanase. ¡€0€ª€0€ €CDD¡€ €°¤¢€0€0€ €µpfam01176, eIF-1a, Translation initiation factor 1A / IF-1. This family includes both the eukaryotic translation factor eIF-1A and the bacterial translation initiation factor IF-1.¡€0€ª€0€ €CDD¡€ €°¥¢€0€0€ €Ðpfam01177, Asp_Glu_race, Asp/Glu/Hydantoin racemase. This family contains aspartate racemase, maleate isomerases EC:5.2.1.1, glutamate racemase, hydantoin racemase and arylmalonate decarboxylase EC:4.1.1.76.¡€0€ª€0€ €CDD¡€ €°¦¢€0€0€ €‚^pfam01179, Cu_amine_oxid, Copper amine oxidase, enzyme domain. Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). This family corresponds to the catalytic domain of the enzyme.¡€0€ª€0€ €CDD¡€ €°§¢€0€0€ €2pfam01180, DHO_dh, Dihydroorotate dehydrogenase. ¡€0€ª€0€ €CDD¡€ €CÜ¢€0€0€ €[pfam01182, Glucosamine_iso, Glucosamine-6-phosphate isomerases/6-phosphogluconolactonase. ¡€0€ª€0€ €CDD¡€ €°¨¢€0€0€ €;pfam01183, Glyco_hydro_25, Glycosyl hydrolases family 25. ¡€0€ª€0€ €CDD¡€ €CÞ¢€0€0€ €ñpfam01184, Grp1_Fun34_YaaH, GPR1/FUN34/yaaH family. The Ady2 protein is required for acetate in Saccharomyces cerevisiae, and is probably an acetate transporter. A homolog in Yarrowia lipolytica (GPR1) has a role in acetic acid sensitivity.¡€0€ª€0€ €CDD¡€ €°©¢€0€0€ €-pfam01185, Hydrophobin, Fungal hydrophobin. ¡€0€ª€0€ €CDD¡€ €°ª¢€0€0€ €*pfam01186, Lysyl_oxidase, Lysyl oxidase. ¡€0€ª€0€ €CDD¡€ €°«¢€0€0€ €?pfam01187, MIF, Macrophage migration inhibitory factor (MIF). ¡€0€ª€0€ €CDD¡€ €Ò;¢€0€0€ €‚`pfam01189, Methyltr_RsmB-F, 16S rRNA methyltransferase RsmB/F. This is the catalytic core of this SAM-dependent 16S ribosomal methyltransferase RsmB/F enzyme. There is a catalytic cysteine residue at 180 in UniProtKB:Q5SII2, with another highly conserved cysteine at residue 230. It methylates the C(5) position of cytosine 2870 (m5C2870) in 25S rRNA.¡€0€ª€0€ €CDD¡€ €C⢀0€0€ €:pfam01190, Pollen_Ole_e_I, Pollen proteins Ole e I like. ¡€0€ª€0€ €CDD¡€ €°¬¢€0€0€ €¼pfam01191, RNA_pol_Rpb5_C, RNA polymerase Rpb5, C-terminal domain. The assembly domain of Rpb5. The archaeal equivalent to this domain is subunit H. Subunit H lacks the N-terminal domain.¡€0€ª€0€ €CDD¡€ €°­¢€0€0€ €‚pfam01192, RNA_pol_Rpb6, RNA polymerase Rpb6. Rpb6 is an essential subunit in the eukaryotic polymerases Pol I, II and III. This family also contains the bacterial equivalent to Rpb6, the omega subunit. Rpb6 and omega are structurally conserved and both function in polymerase assembly.¡€0€ª€0€ €CDD¡€ €°®¢€0€0€ €‚ìpfam01193, RNA_pol_L, RNA polymerase Rpb3/Rpb11 dimerisation domain. The two eukaryotic subunits Rpb3 and Rpb11 dimerize to from a platform onto which the other subunits of the RNA polymerase assemble (D/L in archaea). The prokaryotic equivalent of the Rpb3/Rpb11 platform is the alpha-alpha dimer. The dimerisation domain of the alpha subunit/Rpb3 is interrupted by an insert domain (pfam01000). Some of the alpha subunits also contain iron-sulphur binding domains (pfam00037). Rpb11 is found as a continuous domain. Members of this family include: alpha subunit from eubacteria, alpha subunits from chloroplasts, Rpb3 subunits from eukaryotes, Rpb11 subunits from eukaryotes, RpoD subunits from archaeal spp, and RpoL subunits from archaeal spp.¡€0€ª€0€ €CDD¡€ €°¯¢€0€0€ €:pfam01194, RNA_pol_N, RNA polymerases N / 8 kDa subunit. ¡€0€ª€0€ €CDD¡€ €°°¢€0€0€ €6pfam01195, Pept_tRNA_hydro, Peptidyl-tRNA hydrolase. ¡€0€ª€0€ €CDD¡€ €°±¢€0€0€ €2pfam01196, Ribosomal_L17, Ribosomal protein L17. ¡€0€ª€0€ €CDD¡€ €°²¢€0€0€ €2pfam01197, Ribosomal_L31, Ribosomal protein L31. ¡€0€ª€0€ €CDD¡€ €°³¢€0€0€ €4pfam01198, Ribosomal_L31e, Ribosomal protein L31e. ¡€0€ª€0€ €CDD¡€ €°´¢€0€0€ €4pfam01199, Ribosomal_L34e, Ribosomal protein L34e. ¡€0€ª€0€ €CDD¡€ €°µ¢€0€0€ €4pfam01200, Ribosomal_S28e, Ribosomal protein S28e. ¡€0€ª€0€ €CDD¡€ €°¶¢€0€0€ €2pfam01201, Ribosomal_S8e, Ribosomal protein S8e. ¡€0€ª€0€ €CDD¡€ €°·¢€0€0€ €#pfam01202, SKI, Shikimate kinase. ¡€0€ª€0€ €CDD¡€ €C0€0€ €Âpfam01203, T2SSN, Type II secretion system (T2SS), protein N. Members of the T2SN family are involved in the Type II protein secretion system. The precise function of these proteins is unknown.¡€0€ª€0€ €CDD¡€ €°¸¢€0€0€ €‚,pfam01204, Trehalase, Trehalase. Trehalase (EC:3.2.1.28) is known to recycle trehalose to glucose. Trehalose is a physiological hallmark of heat-shock response in yeast and protects of proteins and membranes against a variety of stresses. This family is found in conjunction with pfam07492 in fungi.¡€0€ª€0€ €CDD¡€ €°¹¢€0€0€ €=pfam01205, UPF0029, Uncharacterized protein family UPF0029. ¡€0€ª€0€ €CDD¡€ €°º¢€0€0€ €[pfam01206, TusA, Sulfurtransferase TusA. This family includes the TusA sulfurtransferases.¡€0€ª€0€ €CDD¡€ €°»¢€0€0€ €‚Dpfam01207, Dus, Dihydrouridine synthase (Dus). Members of this family catalyze the reduction of the 5,6-double bond of a uridine residue on tRNA. Dihydrouridine modification of tRNA is widely observed in prokaryotes and eukaryotes, and also in some archae. Most dihydrouridines are found in the D loop of t-RNAs. The role of dihydrouridine in tRNA is currently unknown, but may increase conformational flexibility of the tRNA. It is likely that different family members have different substrate specificities, which may overlap. Dus 1 from Saccharomyces cerevisiae acts on pre-tRNA-Phe, while Dus 2 acts on pre-tRNA-Tyr and pre-tRNA-Leu. Dus 1 is active as a single subunit, requiring NADPH or NADH, and is stimulated by the presence of FAD. Some family members may be targeted to the mitochondria and even have a role in mitochondria.¡€0€ª€0€ €CDD¡€ €°¼¢€0€0€ €;pfam01208, URO-D, Uroporphyrinogen decarboxylase (URO-D). ¡€0€ª€0€ €CDD¡€ €°½¢€0€0€ €Apfam01209, Ubie_methyltran, ubiE/COQ5 methyltransferase family. ¡€0€ª€0€ €CDD¡€ €Cö¢€0€0€ €‚"pfam01210, NAD_Gly3P_dh_N, NAD-dependent glycerol-3-phosphate dehydrogenase N-terminus. NAD-dependent glycerol-3-phosphate dehydrogenase (GPDH) catalyzes the interconversion of dihydroxyacetone phosphate and L-glycerol-3-phosphate. This family represents the N-terminal NAD-binding domain.¡€0€ª€0€ €CDD¡€ €°¾¢€0€0€ €5pfam01212, Beta_elim_lyase, Beta-eliminating lyase. ¡€0€ª€0€ €CDD¡€ €°¿¢€0€0€ €Bpfam01213, CAP_N, Adenylate cyclase associated (CAP) N terminal. ¡€0€ª€0€ €CDD¡€ €°À¢€0€0€ €=pfam01214, CK_II_beta, Casein kinase II regulatory subunit. ¡€0€ª€0€ €CDD¡€ €°Á¢€0€0€ €4pfam01215, COX5B, Cytochrome c oxidase subunit Vb. ¡€0€ª€0€ €CDD¡€ €°Â¢€0€0€ €*pfam01216, Calsequestrin, Calsequestrin. ¡€0€ª€0€ €CDD¡€ €°Ã¢€0€0€ €Bpfam01217, Clat_adaptor_s, Clathrin adaptor complex small chain. ¡€0€ª€0€ €CDD¡€ €ÒS¢€0€0€ €=pfam01218, Coprogen_oxidas, Coproporphyrinogen III oxidase. ¡€0€ª€0€ €CDD¡€ €°Ä¢€0€0€ €pfam01302, CAP_GLY, CAP-Gly domain. Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove.¡€0€ª€0€ €CDD¡€ €± ¢€0€0€ €²pfam01303, Egg_lysin, Egg lysin (Sperm-lysin). Egg lysin creates a hole in the envelope of the egg thereby allowing the sperm to pass through the envelope and fuse with the egg.¡€0€ª€0€ €CDD¡€ €± ¢€0€0€ €Fpfam01304, Gas_vesicle_C, Gas vesicles protein GVPc repeated domain. ¡€0€ª€0€ €CDD¡€ €DJ¢€0€0€ €spfam01306, LacY_symp, LacY proton/sugar symporter. This family is closely related to the sugar transporter family.¡€0€ª€0€ €CDD¡€ €®ï¢€0€0€ €èpfam01307, Plant_vir_prot, Plant viral movement protein. This family includes several known plant viral movement proteins from a number of different ssRNA plant virus families including potexviruses, hordeiviruses and carlaviruses.¡€0€ª€0€ €CDD¡€ €DK¢€0€0€ €‚Åpfam01308, Chlam_OMP, Chlamydia major outer membrane protein. The major outer membrane protein of Chlamydia contains four symmetrically spaced variable domains (VDs I to IV). This protein is believed to be an integral part to the pathogenesis, possibly adhesion. Along with the lipopolysaccharide, the major out membrane protein (MOMP) makes up the surface of the elementary body cell. The MOMP is the protein used to determine the different serotypes.¡€0€ª€0€ €CDD¡€ €DL¢€0€0€ €Ôpfam01309, EAV_GS, Equine arteritis virus small envelope glycoprotein. Equine arteritis virus small envelope glycoprotein (Gs) is a class I transmembrane protein which adopts a number of different conformations.¡€0€ª€0€ €CDD¡€ €DM¢€0€0€ €xpfam01310, Adeno_PVIII, Adenovirus hexon associated protein, protein VIII. See pfam01065. This family represents Hexon.¡€0€ª€0€ €CDD¡€ €DN¢€0€0€ €‚opfam01311, Bac_export_1, Bacterial export proteins, family 1. This family includes the following members; FliR, MopE, SsaT, YopT, Hrp, HrcT and SpaR All of these members export proteins, that do not possess signal peptides, through the membrane. Although the proteins that these exporters move may be different, the exporters are thought to function in similar ways.¡€0€ª€0€ €CDD¡€ €± ¢€0€0€ €‚pfam01312, Bac_export_2, FlhB HrpN YscU SpaS Family. This family includes the following members: FlhB, HrpN, YscU, SpaS, HrcU SsaU and YopU. All of these proteins export peptides using the type III secretion system. The peptides exported are quite diverse.¡€0€ª€0€ €CDD¡€ €± ¢€0€0€ €‚ipfam01313, Bac_export_3, Bacterial export proteins, family 3. This family includes the following members; FliQ, MopD, HrcS, Hrp, YopS and SpaQ All of these members export proteins, that do not possess signal peptides, through the membrane. Although the proteins that these exporters move may be different, the exporters are thought to function in similar ways.¡€0€ª€0€ €CDD¡€ €± ¢€0€0€ €‚ pfam01314, AFOR_C, Aldehyde ferredoxin oxidoreductase, domains 2 & 3. Aldehyde ferredoxin oxidoreductase (AOR) catalyzes the reversible oxidation of aldehydes to their corresponding carboxylic acids with their accompanying reduction of the redox protein ferredoxin. This family is composed of two structural domains that bind the tungsten cofactor via DXXGL(C/D) motifs. In addition to maintaining specific binding interactions with the cofactor, another role for domains 2 and 3 may be to regulate substrate access to AOR.¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €^pfam01315, Ald_Xan_dh_C, Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain. ¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €Cpfam01316, Arg_repressor, Arginine repressor, DNA binding domain. ¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €1pfam01318, Bromo_coat, Bromovirus coat protein. ¡€0€ª€0€ €CDD¡€ €DU¢€0€0€ €Ppfam01320, Colicin_Pyocin, Colicin immunity protein / pyocin immunity protein. ¡€0€ª€0€ €CDD¡€ €DV¢€0€0€ €Êpfam01321, Creatinase_N, Creatinase/Prolidase N-terminal domain. This family includes the N-terminal non-catalytic domains from creatinase and prolidase. The exact function of this domain is uncertain.¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €*pfam01322, Cytochrom_C_2, Cytochrome C'. ¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €‚‚pfam01323, DSBA, DSBA-like thioredoxin domain. This family contains a diverse set of proteins with a thioredoxin-like structure pfam00085. This family also includes 2-hydroxychromene-2-carboxylate (HCCA) isomerase enzymes catalyze one step in prokaryotic polyaromatic hydrocarbon (PAH) catabolic pathways. This family also contains members with functions other than HCCA isomerisation, such as Kappa family GSTs, whose similarity to HCCA isomerases was not previously recognized. Some members have been annotated as dioxygenases, dehydrogenases, or putative glycerol-3-phosphate transfer proteins, but are most likely HCCA isomerase enzymes.¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €Çpfam01324, Diphtheria_R, Diphtheria toxin, R domain. C-terminal receptor binding (R) domain - binds to cell surface receptor, permitting the toxin to enter the cell by receptor mediated endocytosis.¡€0€ª€0€ €CDD¡€ €DZ¢€0€0€ €ºpfam01325, Fe_dep_repress, Iron dependent repressor, N-terminal DNA binding domain. This family includes the Diphtheria toxin repressor. DNA binding is through a helix-turn-helix motif.¡€0€ª€0€ €CDD¡€ €D[¢€0€0€ €¹pfam01326, PPDK_N, Pyruvate phosphate dikinase, PEP/pyruvate binding domain. This enzyme catalyzes the reversible conversion of ATP to AMP, pyrophosphate and phosphoenolpyruvate (PEP).¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €6pfam01327, Pep_deformylase, Polypeptide deformylase. ¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €{pfam01328, Peroxidase_2, Peroxidase, family 2. The peroxidases in this family do not have similarity to other peroxidases.¡€0€ª€0€ €CDD¡€ €D^¢€0€0€ €½pfam01329, Pterin_4a, Pterin 4 alpha carbinolamine dehydratase. Pterin 4 alpha carbinolamine dehydratase is also known as DCoH (dimerisation cofactor of hepatocyte nuclear factor 1-alpha).¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €‘pfam01330, RuvA_N, RuvA N terminal domain. The N terminal domain of RuvA has an OB-fold structure. This domain forms the RuvA tetramer contacts.¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €—pfam01331, mRNA_cap_enzyme, mRNA capping enzyme, catalytic domain. This family represents the ATP binding catalytic domain of the mRNA capping enzyme.¡€0€ª€0€ €CDD¡€ €Da¢€0€0€ €mpfam01333, Apocytochr_F_C, Apocytochrome F, C-terminal. This is a sub-family of cytochrome C. See pfam00034.¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €(pfam01335, DED, Death effector domain. ¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €‚fpfam01336, tRNA_anti-codon, OB-fold nucleic acid binding domain. This family contains OB-fold domains that bind to nucleic acids. The family includes the anti-codon binding domain of lysyl, aspartyl, and asparaginyl -tRNA synthetases (see pfam00152). Aminoacyl-tRNA synthetases catalyze the addition of an amino acid to the appropriate tRNA molecule EC:6.1.1.-. This family also includes part of RecG helicase involved in DNA repair. Replication factor A is a hetero-trimeric complex, that contains a subunit in this family. This domain is also found at the C-terminus of bacterial DNA polymerase III alpha chain.¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €2pfam01337, Barstar, Barstar (barnase inhibitor). ¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €:pfam01338, Bac_thur_toxin, Bacillus thuringiensis toxin. ¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €1pfam01339, CheB_methylest, CheB methylesterase. ¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €+pfam01340, MetJ, Met Apo-repressor, MetJ. ¡€0€ª€0€ €CDD¡€ €Dh¢€0€0€ €9pfam01341, Glyco_hydro_6, Glycosyl hydrolases family 6. ¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €‚ypfam01342, SAND, SAND domain. The DNA binding activity of two proteins has been mapped to the SAND domain. The conserved KDWK motif is necessary for DNA binding, and it appears to be important for dimerisation. This region is also found in the putative transcription factor RegA from the multicellular green alga Volvox cateri. This region of RegA is known as the VARL domain.¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €1pfam01343, Peptidase_S49, Peptidase family S49. ¡€0€ª€0€ €CDD¡€ €Dk¢€0€0€ €‚›pfam01344, Kelch_1, Kelch motif. The kelch motif was initially discovered in Kelch. In this protein there are six copies of the motif. It has been shown that the ring canal kelch protein is related to Galactose Oxidase for which a structure has been solved. The kelch motif forms a beta sheet. Several of these sheets associate to form a beta propeller structure as found in pfam00064, pfam00400 and pfam00415.¡€0€ª€0€ €CDD¡€ €Dl¢€0€0€ €‚pfam01345, DUF11, Domain of unknown function DUF11. A domain of unknown function found in multiple copies in several archaebacterial proteins. Conserved N-terminal lysine and C-terminal asparagine with central asp/glu suggests that many of these domain may contain an isopeptide bond.¡€0€ª€0€ €CDD¡€ €± ¢€0€0€ €³pfam01346, FKBP_N, Domain amino terminal to FKBP-type peptidyl-prolyl isomerase. This family is only found at the amino terminus of pfam00254. This domain is of unknown function.¡€0€ª€0€ €CDD¡€ €±!¢€0€0€ €‚Cpfam01347, Vitellogenin_N, Lipoprotein amino terminal region. This family contains regions from: Vitellogenin, Microsomal triglyceride transfer protein and apolipoprotein B-100. These proteins are all involved in lipid transport. This family contains the LV1n chain from lipovitellin, that contains two structural domains.¡€0€ª€0€ €CDD¡€ €Do¢€0€0€ €‚Ûpfam01348, Intron_maturas2, Type II intron maturase. Group II introns use intron-encoded reverse transcriptase, maturase and DNA endonuclease activities for site-specific insertion into DNA. Although this type of intron is self splicing in vitro they require a maturase protein for splicing in vivo. It has been shown that a specific region of the aI2 intron is needed for the maturase function. This region was found to be conserved in group II introns and called domain X.¡€0€ª€0€ €CDD¡€ €Dp¢€0€0€ €‚Épfam01349, Flavi_NS4B, Flavivirus non-structural protein NS4B. Flaviviruses encode a single polyprotein. This is cleaved into three structural and seven non-structural proteins. The NS4B protein is small and poorly conserved among the Flaviviruses. NS4B contains multiple hydrophobic potential membrane spanning regions. NS4B may form membrane components of the viral replication complex and could be involved in membrane localization of NS3 and pfam00972.¡€0€ª€0€ €CDD¡€ €Dq¢€0€0€ €‚}pfam01350, Flavi_NS4A, Flavivirus non-structural protein NS4A. Flaviviruses encode a single polyprotein. This is cleaved into three structural and seven non-structural proteins. The NS4A protein is small and poorly conserved among the Flaviviruses. NS4A contains multiple hydrophobic potential membrane spanning regions. NS4A has only been found in cells infected by Kunjin virus.¡€0€ª€0€ €CDD¡€ €Dr¢€0€0€ €)pfam01351, RNase_HII, Ribonuclease HII. ¡€0€ª€0€ €CDD¡€ €Ds¢€0€0€ €‚…pfam01352, KRAB, KRAB box. The KRAB domain (or Kruppel-associated box) is present in about a third of zinc finger proteins containing C2H2 fingers. The KRAB domain is found to be involved in protein-protein interactions. The KRAB domain is generally encoded by two exons. The regions coded by the two exons are known as KRAB-A and KRAB-B. The A box plays an important role in repression by binding to corepressors, while the B box is thought to enhance this repression brought about by the A box. KRAB-containing proteins are thought to have critical functions in cell proliferation and differentiation, apoptosis and neoplastic transformation.¡€0€ª€0€ €CDD¡€ €±"¢€0€0€ €,pfam01353, GFP, Green fluorescent protein. ¡€0€ª€0€ €CDD¡€ €±#¢€0€0€ €7pfam01355, HIPIP, High potential iron-sulfur protein. ¡€0€ª€0€ €CDD¡€ €±$¢€0€0€ €6pfam01356, A_amylase_inhib, Alpha amylase inhibitor. ¡€0€ª€0€ €CDD¡€ €¯¢€0€0€ €vpfam01357, Pollen_allerg_1, Pollen allergen. This family contains allergens lol PI, PII and PIII from Lolium perenne.¡€0€ª€0€ €CDD¡€ €±%¢€0€0€ €Cpfam01358, PARP_regulatory, Poly A polymerase regulatory subunit. ¡€0€ª€0€ €CDD¡€ €Dx¢€0€0€ €jpfam01359, Transposase_1, Transposase (partial DDE domain). This family includes the mariner transposase.¡€0€ª€0€ €CDD¡€ €±&¢€0€0€ €¹pfam01361, Tautomerase, Tautomerase enzyme. This family includes the enzyme 4-oxalocrotonate tautomerase, which catalyzes the ketonisation of 2-hydroxymuconate to 2-oxo-3-hexenedioate.¡€0€ª€0€ €CDD¡€ €Dy¢€0€0€ €‚üpfam01363, FYVE, FYVE zinc finger. The FYVE zinc finger is named after four proteins that it has been found in: Fab1, YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two Zn++ ions. The FYVE finger has eight potential zinc coordinating cysteine positions. Many members of this family also include two histidines in a motif R+HHC+XCG, where + represents a charged residue and X any residue. We have included members which do not conserve these histidine residues but are clearly related.¡€0€ª€0€ €CDD¡€ €±'¢€0€0€ €1pfam01364, Peptidase_C25, Peptidase family C25. ¡€0€ª€0€ €CDD¡€ €±(¢€0€0€ €‚pfam01365, RYDR_ITPR, RIH domain. The RIH (RyR and IP3R Homology) domain is an extracellular domain from two types of calcium channels. This region is found in the ryanodine receptor and the inositol-1,4,5-trisphosphate receptor. This domain may form a binding site for IP3.¡€0€ª€0€ €CDD¡€ €±)¢€0€0€ €¡pfam01366, PRTP, Herpesvirus processing and transport protein. The members of this family are associate with capsid intermediates during packaging of the virus.¡€0€ª€0€ €CDD¡€ €D}¢€0€0€ €@pfam01367, 5_3_exonuc, 5'-3' exonuclease, C-terminal SAM fold. ¡€0€ª€0€ €CDD¡€ €±*¢€0€0€ €§pfam01368, DHH, DHH family. It is predicted that this family of proteins all perform a phosphoesterase function. It included the single stranded DNA exonuclease RecJ.¡€0€ª€0€ €CDD¡€ €±+¢€0€0€ €vpfam01369, Sec7, Sec7 domain. The Sec7 domain is a guanine-nucleotide-exchange-factor (GEF) for the pfam00025 family.¡€0€ª€0€ €CDD¡€ €±,¢€0€0€ €Öpfam01370, Epimerase, NAD dependent epimerase/dehydratase family. This family of proteins utilize NAD as a cofactor. The proteins in this family use nucleotide-sugar substrates for a variety of chemical reactions.¡€0€ª€0€ €CDD¡€ €±-¢€0€0€ €pfam01371, Trp_repressor, Trp repressor protein. This protein binds to tryptophan and represses transcription of the Trp operon.¡€0€ª€0€ €CDD¡€ €±.¢€0€0€ € pfam01372, Melittin, Melittin. ¡€0€ª€0€ €CDD¡€ €Dƒ¢€0€0€ €Xpfam01373, Glyco_hydro_14, Glycosyl hydrolase family 14. This family are beta amylases.¡€0€ª€0€ €CDD¡€ €±/¢€0€0€ €^pfam01374, Glyco_hydro_46, Glycosyl hydrolase family 46. This family are chitosanase enzymes.¡€0€ª€0€ €CDD¡€ €±0¢€0€0€ €@pfam01375, Enterotoxin_a, Heat-labile enterotoxin alpha chain. ¡€0€ª€0€ €CDD¡€ €±1¢€0€0€ €?pfam01376, Enterotoxin_b, Heat-labile enterotoxin beta chain. ¡€0€ª€0€ €CDD¡€ €D‡¢€0€0€ €—pfam01378, IgG_binding_B, B domain. This domain is found as a tandem repeat in Streptococcal cell surface proteins, such as the IgG binding protein G.¡€0€ª€0€ €CDD¡€ €5¹¢€0€0€ €^pfam01379, Porphobil_deam, Porphobilinogen deaminase, dipyromethane cofactor binding domain. ¡€0€ª€0€ €CDD¡€ €±2¢€0€0€ €‚?pfam01380, SIS, SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars. Presumably the SIS domains bind to the end-product of the pathway.¡€0€ª€0€ €CDD¡€ €D‰¢€0€0€ €‚‰pfam01381, HTH_3, Helix-turn-helix. This large family of DNA binding helix-turn helix proteins includes Cro and CI. Within Neisseria gonorrhoeae NGO_0477, the full protein fold incorporates a helix-turn-helix motif, but the function of this member is unlikely to be that of a DNA-binding regulator, the function of most other members, so is not necessarily characteristic of the whole family.¡€0€ª€0€ €CDD¡€ €±3¢€0€0€ €#pfam01382, Avidin, Avidin family. ¡€0€ª€0€ €CDD¡€ €±4¢€0€0€ €6pfam01383, CpcD, CpcD/allophycocyanin linker domain. ¡€0€ª€0€ €CDD¡€ €±5¢€0€0€ €Äpfam01384, PHO4, Phosphate transporter family. This family includes PHO-4 from Neurospora crassa which is a is a Na(+)-phosphate symporter. This family also contains the leukaemia virus receptor.¡€0€ª€0€ €CDD¡€ €±6¢€0€0€ €–pfam01385, OrfB_IS605, Probable transposase. This family includes IS891, IS1136 and IS1341. DUF1225, pfam06774, has now been merged into this family.¡€0€ª€0€ €CDD¡€ €±7¢€0€0€ €¾pfam01386, Ribosomal_L25p, Ribosomal L25p family. Ribosomal protein L25 is an RNA binding protein, that binds 5S rRNA. This family includes Ctc from B. subtilis, which is induced by stress.¡€0€ª€0€ €CDD¡€ €±8¢€0€0€ €‚5pfam01387, Synuclein, Synuclein. There are three types of synucleins in humans, these are called alpha, beta and gamma. Alpha synuclein has been found mutated in families with autosomal dominant Parkinson's disease. A peptide of alpha synuclein has also been found in amyloid plaques in Alzheimer's patients.¡€0€ª€0€ €CDD¡€ €±9¢€0€0€ €’pfam01388, ARID, ARID/BRIGHT DNA binding domain. This domain is know as ARID for AT-Rich Interaction Domain, and also known as the BRIGHT domain.¡€0€ª€0€ €CDD¡€ €±:¢€0€0€ €ùpfam01389, OmpA_membrane, OmpA-like transmembrane domain. The structure of OmpA transmembrane domain shows that it consists of an eight stranded beta barrel. This family includes some other distantly related outer membrane proteins with low scores.¡€0€ª€0€ €CDD¡€ €±;¢€0€0€ €çpfam01390, SEA, SEA domain. Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain.¡€0€ª€0€ €CDD¡€ €±<¢€0€0€ €‚2pfam01391, Collagen, Collagen triple helix repeat (20 copies). Members of this family belong to the collagen superfamily. Collagens are generally extracellular structural proteins involved in formation of connective tissue structure. The alignment contains 20 copies of the G-X-Y repeat that forms a triple helix. The first position of the repeat is glycine, the second and third positions can be any residue but are frequently proline and hydroxy-proline. Collagens are post translationally modified by proline hydroxylase to form the hydroxy-proline residues. Defective hydroxylation is the cause of scurvy. Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure. The family includes bacterial collagen-like triple-helix repeat proteins.¡€0€ª€0€ €CDD¡€ €æ¢€0€0€ €æpfam01392, Fz, Fz domain. Also known as the CRD (cysteine rich domain), the C6 box in MuSK receptor. This domain of unknown function has been independently identified by several groups. The domain contains 10 conserved cysteines.¡€0€ª€0€ €CDD¡€ €±=¢€0€0€ € pfam01393, Chromo_shadow, Chromo shadow domain. This domain is distantly related to pfam00385. This domain is always found in association with a chromo domain.¡€0€ª€0€ €CDD¡€ €±>¢€0€0€ €‚Åpfam01394, Clathrin_propel, Clathrin propeller repeat. Clathrin is the scaffold protein of the basket-like coat that surrounds coated vesicles. The soluble assembly unit, a triskelion, contains three heavy chains and three light chains in an extended three-legged structure. Each leg contains one heavy and one light chain. The N-terminus of the heavy chain is known as the globular domain, and is composed of seven repeats which form a beta propeller.¡€0€ª€0€ €CDD¡€ €D–¢€0€0€ €‚pfam01395, PBP_GOBP, PBP/GOBP family. The olfactory receptors of terrestrial animals exist in an aqueous environment, yet detect odorants that are primarily hydrophobic. The aqueous solubility of hydrophobic odorants is thought to be greatly enhanced via odorant binding proteins which exist in the extracellular fluid surrounding the odorant receptors. This family is composed of pheromone binding proteins (PBP), which are male-specific and associate with pheromone-sensitive neurons and general-odorant binding proteins (GOBP).¡€0€ª€0€ €CDD¡€ €±?¢€0€0€ €Fpfam01396, zf-C4_Topoisom, Topoisomerase DNA binding C4 zinc finger. ¡€0€ª€0€ €CDD¡€ €±@¢€0€0€ €‚!pfam01397, Terpene_synth, Terpene synthase, N-terminal domain. It has been suggested that this gene family be designated tps (for terpene synthase). It has been split into six subgroups on the basis of phylogeny, called tpsa-tpsf. tpsa includes vetispiridiene synthase, 5-epi- aristolochene synthase, and (+)-delta-cadinene synthase. tpsb includes (-)-limonene synthase. tpsc includes kaurene synthase A. tpsd includes taxadiene synthase, pinene synthase, and myrcene synthase. tpse includes kaurene synthase B. tpsf includes linalool synthase.¡€0€ª€0€ €CDD¡€ €±A¢€0€0€ €‚æpfam01398, JAB, JAB1/Mov34/MPN/PAD-1 ubiquitin protease. Members of this family are found in proteasome regulatory subunits, eukaryotic initiation factor 3 (eIF3) subunits and regulators of transcription factors. This family is also known as the MPN domain and PAD-1-like domain, JABP1 domain or JAMM domain. These are metalloenzymes that function as the ubiquitin isopeptidase/ deubiquitinase in the ubiquitin-based signalling and protein turnover pathways in eukaryotes. Versions of the domain in prokaryotic cognates of the ubiquitin-modification pathway are shown to have a similar role, and the archael protein from Haloferax volcanii is found to cleave ubiquitin-like small archaeal modifier proteins (SAMP1/2) from protein conjugates.¡€0€ª€0€ €CDD¡€ €Dš¢€0€0€ €tpfam01399, PCI, PCI domain. This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15).¡€0€ª€0€ €CDD¡€ €±B¢€0€0€ €‚¡pfam01400, Astacin, Astacin (Peptidase family M12A). The members of this family are enzymes that cleave peptides. These proteases require zinc for catalysis. Members of this family contain two conserved disulphide bridges, these are joined 1-4 and 2-3. Members of this family have an amino terminal propeptide which is cleaved to give the active protease domain. All other linked domains are found to the carboxyl terminus of this domain. This family includes: Astacin, a digestive enzyme from Crayfish. Meprin, a multiple domain membrane component that is constructed from a homologous alpha and beta chain. Proteins involved in morphogenesis and Tolloid from drosophila.¡€0€ª€0€ €CDD¡€ €Dœ¢€0€0€ €‚ˆpfam01401, Peptidase_M2, Angiotensin-converting enzyme. Members of this family are dipeptidyl carboxydipeptidases (cleave carboxyl dipeptides) and most notably convert angiotensin I to angiotensin II. Many members of this family contain a tandem duplication of the 600 amino acid peptidase domain, both of these are catalytically active. Most members are secreted membrane bound ectoenzymes.¡€0€ª€0€ €CDD¡€ €±C¢€0€0€ €‚Upfam01402, RHH_1, Ribbon-helix-helix protein, copG family. The structure of this protein repressor, which is the shortest reported to date and the first isolated from a plasmid, has a homodimeric ribbon-helix-helix arrangement. The helix-turn-helix-like structure is involved in dimerisation and not DNA binding as might have been expected.¡€0€ª€0€ €CDD¡€ €Dž¢€0€0€ €‚pfam01403, Sema, Sema domain. The Sema domain occurs in semaphorins, which are a large family of secreted and transmembrane proteins, some of which function as repellent signals during axon guidance. Sema domains also occur in the hepatocyte growth factor receptor and plexin-A3.¡€0€ª€0€ €CDD¡€ €±D¢€0€0€ €òpfam01404, Ephrin_lbd, Ephrin receptor ligand binding domain. The Eph receptors, which bind to ephrins pfam00812 are a large family of receptor tyrosine kinases. This family represents the amino terminal domain which binds the ephrin ligand.¡€0€ª€0€ €CDD¡€ €±E¢€0€0€ €‚ipfam01405, PsbT, Photosystem II reaction centre T protein. The exact function of this protein is unknown. It probably consists of a single transmembrane spanning helix. The Chlamydomonas reinhardtii psbT protein, appears to be (i) a novel photosystem II subunit and (ii) required for maintaining optimal photosystem II activity under adverse growth conditions.¡€0€ª€0€ €CDD¡€ €D¡¢€0€0€ €~pfam01406, tRNA-synt_1e, tRNA synthetases class I (C) catalytic domain. This family includes only cysteinyl tRNA synthetases.¡€0€ª€0€ €CDD¡€ €D¢¢€0€0€ €‚Ypfam01407, Gemini_AL3, Geminivirus AL3 protein. Geminiviruses are small, ssDNA-containing plant viruses. Geminiviruses contain three ORFs (designated AL1, AL2, and AL3) that overlap and are specified by multiple polycistronic mRNAs. The AL3 protein comprises approximately 0.05% of the cellular proteins and is present in the soluble and organelle fractions. AL3 may form oligomers. Immunoprecipitation of AL3 in a baculovirus expression system extracts expressing both AL1 pfam00799 and AL3 showed that the two proteins also complex with each other. The AL3 protein is involved in viral replication.¡€0€ª€0€ €CDD¡€ €D£¢€0€0€ €´pfam01408, GFO_IDH_MocA, Oxidoreductase family, NAD-binding Rossmann fold. This family of enzymes utilize NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot.¡€0€ª€0€ €CDD¡€ €D¤¢€0€0€ €æpfam01409, tRNA-synt_2d, tRNA synthetases class II core domain (F). Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only phenylalanyl-tRNA synthetases. This is the core catalytic domain.¡€0€ª€0€ €CDD¡€ €±F¢€0€0€ €Ëpfam01410, COLFI, Fibrillar collagen C-terminal domain. Found at C-termini of fibrillar collagens: Ephydatia muelleri procollagen EMF1 alpha, vertebrate collagens alpha(1)III, alpha(1)II, alpha(2)V etc.¡€0€ª€0€ €CDD¡€ €±G¢€0€0€ €±pfam01411, tRNA-synt_2c, tRNA synthetases class II (A). Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only alanyl-tRNA synthetases.¡€0€ª€0€ €CDD¡€ €D§¢€0€0€ €ßpfam01412, ArfGap, Putative GTPase activating protein for Arf. Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs.¡€0€ª€0€ €CDD¡€ €±H¢€0€0€ €Ìpfam01413, C4, C-terminal tandem repeated domain in type 4 procollagen. Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome.¡€0€ª€0€ €CDD¡€ €±I¢€0€0€ €'pfam01414, DSL, Delta serrate ligand. ¡€0€ª€0€ €CDD¡€ €±J¢€0€0€ €‚%pfam01415, IL7, Interleukin 7/9 family. IL-7 is a cytokine that acts as a growth factor for early lymphoid cells of both B- and T-cell lineages. IL-9 is a multi-functional cytokine that, although originally described as a T-cell growth factor, its function in T-cell response remains unclear.¡€0€ª€0€ €CDD¡€ €±K¢€0€0€ €‚¥pfam01416, PseudoU_synth_1, tRNA pseudouridine synthase. Involved in the formation of pseudouridine at the anticodon stem and loop of transfer-RNAs Pseudouridine is an isomer of uridine (5-(beta-D-ribofuranosyl) uracil, and id the most abundant modified nucleoside found in all cellular RNAs. The TruA-like proteins also exhibit a conserved sequence with a strictly conserved aspartic acid, likely involved in catalysis.¡€0€ª€0€ €CDD¡€ €±L¢€0€0€ €¿pfam01417, ENTH, ENTH domain. The ENTH (Epsin N-terminal homology) domain is found in proteins involved in endocytosis and cytoskeletal machinery. The function of the ENTH domain is unknown.¡€0€ª€0€ €CDD¡€ €±M¢€0€0€ €Épfam01418, HTH_6, Helix-turn-helix domain, rpiR family. This domain contains a helix-turn-helix motif. The best characterized member of this family is RpiR, a regulator of the expression of rpiB gene.¡€0€ª€0€ €CDD¡€ €D®¢€0€0€ €Øpfam01419, Jacalin, Jacalin-like lectin domain. Proteins containing this domain are lectins. It is found in 1 to 6 copies in these proteins. The domain is also found in the animal prostatic spermine-binding protein.¡€0€ª€0€ €CDD¡€ €±N¢€0€0€ €‚Špfam01420, Methylase_S, Type I restriction modification DNA specificity domain. This domain is also known as the target recognition domain (TRD). Restriction-modification (R-M) systems protect a bacterial cell against invasion of foreign DNA by endonucleolytic cleavage of DNA that lacks a site specific modification. The host genome is protected from cleavage by methylation of specific nucleotides in the target sites. In type I systems, both restriction and modification activities are present in one heteromeric enzyme complex composed of one DNA specificity subunit (this family), two modification (M) subunits and two restriction (R) subunits.¡€0€ª€0€ €CDD¡€ €D°¢€0€0€ €‚Ìpfam01421, Reprolysin, Reprolysin (M12B) family zinc metalloprotease. The members of this family are enzymes that cleave peptides. These proteases require zinc for catalysis. Members of this family are also known as adamalysins. Most members of this family are snake venom endopeptidases, but there are also some mammalian proteins and fertilin. Fertilin and closely related proteins appear to not have some active site residues and may not be active enzymes.¡€0€ª€0€ €CDD¡€ €±O¢€0€0€ €‚hpfam01422, zf-NF-X1, NF-X1 type zinc finger. This domain is presumed to be a zinc binding domain. The following pattern describes the zinc finger. C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C Where X can be any amino acid, and numbers in brackets indicate the number of residues. Two position can be either his or cys. The zinc fingers in NFX1 bind to DNA.¡€0€ª€0€ €CDD¡€ €±P¢€0€0€ €‚1pfam01423, LSM, LSM domain. The LSM domain contains Sm proteins as well as other related LSM (Like Sm) proteins. The U1, U2, U4/U6, and U5 small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing contain seven Sm proteins (B/B', D1, D2, D3, E, F and G) in common, which assemble around the Sm site present in four of the major spliceosomal small nuclear RNAs. The U6 snRNP binds to the LSM (Like Sm) proteins. Sm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Sm proteins. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. This family also includes the bacterial Hfq (host factor Q) proteins. Hfq are also RNA-binding proteins, that form hexameric rings.¡€0€ª€0€ €CDD¡€ €±Q¢€0€0€ €Ñpfam01424, R3H, R3H domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. The function of the domain is predicted to be binding ssDNA.¡€0€ª€0€ €CDD¡€ €±R¢€0€0€ €pfam01425, Amidase, Amidase. ¡€0€ª€0€ €CDD¡€ €Dµ¢€0€0€ €ýpfam01426, BAH, BAH domain. This domain has been called BAH (Bromo adjacent homology) domain and has also been called ELM1 and BAM (Bromo adjacent motif) domain. The function of this domain is unknown but may be involved in protein-protein interaction.¡€0€ª€0€ €CDD¡€ €±S¢€0€0€ €4pfam01427, Peptidase_M15, D-ala-D-ala dipeptidase. ¡€0€ª€0€ €CDD¡€ €D·¢€0€0€ €‚1pfam01428, zf-AN1, AN1-like Zinc finger. Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis. The following pattern describes the zinc finger. C-X2-C-X(9-12)-C-X(1-2)-C-X4-C-X2-H-X5-H-X-C Where X can be any amino acid, and numbers in brackets indicate the number of residues.¡€0€ª€0€ €CDD¡€ €±T¢€0€0€ €‚ýpfam01429, MBD, Methyl-CpG binding domain. The Methyl-CpG binding domain (MBD) binds to DNA that contains one or more symmetrically methylated CpGs. DNA methylation in animals is associated with alterations in chromatin structure and silencing of gene expression. MBD has negligible non-specific affinity for DNA. In vitro foot-printing with MeCP2 showed the MBD can protect a 12 nucleotide region surrounding a methyl CpG pair. MBDs are found in several Methyl-CpG binding proteins and also DNA demethylase.¡€0€ª€0€ €CDD¡€ €±U¢€0€0€ €‚Åpfam01430, HSP33, Hsp33 protein. Hsp33 is a molecular chaperone, distinguished from all other known chaperones by its mode of functional regulation. Its activity is redox regulated. Hsp33 is a cytoplasmically localized protein with highly reactive cysteines that respond quickly to changes in the redox environment. Oxidising conditions like H2O2 cause disulfide bonds to form in Hsp33, a process that leads to the activation of its chaperone function.¡€0€ª€0€ €CDD¡€ €±V¢€0€0€ €‚>pfam01431, Peptidase_M13, Peptidase family M13. Mammalian enzymes are typically type-II membrane anchored enzymes which are known, or believed to activate or inactivate oligopeptide (pro)-hormones such as opioid peptides. The family also contains a bacterial member believed to be involved with milk protein cleavage.¡€0€ª€0€ €CDD¡€ €D»¢€0€0€ €‚Upfam01432, Peptidase_M3, Peptidase family M3. This is the Thimet oligopeptidase family, large family of mammalian and bacterial oligopeptidases that cleave medium sized peptides. The group also contains mitochondrial intermediate peptidase which is encoded by nuclear DNA but functions within the mitochondria to remove the leader sequence.¡€0€ª€0€ €CDD¡€ €±W¢€0€0€ €‚pfam01433, Peptidase_M1, Peptidase family M1. Members of this family are aminopeptidases. The members differ widely in specificity, hydrolysing acidic, basic or neutral N-terminal residues. This family includes leukotriene-A4 hydrolase, this enzyme also has an aminopeptidase activity.¡€0€ª€0€ €CDD¡€ €D½¢€0€0€ €1pfam01434, Peptidase_M41, Peptidase family M41. ¡€0€ª€0€ €CDD¡€ €±X¢€0€0€ €‚§pfam01435, Peptidase_M48, Peptidase family M48. Peptidase_M48 is the largely extracellular catalytic region of CAAX prenyl protease homologs such as Human FACE-1 protease. These are metallopeptidases, with the characteristic HExxH motif giving the two histidine-zinc-ligands and an adjacent glutamate on the next helix being the third. The whole molecule folds to form a deep groove/cleft into which the substrate can fit.¡€0€ª€0€ €CDD¡€ €±Y¢€0€0€ €‚êpfam01436, NHL, NHL repeat. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in Bos taurus PAM, proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. The E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is me diated by the NHL repeats.¡€0€ª€0€ €CDD¡€ €DÀ¢€0€0€ €‚˜pfam01437, PSI, Plexin repeat. A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman).¡€0€ª€0€ €CDD¡€ €±Z¢€0€0€ €÷pfam01439, Metallothio_2, Metallothionein. Members of this family are metallothioneins. These proteins are cysteine rich proteins that bind to heavy metals. Members of this family appear to be closest to Class II metallothioneins, seed pfam00131.¡€0€ª€0€ €CDD¡€ €±[¢€0€0€ €‚Mpfam01440, Gemini_AL2, Geminivirus AL2 protein. Geminiviruses are small, ssDNA-containing plant viruses. Geminiviruses contain three ORFs (designated AL1, AL2, and AL3) that overlap and are specified by multiple polycistronic mRNAs. The AL2 gene product transactivates expression of TGMV coat protein gene, and BR1 movement protein.¡€0€ª€0€ €CDD¡€ €Dâ€0€0€ €pfam01441, Lipoprotein_6, Lipoprotein. Members of this family are lipoproteins that are probably involved in evasion of the host immune system by pathogens.¡€0€ª€0€ €CDD¡€ €±\¢€0€0€ €ápfam01442, Apolipoprotein, Apolipoprotein A1/A4/E domain. These proteins contain several 22 residue repeats which form a pair of alpha helices. This family includes: Apolipoprotein A-I. Apolipoprotein A-IV. Apolipoprotein E.¡€0€ª€0€ €CDD¡€ €±]¢€0€0€ €‚pfam01443, Viral_helicase1, Viral (Superfamily 1) RNA helicase. Helicase activity for this family has been demonstrated and NTPase activity. This helicase has multiple roles at different stages of viral RNA replication, as dissected by mutational analysis.¡€0€ª€0€ €CDD¡€ €±^¢€0€0€ €€pfam01445, SH, Viral small hydrophobic protein. The SH (small hydrophobic) protein is a membrane protein of uncertain function.¡€0€ª€0€ €CDD¡€ €DÇ¢€0€0€ €‚ˆpfam01446, Rep_1, Replication protein. Replication proteins (rep) are involved in plasmid replication. The Rep protein binds to the plasmid DNA and nicks it at the double strand origin (dso) of replication. The 3'-hydroxyl end created is extended by the host DNA replicase, and the 5' end is displaced during synthesis. At the end of one replication round, Rep introduces a second single stranded break at the dso and ligates the ssDNA extremities generating one double-stranded plasmid and one circular ssDNA form. Complementary strand synthesis of the circular ssDNA is usually initiated at the single-stranded origin by the host RNA polymerase.¡€0€ª€0€ €CDD¡€ €±_¢€0€0€ €Jpfam01447, Peptidase_M4, Thermolysin metallopeptidase, catalytic domain. ¡€0€ª€0€ €CDD¡€ €±`¢€0€0€ €‚Ñpfam01448, ELM2, ELM2 domain. The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein that is part of the NuRD complex. The domain is usually found to the N terminus of a myb-like DNA binding domain pfam00249. ELM2 is also found associated with an ARID DNA binding domain pfam01388 in ARID1. This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain.¡€0€ª€0€ €CDD¡€ €±a¢€0€0€ €‚0pfam01450, IlvC, Acetohydroxy acid isomeroreductase, catalytic domain. Acetohydroxy acid isomeroreductase catalyzes the conversion of acetohydroxy acids into dihydroxy valerates. This reaction is the second in the synthetic pathway of the essential branched side chain amino acids valine and isoleucine.¡€0€ª€0€ €CDD¡€ €±b¢€0€0€ €Mpfam01451, LMWPc, Low molecular weight phosphotyrosine protein phosphatase. ¡€0€ª€0€ €CDD¡€ €±c¢€0€0€ €‚hpfam01452, Rota_NSP4, Rotavirus non structural protein. This protein has been called NSP4, NSP5, NS28, and NCVP5. The final steps in the assembly of rotavirus occur in the lumen of the endoplasmic reticulum (ER). Targeting of the immature inner capsid particle (ICP) to this compartment is mediated by the cytoplasmic tail of NSP4, located in the ER membrane.¡€0€ª€0€ €CDD¡€ €DÍ¢€0€0€ €’pfam01453, B_lectin, D-mannose binding lectin. These proteins include mannose-specific lectins from plants as well as bacteriocins from bacteria.¡€0€ª€0€ €CDD¡€ €±d¢€0€0€ €‚Âpfam01454, MAGE, MAGE family. The MAGE (melanoma antigen-encoding gene) family are expressed in a wide variety of tumors but not in normal cells, with the exception of the male germ cells, placenta, and, possibly, cells of the developing embryo. The cellular function of this family is unknown. This family also contains the yeast protein, Nse3. The Nse3 protein is part of the Smc5-6 complex. Nse3 has been demonstrated to be important for meiosis.¡€0€ª€0€ €CDD¡€ €±e¢€0€0€ €)pfam01455, HupF_HypC, HupF/HypC family. ¡€0€ª€0€ €CDD¡€ €±f¢€0€0€ €‚Ypfam01456, Mucin, Mucin-like glycoprotein. This family of trypanosomal proteins resemble vertebrate mucins. The protein consists of three regions. The N and C terminii are conserved between all members of the family, whereas the central region is not well conserved and contains a large number of threonine residues which can be glycosylated. Indirect evidence suggested that these genes might encode the core protein of parasite mucins, glycoproteins that were proposed to be involved in the interaction with, and invasion of, mammalian host cells. This family contains an N-terminal signal peptide.¡€0€ª€0€ €CDD¡€ €Ó ¢€0€0€ €*pfam01457, Peptidase_M8, Leishmanolysin. ¡€0€ª€0€ €CDD¡€ €DÑ¢€0€0€ €?pfam01458, UPF0051, Uncharacterized protein family (UPF0051). ¡€0€ª€0€ €CDD¡€ €±g¢€0€0€ €'pfam01459, Porin_3, Eukaryotic porin. ¡€0€ª€0€ €CDD¡€ €±h¢€0€0€ €‚Opfam01462, LRRNT, Leucine rich repeat N-terminal domain. Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats.¡€0€ª€0€ €CDD¡€ €DÔ¢€0€0€ €‚Opfam01463, LRRCT, Leucine rich repeat C-terminal domain. Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the C-terminus of tandem leucine rich repeats.¡€0€ª€0€ €CDD¡€ €DÕ¢€0€0€ €ªpfam01464, SLT, Transglycosylase SLT domain. This family is distantly related to pfam00062. Members are found in phages, type II, type III and type IV secretion systems.¡€0€ª€0€ €CDD¡€ €DÖ¢€0€0€ €‚ipfam01465, GRIP, GRIP domain. The GRIP (golgin-97, RanBP2alpha,Imh1p and p230/golgin-245) domain is found in many large coiled-coil proteins. It has been shown to be sufficient for targeting to the Golgi. The GRIP domain contains a completely conserved tyrosine residue. At least some of these domains have been shown to bind to GTPase Arl1, see structures in.¡€0€ª€0€ €CDD¡€ €±i¢€0€0€ €4pfam01466, Skp1, Skp1 family, dimerisation domain. ¡€0€ª€0€ €CDD¡€ €±j¢€0€0€ €âpfam01467, CTP_transf_like, Cytidylyltransferase-like. This family includes: Cholinephosphate cytidylyltransferase; glycerol-3-phosphate cytidylyltransferase. It also includes putative adenylyltransferases, and FAD synthases.¡€0€ª€0€ €CDD¡€ €±k¢€0€0€ €‚pfam01468, GA, GA module. The GA (protein G-related Albumin-binding) module is composed of three alpha helices. This module is found in a range of bacterial cell surface proteins. The GA module from peptostreptococcal albumin-binding protein shows a strong affinity for albumin.¡€0€ª€0€ €CDD¡€ €DÚ¢€0€0€ €‚àpfam01469, Pentapeptide_2, Pentapeptide repeats (8 copies). These repeats are found in many mycobacterial proteins. These repeats are most common in the pfam00823 family of proteins, where they are found in the MPTR subfamily of PPE proteins. The function of these repeats is unknown. The repeat can be approximately described as XNXGX, where X can be any amino acid. These repeats are similar to pfam00805, however it is not clear if these two families are structurally related.¡€0€ª€0€ €CDD¡€ €DÛ¢€0€0€ €3pfam01470, Peptidase_C15, Pyroglutamyl peptidase. ¡€0€ª€0€ €CDD¡€ €DÜ¢€0€0€ €‚ pfam01471, PG_binding_1, Putative peptidoglycan binding domain. This domain is composed of three alpha helices. This domain is found at the N or C terminus of a variety of enzymes involved in bacterial cell wall degradation. This domain may have a general peptidoglycan binding function. This family is found N-terminal to the catalytic domain of matrixins. The domain is found to bind peptidoglycan experimentally.¡€0€ª€0€ €CDD¡€ €±l¢€0€0€ €‚Epfam01472, PUA, PUA domain. The PUA domain named after Pseudouridine synthase and Archaeosine transglycosylase, was detected in archaeal and eukaryotic pseudouridine synthases, archaeal archaeosine synthases, a family of predicted ATPases that may be involved in RNA modification, a family of predicted archaeal and bacterial rRNA methylases. Additionally, the PUA domain was detected in a family of eukaryotic proteins that also contain a domain homologous to the translation initiation factor eIF1/SUI1; these proteins may comprise a novel type of translation factors. Unexpectedly, the PUA domain was detected also in bacterial and yeast glutamate kinases; this is compatible with the demonstrated role of these enzymes in the regulation of the expression of other genes. It is predicted that the PUA domain is an RNA binding domain.¡€0€ª€0€ €CDD¡€ €DÞ¢€0€0€ €‚pfam01473, CW_binding_1, Putative cell wall binding repeat. These repeats are characterized by conserved aromatic residues and glycines are found in multiple tandem copies in a number of proteins. The CW repeat is 20 amino acid residues long. The exact domain boundaries may not be correct. It has been suggested that these repeats in Streptococcus phage Cp-1 lysozyme might be responsible for the specific recognition of choline-containing cell walls. Similar but longer repeats are found in the glucosyltransferases and glucan-binding proteins of oral streptococci and shown to be involved in glucan binding as well as in the related dextransucrases of Leuconostoc mesenteroides. Repeats also occur in toxins of Clostridium difficile and other clostridia, though the ligands are not always known.¡€0€ª€0€ €CDD¡€ €±m¢€0€0€ €pfam01474, DAHP_synth_2, Class-II DAHP synthetase family. Members of this family are aldolase enzymes that catalyze the first step of the shikimate pathway.¡€0€ª€0€ €CDD¡€ €±n¢€0€0€ €ïpfam01475, FUR, Ferric uptake regulator family. This family includes metal ion uptake regulator proteins, that bind to the operator DNA and controls transcription of metal ion-responsive genes. This family is also known as the FUR family.¡€0€ª€0€ €CDD¡€ €Dࢀ0€0€ €‚pfam01476, LysM, LysM domain. The LysM (lysin motif) domain is about 40 residues long. It is found in a variety of enzymes involved in bacterial cell wall degradation. This domain may have a general peptidoglycan binding function. The structure of this domain is known.¡€0€ª€0€ €CDD¡€ €±o¢€0€0€ €‚.pfam01477, PLAT, PLAT/LH2 domain. This domain is found in a variety of membrane or lipid associated proteins. It is called the PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology) domain. The known structure of pancreatic lipase shows this domain binds to procolipase pfam01114, which mediates membrane association. So it appears possible that this domain mediates membrane attachment via other protein binding partners. The structure of this domain is known for many members of the family and is composed of a beta sandwich.¡€0€ª€0€ €CDD¡€ €±p¢€0€0€ €‚Gpfam01478, Peptidase_A24, Type IV leader peptidase family. Peptidase A24, or the prepilin peptidase as it is also known, processes the N-terminus of the prepilins. The processing is essential for the correct formation of the pseudopili of type IV bacterial protein secretion. The enzyme is found across eubacteria and archaea.¡€0€ª€0€ €CDD¡€ €±q¢€0€0€ €‚ pfam01479, S4, S4 domain. The S4 domain is a small domain consisting of 60-65 amino acid residues that was detected in the bacterial ribosomal protein S4, eukaryotic ribosomal S9, two families of pseudouridine synthases, a novel family of predicted RNA methylases, a yeast protein containing a pseudouridine synthetase and a deaminase domain, bacterial tyrosyl-tRNA synthetases, and a number of uncharacterized, small proteins that may be involved in translation regulation. The S4 domain probably mediates binding to RNA.¡€0€ª€0€ €CDD¡€ €D䢀0€0€ €pfam01480, PWI, PWI domain. ¡€0€ª€0€ €CDD¡€ €±r¢€0€0€ €=pfam01481, Arteri_nucleo, Arterivirus nucleocapsid protein. ¡€0€ª€0€ €CDD¡€ €±s¢€0€0€ €‚pfam01483, P_proprotein, Proprotein convertase P-domain. A unique feature of the eukaryotic subtilisin-like proprotein convertases is the presence of an additional highly conserved sequence of approximately 150 residues (P domain) located immediately downstream of the catalytic domain.¡€0€ª€0€ €CDD¡€ €±t¢€0€0€ €‚4pfam01484, Col_cuticle_N, Nematode cuticle collagen N-terminal domain. The function of this domain is unknown. It is found in the N-terminal region of nematode cuticle collagens, see pfam01391. Cuticle is a tough elastic structure secreted by hypodermal cells and is primarily composed of collagen proteins.¡€0€ª€0€ €CDD¡€ €±u¢€0€0€ €‚êpfam01485, IBR, IBR domain, a half RING-finger domain. The IBR (In Between Ring fingers) domain is often found to occur between pairs of ring fingers (pfam00097). This domain has also been called the C6HC domain and DRIL (for double RING finger linked) domain. Proteins that contain two Ring fingers and an IBR domain (these proteins are also termed RBR family proteins) are thought to exist in all eukaryotic organisms. RBR family members play roles in protein quality control and can indirectly regulate transcription. Evidence suggests that RBR proteins are often parts of cullin-containing ubiquitin ligase complexes. The ubiquitin ligase Parkin is an RBR family protein whose mutations are involved in forms of familial Parkinson's disease.¡€0€ª€0€ €CDD¡€ €±v¢€0€0€ €Õpfam01486, K-box, K-box region. The K-box region is commonly found associated with SRF-type transcription factors see pfam00319. The K-box is a possible coiled-coil structure. Possible role in multimer formation.¡€0€ª€0€ €CDD¡€ €±w¢€0€0€ €‚°pfam01487, DHquinase_I, Type I 3-dehydroquinase. Type I 3-dehydroquinase, (3-dehydroquinate dehydratase or DHQase.) catalyzes the cis-dehydration of 3-dehydroquinate via a covalent imine intermediate giving dehydroshikimate. Dehydroquinase functions in the shikimate pathway which is involved in the biosynthesis of aromatic amino acids. Type II 3-dehydroquinase catalyzes the trans-dehydration of 3-dehydroshikimate see pfam01220.¡€0€ª€0€ €CDD¡€ €±x¢€0€0€ €‚“pfam01488, Shikimate_DH, Shikimate / quinate 5-dehydrogenase. This family contains both shikimate and quinate dehydrogenases. Shikimate 5-dehydrogenase catalyzes the conversion of shikimate to 5-dehydroshikimate. This reaction is part of the shikimate pathway which is involved in the biosynthesis of aromatic amino acids. Quinate 5-dehydrogenase catalyzes the conversion of quinate to 5-dehydroquinate. This reaction is part of the quinate pathway where quinic acid is exploited as a source of carbon in prokaryotes and microbial eukaryotes. Both the shikimate and quinate pathways share two common pathway metabolites 3-dehydroquinate and dehydroshikimate.¡€0€ª€0€ €CDD¡€ €±y¢€0€0€ €‚Õpfam01490, Aa_trans, Transmembrane amino acid transporter protein. This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases.¡€0€ª€0€ €CDD¡€ €D좀0€0€ €‚tpfam01491, Frataxin_Cyay, Frataxin-like domain. This family contains proteins that have a domain related to the globular C-terminus of Frataxin the protein that is mutated in Friedreich's ataxia. This domain is found in a family of bacterial proteins. The function of this domain is currently unknown. It has been suggested that this family is involved in iron transport.¡€0€ª€0€ €CDD¡€ €±z¢€0€0€ €‚pfam01492, Gemini_C4, Geminivirus C4 protein. This family consists of the N terminal region of geminivirus C4 or AC4 proteins. In Tomato yellow leaf curl geminivirus (TYLCV) the C4 protein is necessary for efficient spreading of the virus in tomato plants.¡€0€ª€0€ €CDD¡€ €D0€0€ €‚pfam01493, GXGXG, GXGXG motif. This domain is found in glutamate synthase, tungsten formylmethanofuran dehydrogenase subunit c (FwdC) and molybdenum formylmethanofuran dehydrogenase subunit c (FmdC). A repeated G-XX-G-XXX-G motif is seen in the alignment.¡€0€ª€0€ €CDD¡€ €±{¢€0€0€ €mpfam01494, FAD_binding_3, FAD binding domain. This domain is involved in FAD binding in a number of enzymes.¡€0€ª€0€ €CDD¡€ €±|¢€0€0€ €‚Xpfam01496, V_ATPase_I, V-type ATPase 116kDa subunit family. This family consists of the 116kDa V-type ATPase (vacuolar (H+)-ATPases) subunits, as well as V-type ATP synthase subunit i. The V-type ATPases family are proton pumps that acidify intracellular compartments in eukaryotic cells for example yeast central vacuoles, clathrin-coated and synaptic vesicles. They have important roles in membrane trafficking processes. The 116kDa subunit (subunit a) in the V-type ATPase is part of the V0 functional domain responsible for proton transport. The a subunit is a transmembrane glycoprotein with multiple putative transmembrane helices it has a hydrophilic amino terminal and a hydrophobic carboxy terminal. It has roles in proton transport and assembly of the V-type ATPase complex. This subunit is encoded by two homologous gene in yeast VPH1 and STV1.¡€0€ª€0€ €CDD¡€ €±}¢€0€0€ €¤pfam01497, Peripla_BP_2, Periplasmic binding protein. This family includes bacterial periplasmic binding proteins. Several of which are involved in iron transport.¡€0€ª€0€ €CDD¡€ €±~¢€0€0€ €‚¼pfam01498, HTH_Tnp_Tc3_2, Transposase. Transposase proteins are necessary for efficient DNA transposition. This family includes the amino-terminal region of Tc1, Tc1A, Tc1B and Tc2B transposases of C.elegans. The region encompasses the specific DNA binding and second DNA recognition domains as well as an amino-terminal region of the catalytic domain of Tc3 as described in. Tc3 is a member of the Tc1/mariner family of transposable elements.¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €‚ppfam01499, Herpes_UL25, Herpesvirus UL25 family. The herpesvirus UL25 gene product is a virion component involved in virus penetration and capsid assembly. The product of the UL25 gene is required for packaging but not cleavage of replicated viral DNA. This family includes a number of herpesvirus proteins: EHV-1 36, EBV BVRF1, HCMV UL77, ILTV ORF2, and VZV gene 34.¡€0€ª€0€ €CDD¡€ €Dô¢€0€0€ €‚pfam01500, Keratin_B2, Keratin, high sulfur B2 protein. High sulfur proteins are cysteine-rich proteins synthesized during the differentiation of hair matrix cells, and form hair fibres in association with hair keratin intermediate filaments. This family has been divided up into four regions, with the second region containing 8 copies of a short repeat. This family is also known as B2 or KAP1.¡€0€ª€0€ €CDD¡€ €Dõ¢€0€0€ €‚tpfam01501, Glyco_transf_8, Glycosyl transferase family 8. This family includes enzymes that transfer sugar residues to donor molecules. Members of this family are involved in lipopolysaccharide biosynthesis and glycogen synthesis. This family includes Lipopolysaccharide galactosyltransferase, lipopolysaccharide glucosyltransferase 1, and glycogenin glucosyltransferase.¡€0€ª€0€ €CDD¡€ €Dö¢€0€0€ €¤pfam01502, PRA-CH, Phosphoribosyl-AMP cyclohydrolase. This enzyme catalyzes the third step in the histidine biosynthetic pathway. It requires Zn ions for activity.¡€0€ª€0€ €CDD¡€ €±€¢€0€0€ €‰pfam01503, PRA-PH, Phosphoribosyl-ATP pyrophosphohydrolase. This enzyme catalyzes the second step in the histidine biosynthetic pathway.¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €‚Êpfam01504, PIP5K, Phosphatidylinositol-4-phosphate 5-Kinase. This family contains a region from the common kinase core found in the type I phosphatidylinositol-4-phosphate 5-kinase (PIP5K) family as described in. The family consists of various type I, II and III PIP5K enzymes. PIP5K catalyzes the formation of phosphoinositol-4,5-bisphosphate via the phosphorylation of phosphatidylinositol-4-phosphate a precursor in the phosphinositide signaling pathway.¡€0€ª€0€ €CDD¡€ €±‚¢€0€0€ €‚pfam01505, Vault, Major Vault Protein repeat. The vault is a ubiquitous and highly conserved ribonucleoprotein particle of approximately 13 mDa of unknown function. This family corresponds to a repeat found in the amino terminal half of the major vault protein.¡€0€ª€0€ €CDD¡€ €±ƒ¢€0€0€ €‚ðpfam01506, HCV_NS5a, Hepatitis C virus non-structural 5a protein membrane anchor. The molecular function of the non-structural 5a protein is uncertain. The NS5a protein is phosphorylated when expressed in mammalian cells. It is thought to interact with the ds RNA dependent (interferon inducible) kinase PKR. The N-terminal region of the NS5a protein has been used in the construction of the alignment for this family. The C-terminal region has not been included because it is too heterogeneous.¡€0€ª€0€ €CDD¡€ €±„¢€0€0€ €‚-pfam01507, PAPS_reduct, Phosphoadenosine phosphosulfate reductase family. This domain is found in phosphoadenosine phosphosulfate (PAPS) reductase enzymes or PAPS sulfotransferase. PAPS reductase is part of the adenine nucleotide alpha hydrolases superfamily also including N type ATP PPases and ATP sulphurylases. The enzyme uses thioredoxin as an electron donor for the reduction of PAPS to phospho-adenosine-phosphate (PAP). It is also found in NodP nodulation protein P from Rhizobium which has ATP sulfurylase activity (sulfate adenylate transferase).¡€0€ª€0€ €CDD¡€ €±…¢€0€0€ €Ãpfam01508, Paramecium_SA, Paramecium surface antigen domain. This domain is a cysteine rich extracellular repeat found in surface antigens of Paramecium. The domain contains 8 cysteine residues.¡€0€ª€0€ €CDD¡€ €Dý¢€0€0€ €‚…pfam01509, TruB_N, TruB family pseudouridylate synthase (N terminal domain). Members of this family are involved in modifying bases in RNA molecules. They carry out the conversion of uracil bases to pseudouridine. This family includes TruB, a pseudouridylate synthase that specifically converts uracil 55 to pseudouridine in most tRNAs. This family also includes Cbf5p that modifies rRNA.¡€0€ª€0€ €CDD¡€ €±†¢€0€0€ €‚ªpfam01510, Amidase_2, N-acetylmuramoyl-L-alanine amidase. This family includes zinc amidases that have N-acetylmuramoyl-L-alanine amidase activity EC:3.5.1.28. This enzyme domain cleaves the amide bond between N-acetylmuramoyl and L-amino acids in bacterial cell walls (preferentially: D-lactyl-L-Ala). The structure is known for the bacteriophage T7 structure and shows that two of the conserved histidines are zinc binding.¡€0€ª€0€ €CDD¡€ €±‡¢€0€0€ €Npfam01512, Complex1_51K, Respiratory-chain NADH dehydrogenase 51 Kd subunit. ¡€0€ª€0€ €CDD¡€ €±ˆ¢€0€0€ €‚-pfam01513, NAD_kinase, ATP-NAD kinase. Members of this family include ATP-NAD kinases EC:2.7.1.23, which catalyzes the phosphorylation of NAD to NADP utilising ATP and other nucleoside triphosphates as well as inorganic polyphosphate as a source of phosphorus. Also includes NADH kinases EC:2.7.1.86.¡€0€ª€0€ €CDD¡€ €E¢€0€0€ €‚¥pfam01514, YscJ_FliF, Secretory protein of YscJ/FliF family. This family includes proteins that are related to the YscJ lipoprotein, and the amino terminus of FliF, the flageller M-ring protein. The members of the YscJ family are thought to be involved in secretion of several proteins. The FliF protein ring is thought to be part of the export apparatus for flageller proteins, based on the similarity to YscJ proteins.¡€0€ª€0€ €CDD¡€ €±‰¢€0€0€ €çpfam01515, PTA_PTB, Phosphate acetyl/butaryl transferase. This family contains both phosphate acetyltransferase and phosphate butaryltransferase. These enzymes catalyze the transfer of an acetyl or butaryl group to orthophosphate.¡€0€ª€0€ €CDD¡€ €E¢€0€0€ €‡pfam01516, Orbi_VP6, Orbivirus helicase VP6. The VP6 protein a minor protein in the core of the virion is probably the viral helicase.¡€0€ª€0€ €CDD¡€ €E¢€0€0€ €‚6pfam01517, HDV_ag, Hepatitis delta virus delta antigen. The hepatitis delta virus (HDV) encodes a single protein, the hepatitis delta antigen (HDAg). The central region of this protein has been shown to bind RNA. Several interactions are also mediated by a coiled-coil region at the N terminus of the protein.¡€0€ª€0€ €CDD¡€ €E¢€0€0€ €ppfam01518, PolyG_pol, Sigma NS protein. This viral protein has a poly(C)-dependent poly(G) polymerase activity.¡€0€ª€0€ €CDD¡€ €Ó7¢€0€0€ €ùpfam01519, DUF16, Protein of unknown function DUF16. The function of this protein is unknown. It appears to only occur in Mycoplasma pneumoniae. The crystal structure revealed that this domain is composed of two separated homotrimeric coiled-coils.¡€0€ª€0€ €CDD¡€ €E¢€0€0€ €¨pfam01520, Amidase_3, N-acetylmuramoyl-L-alanine amidase. This enzyme domain cleaves the amide bond between N-acetylmuramoyl and L-amino acids in bacterial cell walls.¡€0€ª€0€ €CDD¡€ €±Š¢€0€0€ €æpfam01521, Fe-S_biosyn, Iron-sulphur cluster biosynthesis. This family is involved in iron-sulphur cluster biosynthesis. Its members include proteins that are involved in nitrogen fixation such as the HesB and HesB-like proteins.¡€0€ª€0€ €CDD¡€ €±‹¢€0€0€ €‚fpfam01522, Polysacc_deac_1, Polysaccharide deacetylase. This domain is found in polysaccharide deacetylase. This family of polysaccharide deacetylases includes NodB (nodulation protein B from Rhizobium) which is a chitooligosaccharide deacetylase. It also includes chitin deacetylase from yeast, and endoxylanases which hydrolyses glucosidic bonds in xylan.¡€0€ª€0€ €CDD¡€ €±Œ¢€0€0€ €‚7pfam01523, PmbA_TldD, Putative modulator of DNA gyrase. tldD and pmbA were found to suppress mutations in letD and inhibitor of DNA gyrase. Therefore it has been hypothesized that the TldD and PmbA proteins modulate the activity of DNA gyrase. It has also been suggested that PmbA may be involved in secretion.¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €‚pfam01524, Gemini_V2, Geminivirus V2 protein. Disruption of the V2 gene in Tomato yellow leaf curl virus (TYLCV) stopped its ability to systemically infect tomato plants, suggesting that the V2 gene product is required for successful infection of the host.¡€0€ª€0€ €CDD¡€ €E ¢€0€0€ €mpfam01525, Rota_NS26, Rotavirus NS26. Gene 11 product is a non-structural phosphoprotein designated as NS26.¡€0€ª€0€ €CDD¡€ €6'¢€0€0€ €‚­pfam01526, DDE_Tnp_Tn3, Tn3 transposase DDE domain. This family includes transposases of Tn3, Tn21, Tn1721, Tn2501, Tn3926 transposons from E-coli. The specific binding of the Tn3 transposase to DNA has been demonstrated. Sequence analysis has suggested that the invariant triad of Asp689, Asp765, Glu895 (numbering as in Tn3) may correspond to the D-D-35-E motif previously implicated in the catalysis of numerous transposases.¡€0€ª€0€ €CDD¡€ €±Ž¢€0€0€ €ðpfam01527, HTH_Tnp_1, Transposase. Transposase proteins are necessary for efficient DNA transposition. This family consists of various E. coli insertion elements and other bacterial transposases some of which are members of the IS3 family.¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €àpfam01528, Herpes_glycop, Herpesvirus glycoprotein M. The herpesvirus glycoprotein M (gM) is an integral membrane protein predicted to contain 8 transmembrane segments. Glycoprotein M is not essential for viral replication.¡€0€ª€0€ €CDD¡€ €E¢€0€0€ €ßpfam01529, zf-DHHC, DHHC palmitoyltransferase. This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes.¡€0€ª€0€ €CDD¡€ €±¢€0€0€ €Vpfam01530, zf-C2HC, Zinc finger, C2HC type. This is a DNA binding zinc finger domain.¡€0€ª€0€ €CDD¡€ €±‘¢€0€0€ €vpfam01531, Glyco_transf_11, Glycosyl transferase family 11. This family contains several fucosyl transferase enzymes.¡€0€ª€0€ €CDD¡€ €ÓA¢€0€0€ €ðpfam01532, Glyco_hydro_47, Glycosyl hydrolase family 47. Members of this family are alpha-mannosidases that catalyze the hydrolysis of the terminal 1,2-linked alpha-D-mannose residues in the oligo-mannose oligosaccharide Man(9)(GlcNAc)(2).¡€0€ª€0€ €CDD¡€ €±’¢€0€0€ €‚pfam01533, Tospo_nucleocap, Tospovirus nucleocapsid protein. The tospovirus genome consists of three linear ssRNA segments, denoted L, M and S complexed with the nucleocapsid protein. The S RNA encodes the nucleocapsid protein and another non-structural protein.¡€0€ª€0€ €CDD¡€ €E¢€0€0€ €‚^pfam01534, Frizzled, Frizzled/Smoothened family membrane region. This family contains the membrane spanning region of frizzled and smoothened receptors. This membrane region is predicted to contain seven transmembrane alpha helices. Proteins related to Drosophila frizzled are receptors for Wnt (mediating the beta-catenin signalling pathway), but also the planar cell polarity (PCP) pathway and the Wnt/calcium pathway. The predominantly alpha-helical Cys-rich ligand-binding region (CRD) of Frizzled is both necessary and sufficient for Wnt binding. The smoothened receptor mediates hedgehog signalling.¡€0€ª€0€ €CDD¡€ €±“¢€0€0€ €‚çpfam01535, PPR, PPR repeat. This repeat has no known function. It is about 35 amino acids long and found in up to 18 copies in some proteins. This family appears to be greatly expanded in plants. This repeat occurs in PET309 that may be involved in RNA stabilisation. This domain occurs in crp1 that is involved in RNA processing. This repeat is associated with a predicted plant protein that has a domain organisation similar to the human BRCA1 protein. The repeat has been called PPR.¡€0€ª€0€ €CDD¡€ €±”¢€0€0€ €‚®pfam01536, SAM_decarbox, Adenosylmethionine decarboxylase. This is a family of S-adenosylmethionine decarboxylase (SAMDC) proenzymes. In the biosynthesis of polyamines SAMDC produces decarboxylated S-adenosylmethionine, which serves as the aminopropyl moiety necessary for spermidine and spermine biosynthesis from putrescine. The Pfam alignment contains both the alpha and beta chains that are cleaved to form the active enzyme.¡€0€ª€0€ €CDD¡€ €±•¢€0€0€ €‚úpfam01537, Herpes_glycop_D, Herpesvirus glycoprotein D/GG/GX domain. This domain is found in several Herpes viruses glycoproteins. This is a family includes glycoprotein-D (gD or gIV) which is common to herpes simplex virus types 1 and 2, as well as equine herpes, bovine herpes and Marek's disease virus. Glycoprotein-D has been found on the viral envelope and the plasma membrane of infected cells. and gD immunisation can produce an immune response to bovine herpes virus (BHV-1). This response is stronger than that of the other major glycoproteins gB (gI) and gC (gIII) in BHV-1. Glycoprotein G (gG)is one of the seven external glycoproteins of HSV1 and HSV2. This family also contains the glycoprotein GX, (gX), initially identified in Pseudorabies virus.¡€0€ª€0€ €CDD¡€ €±–¢€0€0€ €‚spfam01538, HCV_NS2, Hepatitis C virus non-structural protein NS2. The viral genome is translated into a single polyprotein of about 3000 amino acids. Generation of the mature non-structural proteins relies on the activity of viral proteases. Cleavage at the NS2/NS3 junction is accomplished by a metal-dependent autoprotease encoded within NS2 and the N-terminus of NS3.¡€0€ª€0€ €CDD¡€ €v¢€0€0€ €Apfam01539, HCV_env, Hepatitis C virus envelope glycoprotein E1. ¡€0€ª€0€ €CDD¡€ €¯È¢€0€0€ €‚±pfam01540, Lipoprotein_7, Adhesin lipoprotein. This family consists of the p50 and variable adherence-associated antigen (Vaa) adhesins from Mycoplasma hominis. M. hominis is a mycoplasma associated with human urogenital diseases, pneumonia, and septic arthritis. An adhesin is a cell surface molecule that mediates adhesion to other cells or to the surrounding surface or substrate. The Vaa antigen is a 50-kDa surface lipoprotein that has four tandem repetitive DNA sequences encoding a periodic peptide structure, and is highly immunogenic in the human host. p50 is also a 50-kDa lipoprotein, having three repeats A,B and C, that may be a tetramer of 191-kDa in its native environment.¡€0€ª€0€ €CDD¡€ €¯É¢€0€0€ €‚Spfam01541, GIY-YIG, GIY-YIG catalytic domain. This domain called GIY-YIG is found in the amino terminal region of excinuclease abc subunit c (uvrC), bacteriophage T4 endonucleases segA, segB, segC, segD and segE; it is also found in putative endonucleases encoded by group I introns of fungi and phage. The structure of I-TevI a GIY-YIG endonuclease, reveals a novel alpha/beta-fold with a central three-stranded antiparallel beta-sheet flanked by three helices. The most conserved and putative catalytic residues are located on a shallow, concave surface and include a metal coordination site.¡€0€ª€0€ €CDD¡€ €E¢€0€0€ €‚{pfam01542, HCV_core, Hepatitis C virus core protein. The viral core protein forms the internal viral coat that encapsidates the genomic RNA and is enveloped in a host cell-derived lipid membrane. The core protein has been shown, by yeast two-hybrid assay to interact with cellular DEAD box helicases. The N terminus of the core protein is involved in transcriptional repression.¡€0€ª€0€ €CDD¡€ €E¢€0€0€ €:pfam01543, HCV_capsid, Hepatitis C virus capsid protein. ¡€0€ª€0€ €CDD¡€ €63¢€0€0€ €‚êpfam01544, CorA, CorA-like Mg2+ transporter protein. The CorA transport system is the primary Mg2+ influx system of Salmonella typhimurium and Escherichia coli. CorA is virtually ubiquitous in the Bacteria and Archaea. There are also eukaryotic relatives of this protein. The family includes the MRS2 protein from yeast that is thought to be an RNA splicing protein. However its membership of this family suggests that its effect on splicing is due to altered magnesium levels in the cell.¡€0€ª€0€ €CDD¡€ €±—¢€0€0€ €‚pfam01545, Cation_efflux, Cation efflux family. Members of this family are integral membrane proteins, that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are thought to be efflux pumps that remove these ions from cells.¡€0€ª€0€ €CDD¡€ €±˜¢€0€0€ €‚ pfam01546, Peptidase_M20, Peptidase family M20/M25/M40. This family includes a range of zinc metallopeptidases belonging to several families in the peptidase classification. Family M20 are Glutamate carboxypeptidases. Peptidase family M25 contains X-His dipeptidases.¡€0€ª€0€ €CDD¡€ €±™¢€0€0€ €¥pfam01547, SBP_bac_1, Bacterial extracellular solute-binding protein. This family also includes the bacterial extracellular solute-binding protein family POTD/POTF.¡€0€ª€0€ €CDD¡€ €±š¢€0€0€ €‚¢€0€0€ €‚„pfam01590, GAF, GAF domain. This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54.¡€0€ª€0€ €CDD¡€ €±²¢€0€0€ €‚$pfam01591, 6PF2K, 6-phosphofructo-2-kinase. This enzyme occurs as a bifunctional enzyme with fructose-2,6-bisphosphatase. The bifunctional enzyme catalyzes both the synthesis and degradation of fructose-2,6-bisphosphate, a potent regulator of glycolysis. This enzyme contains a P-loop motif.¡€0€ª€0€ €CDD¡€ €E@¢€0€0€ €‚µpfam01592, NifU_N, NifU-like N terminal domain. This domain is found in NifU in combination with pfam01106. This domain is found on isolated in several bacterial species. The nif genes are responsible for nitrogen fixation. However this domain is found in bacteria that do not fix nitrogen, so it may have a broader significance in the cell than nitrogen fixation. These proteins appear to be scaffold proteins for iron-sulfur clusters.¡€0€ª€0€ €CDD¡€ €±³¢€0€0€ €‚Õpfam01593, Amino_oxidase, Flavin containing amine oxidoreductase. This family consists of various amine oxidases, including maze polyamine oxidase (PAO) and various flavin containing monoamine oxidases (MAO). The aligned region includes the flavin binding site of these enzymes. The family also contains phytoene dehydrogenases and related enzymes. In vertebrates MAO plays an important role regulating the intracellular levels of amines via there oxidation; these include various neurotransmitters, neurotoxins and trace amines. In lower eukaryotes such as aspergillus and in bacteria the main role of amine oxidases is to provide a source of ammonium. PAOs in plants, bacteria and protozoa oxidase spermidine and spermine to an aminobutyral, diaminopropane and hydrogen peroxide and are involved in the catabolism of polyamines. Other members of this family include tryptophan 2-monooxygenase, putrescine oxidase, corticosteroid binding proteins and antibacterial glycoproteins.¡€0€ª€0€ €CDD¡€ €±´¢€0€0€ €‚ápfam01594, AI-2E_transport, AI-2E family transporter. This family includes four different proteins from E. coli alone. One of them, YdgG or TqsA, has been shown to mediate transport of the quorum-sensing signal autoinducer 2 (AI-2). It is not clear if TqsA enhances secretion of AI-2 or inhibits AI-2 uptake. By altering the intracellular concentration of AI-2, TqsA affects gene expression in biofilms and biofilm formation. TsqA belongs to the AI-2 exporter (AI-2E) superfamily.¡€0€ª€0€ €CDD¡€ €EC¢€0€0€ €‚„pfam01595, DUF21, Domain of unknown function DUF21. This transmembrane region has no known function. Many of the sequences in this family are annotated as hemolysins, however this is due to a similarity to Brachyspira hyodysenteriae hemolysin C that does not contain this domain. This domain is found in the N-terminus of the proteins adjacent to two intracellular CBS domains pfam00571.¡€0€ª€0€ €CDD¡€ €±µ¢€0€0€ €‚pfam01596, Methyltransf_3, O-methyltransferase. Members of this family are O-methyltransferases. The family includes catechol o-methyltransferase, caffeoyl-CoA O-methyltransferase and a family of bacterial O-methyltransferases that may be involved in antibiotic production.¡€0€ª€0€ €CDD¡€ €±¶¢€0€0€ €‚¥pfam01597, GCV_H, Glycine cleavage H-protein. This is a family of glycine cleavage H-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyzes the catabolism of glycine in eukaryotes. A lipoyl group is attached to a completely conserved lysine residue. The H protein shuttles the methylamine group of glycine from the P protein to the T protein.¡€0€ª€0€ €CDD¡€ €±·¢€0€0€ €‚×pfam01599, Ribosomal_S27, Ribosomal protein S27a. This family of ribosomal proteins consists mainly of the 40S ribosomal protein S27a which is synthesized as a C-terminal extension of ubiquitin (CEP). The S27a domain compromises the C-terminal half of the protein. The synthesis of ribosomal proteins as extensions of ubiquitin promotes their incorporation into nascent ribosomes by a transient metabolic stabilisation and is required for efficient ribosome biogenesis. The ribosomal extension protein S27a contains a basic region that is proposed to form a zinc finger; its fusion gene is proposed as a mechanism to maintain a fixed ratio between ubiquitin necessary for degrading proteins and ribosomes a source of proteins.¡€0€ª€0€ €CDD¡€ €±¸¢€0€0€ €‚ pfam01600, Corona_S1, Coronavirus S1 glycoprotein. The coronavirus spike glycoprotein forms the characteristic 'corona' after which the group is named. The Spike glycoprotein is translated as a large polypeptide that is subsequently cleaved to S1 and S2 pfam01601.¡€0€ª€0€ €CDD¡€ €EH¢€0€0€ €‚ pfam01601, Corona_S2, Coronavirus S2 glycoprotein. The coronavirus spike glycoprotein forms the characteristic 'corona' after which the group is named. The Spike glycoprotein is translated as a large polypeptide that is subsequently cleaved to S1 pfam01600 and S2.¡€0€ª€0€ €CDD¡€ €EI¢€0€0€ €‚7pfam01602, Adaptin_N, Adaptin N terminal region. This family consists of the N terminal region of various alpha, beta and gamma subunits of the AP-1, AP-2 and AP-3 adaptor protein complexes. The adaptor protein (AP) complexes are involved in the formation of clathrin-coated pits and vesicles. The N-terminal region of the various adaptor proteins (APs) is constant by comparison to the C-terminal which is variable within members of the AP-2 family; and it has been proposed that this constant region interacts with another uniform component of the coated vesicles.¡€0€ª€0€ €CDD¡€ €±¹¢€0€0€ €‚àpfam01603, B56, Protein phosphatase 2A regulatory B subunit (B56 family). Protein phosphatase 2A (PP2A) is a major intracellular protein phosphatase that regulates multiple aspects of cell growth and metabolism. The ability of this widely distributed heterotrimeric enzyme to act on a diverse array of substrates is largely controlled by the nature of its regulatory B subunit. There are multiple families of B subunits (See also pfam01240), this family is called the B56 family.¡€0€ª€0€ €CDD¡€ €±º¢€0€0€ €‚fpfam01606, Arteri_env, Arterivirus envelope protein. This family consists of viral envelope proteins from the arterivirus genus; this includes porcine reproductive and respiratory virus (PRRSV) envelope protein GP3 and lactate dehydrogenase elevating virus (LDV) structural glycoprotein. Arteriviruses consists of positive ssRNA and do not have a DNA stage.¡€0€ª€0€ €CDD¡€ €Ós¢€0€0€ €‚$pfam01607, CBM_14, Chitin binding Peritrophin-A domain. This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains.¡€0€ª€0€ €CDD¡€ €±»¢€0€0€ €‚âpfam01608, I_LWEQ, I/LWEQ domain. I/LWEQ domains bind to actin. It has been shown that the I/LWEQ domains from mouse talin and yeast Sla2p interact with F-actin. I/LWEQ domains can be placed into four major groups based on sequence similarity: (1) Metazoan talin; (2) Dictyostelium TalA/TalB and SLA110; (3) metazoan Hip1p and (4) yeast Sla2p. The domain has four conserved blocks, the name of the domain is derived from the initial conserved amino acid of each of the four blocks.¡€0€ª€0€ €CDD¡€ €±¼¢€0€0€ €‚"pfam01609, DDE_Tnp_1, Transposase DDE domain. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. This family contains transposases for IS4, IS421, IS5377, IS427, IS402, IS1355, IS5, which was original isolated in bacteriophage lambda.¡€0€ª€0€ €CDD¡€ €±½¢€0€0€ €¦pfam01610, DDE_Tnp_ISL3, Transposase. Transposase proteins are necessary for efficient DNA transposition. Contains transposases for IS204, IS1001, IS1096 and IS1165.¡€0€ª€0€ €CDD¡€ €±¾¢€0€0€ €‚pfam01611, Filo_glycop, Filovirus glycoprotein. This family includes an extracellular region from the envelope glycoprotein of Ebola and Marburg viruses. This region is also produced as a separate transcript that gives rise to a non-structural, secreted glycoprotein, which is produced in large amounts and has an unknown function. Processing of this protein may be involved in viral pathogenicity.¡€0€ª€0€ €CDD¡€ €EP¢€0€0€ €‚†pfam01612, DNA_pol_A_exo1, 3'-5' exonuclease. This domain is responsible for the 3'-5' exonuclease proofreading activity of E. coli DNA polymerase I (polI) and other enzymes, it catalyzes the hydrolysis of unpaired or mismatched nucleotides. This domain consists of the amino-terminal half of the Klenow fragment in E. coli polI it is also found in the Werner syndrome helicase (WRN), focus forming activity 1 protein (FFA-1) and ribonuclease D (RNase D). Werner syndrome is a human genetic disorder causing premature aging; the WRN protein has helicase activity in the 3'-5' direction. The FFA-1 protein is required for formation of a replication foci and also has helicase activity; it is a homolog of the WRN protein. RNase D is a 3'-5' exonuclease involved in tRNA processing. Also found in this family is the autoantigen PM/Scl thought to be involved in polymyositis-scleroderma overlap syndrome.¡€0€ª€0€ €CDD¡€ €±¿¢€0€0€ €‚ãpfam01613, Flavin_Reduct, Flavin reductase like domain. This is a flavin reductase family consisting of enzymes known to be flavin reductases as well as various oxidoreductase and monooxygenase components. VlmR is a flavin reductase that functions in a two-component enzyme system to provide isobutylamine N-hydroxylase with reduced flavin and may be involved in the synthesis of valanimycin. SnaC is a flavin reductase that provides reduced flavin for the oxidation of pristinamycin IIB to pristinamycin IIA as catalyzed by SnaA, SnaB heterodimer. This flavin reductase region characterized by enzymes of the family is present in the C-terminus of potential FMN proteins from Synechocystis sp. suggesting it is a flavin reductase domain.¡€0€ª€0€ €CDD¡€ €±À¢€0€0€ €‚§pfam01614, IclR, Bacterial transcriptional regulator. This family of bacterial transcriptional regulators includes the glycerol operon regulatory protein and acetate operon repressor both of which are members of the iclR family. These proteins have a Helix-Turn-Helix motif at the N-terminus. However this family covers the C-terminal region that may bind to the regulatory substrate (unpublished observation, Bateman A.).¡€0€ª€0€ €CDD¡€ €±Á¢€0€0€ €°pfam01616, Orbi_NS3, Orbivirus NS3. The function of this Orbivirus non structural protein is uncertain. However it may play a role on release of the virus from infected cells.¡€0€ª€0€ €CDD¡€ €ET¢€0€0€ €pfam01617, Surface_Ag_2, Surface antigen. This family includes a number of bacterial surface antigens expressed on the surface of pathogens.¡€0€ª€0€ €CDD¡€ €±Â¢€0€0€ €‚3pfam01618, MotA_ExbB, MotA/TolQ/ExbB proton channel family. This family groups together integral membrane proteins that appear to be involved translocation of proteins across a membrane. These proteins are probably proton channels. MotA is an essential component of the flageller motor that uses a proton gradient to generate rotational motion in the flageller. ExbB is part of the TonB-dependent transduction complex. The TonB complex uses the proton gradient across the inner bacterial membrane to transport large molecules across the outer bacterial membrane.¡€0€ª€0€ €CDD¡€ €±Ã¢€0€0€ €+pfam01619, Pro_dh, Proline dehydrogenase. ¡€0€ª€0€ €CDD¡€ €±Ä¢€0€0€ €Ípfam01620, Pollen_allerg_2, Ribonuclease (pollen allergen). This family contains grass pollen proteins of group V. Phleum pratense pollen allergen Phl p 5b has been shown to possess ribonuclease activity.¡€0€ª€0€ €CDD¡€ €EX¢€0€0€ €«pfam01621, Fusion_gly_K, Cell fusion glycoprotein K. This protein is probably an integral membrane bound glycoprotein that is involved in viral fusion with the host cell.¡€0€ª€0€ €CDD¡€ €EY¢€0€0€ €‚ðpfam01623, Carla_C4, Carlavirus putative nucleic acid binding protein. This family of carlavirus nucleic acid binding proteins includes a motif for a potential C-4 type zinc finger this has four highly conserved cysteine residues and is a conserved feature of the carlaviruses 3' terminal ORF. These proteins may function as viral transcriptional regulators. The carlavirus family includes garlic latent virus and potato virus S and M, these viruses are positive strand, ssRNA with no DNA stage.¡€0€ª€0€ €CDD¡€ €Ó¢€0€0€ €‚µpfam01624, MutS_I, MutS domain I. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam05188, pfam05192 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with globular domain I, which is involved in DNA binding, in Thermus aquaticus MutS as characterized in.¡€0€ª€0€ €CDD¡€ €±Å¢€0€0€ €špfam01625, PMSR, Peptide methionine sulfoxide reductase. This enzyme repairs damaged proteins. Methionine sulfoxide in proteins is reduced to methionine.¡€0€ª€0€ €CDD¡€ €±Æ¢€0€0€ €‚©pfam01627, Hpt, Hpt domain. The histidine-containing phosphotransfer (HPt) domain is a novel protein module with an active histidine residue that mediates phosphotransfer reactions in the two-component signaling systems. A multistep phosphorelay involving the HPt domain has been suggested for these signaling pathways. The crystal structure of the HPt domain of the anaerobic sensor kinase ArcB has been determined. The domain consists of six alpha helices containing a four-helix bundle-folding. The pattern of sequence similarity of the HPt domains of ArcB and components in other signaling systems can be interpreted in light of the three-dimensional structure and supports the conclusion that the HPt domains have a common structural motif both in prokaryotes and eukaryotes. In S. cerevisiae ypd1p this domain has been shown to contain a binding surface for Ssk1p (response regulator receiver domain containing protein pfam00072).¡€0€ª€0€ €CDD¡€ €±Ç¢€0€0€ €ïpfam01628, HrcA, HrcA protein C terminal domain. HrcA is found to negatively regulate the transcription of heat shock genes. HrcA contains an amino terminal helix-turn-helix domain, however this corresponds to the carboxy terminal domain.¡€0€ª€0€ €CDD¡€ €±È¢€0€0€ €Üpfam01629, DUF22, Domain of unknown function DUF22. This domain is found in 1 to 3 copies in archaebacterial proteins. The function of the domain is unknown. This family appears to be expanded in Archaeoglobus fulgidus.¡€0€ª€0€ €CDD¡€ €±É¢€0€0€ €+pfam01630, Glyco_hydro_56, Hyaluronidase. ¡€0€ª€0€ €CDD¡€ €±Ê¢€0€0€ €3pfam01632, Ribosomal_L35p, Ribosomal protein L35. ¡€0€ª€0€ €CDD¡€ €±Ë¢€0€0€ €‚Ãpfam01633, Choline_kinase, Choline/ethanolamine kinase. Choline kinase catalyzes the committed step in the synthesis of phosphatidylcholine by the CDP-choline pathway. This alignment covers the protein kinase portion of the protein. The divergence of this family makes it very difficult to create a model that specifically predicts choline/ethanolamine kinases only. However if pfam01633 is also present then it is definitely a member of this family.¡€0€ª€0€ €CDD¡€ €±Ì¢€0€0€ €1pfam01634, HisG, ATP phosphoribosyltransferase. ¡€0€ª€0€ €CDD¡€ €±Í¢€0€0€ €‚rpfam01635, Corona_M, Coronavirus M matrix/glycoprotein. This family consists of various coronavirus matrix proteins which are transmembrane glycoproteins. The M protein or E1 glycoprotein is The coronavirus M protein is implicated in virus assembly. The E1 viral membrane protein is required for formation of the viral envelope and is transported via the Golgi complex.¡€0€ª€0€ €CDD¡€ €Ec¢€0€0€ €‚"pfam01636, APH, Phosphotransferase enzyme family. This family consists of bacterial antibiotic resistance proteins, which confer resistance to various aminoglycosides they include: aminoglycoside 3'-phosphotransferase or kanamycin kinase / neomycin-kanamycin phosphotransferase and streptomycin 3''-kinase or streptomycin 3''-phosphotransferase. The aminoglycoside phosphotransferases inactivate aminoglycoside antibiotics via phosphorylation. This family also includes homoserine kinase. This family is related to fructosamine kinase pfam03881.¡€0€ª€0€ €CDD¡€ €Ed¢€0€0€ €Õpfam01637, ATPase_2, ATPase domain predominantly from Archaea. This family contain a conserved P-loop motif that is involved in binding ATP. There are eukaryote members as well as archaeal members in this family.¡€0€ª€0€ €CDD¡€ €±Î¢€0€0€ €Èpfam01638, HxlR, HxlR-like helix-turn-helix. HxlR, a member of this family, is a DNA-binding protein that acts as a positive regulator of the formaldehyde-inducible hxlAB operon in Bacillus subtilis.¡€0€ª€0€ €CDD¡€ €Ee¢€0€0€ €‚/pfam01639, v110, Viral family 110. This family of viral proteins is known as the 110 family. The function of members of this family is unknown. The family contains a central cysteine rich region with eight conserved cysteines. Some members of the family contains two copies of the cysteine rich region.¡€0€ª€0€ €CDD¡€ €Ef¢€0€0€ €‚„pfam01640, Peptidase_C10, Peptidase C10 family. This family represents just the active peptide part of these proteins. Residues 1-120 are not part of the model as they form the pro-peptide, which before cleavage blocks the active site from the substrate. The catalytic residues of histidine and cysteine are brought close together at the active site by the folding of the active peptide.¡€0€ª€0€ €CDD¡€ €±Ï¢€0€0€ €‚pfam01641, SelR, SelR domain. Methionine sulfoxide reduction is an important process, by which cells regulate biological processes and cope with oxidative stress. MsrA, a protein involved in the reduction of methionine sulfoxides in proteins, has been known for four decades and has been extensively characterized with respect to structure and function. However, recent studies revealed that MsrA is only specific for methionine-S-sulfoxides. Because oxidised methionines occur in a mixture of R and S isomers in vivo, it was unclear how stereo-specific MsrA could be responsible for the reduction of all protein methionine sulfoxides. It appears that a second methionine sulfoxide reductase, SelR, evolved that is specific for methionine-R-sulfoxides, the activity that is different but complementary to that of MsrA. Thus, these proteins, working together, could reduce both stereoisomers of methionine sulfoxide. This domain is found both in SelR proteins and fused with the peptide methionine sulfoxide reductase enzymatic domain pfam01625. The domain has two conserved cysteine and histidines. The domain binds both selenium and zinc. The final cysteine is found to be replaced by the rare amino acid selenocysteine in some members of the family. This family has methionine-R-sulfoxide reductase activity.¡€0€ª€0€ €CDD¡€ €±Ð¢€0€0€ €‚npfam01642, MM_CoA_mutase, Methylmalonyl-CoA mutase. The enzyme methylmalonyl-CoA mutase is a member of a class of enzymes that uses coenzyme B12 (adenosylcobalamin) as a cofactor. The enzyme induces the formation of an adenosyl radical from the cofactor. This radical then initiates a free-radical rearrangement of its substrate, succinyl-CoA, to methylmalonyl-CoA.¡€0€ª€0€ €CDD¡€ €±Ñ¢€0€0€ €Üpfam01643, Acyl-ACP_TE, Acyl-ACP thioesterase. This family consists of various acyl-acyl carrier protein (ACP) thioesterases (TE) these terminate fatty acyl group extension via hydrolysing an acyl group on a fatty acid.¡€0€ª€0€ €CDD¡€ €±Ò¢€0€0€ €‚Cpfam01644, Chitin_synth_1, Chitin synthase. This region is found commonly in chitin synthases classes I, II and III. Chitin a linear homopolymer of GlcNAc residues, it is an important component of the cell wall of fungi and is synthesized on the cytoplasmic surface of the cell membrane by membrane bound chitin synthases.¡€0€ª€0€ €CDD¡€ €±Ó¢€0€0€ €‚†pfam01645, Glu_synthase, Conserved region in glutamate synthase. This family represents a region of the glutamate synthase protein. This region is expressed as a separate subunit in the glutamate synthase alpha subunit from archaebacteria, or part of a large multidomain enzyme in other organisms. The aligned region of these proteins contains a putative FMN binding site and Fe-S cluster.¡€0€ª€0€ €CDD¡€ €El¢€0€0€ €‚ºpfam01646, Herpes_UL24, Herpes virus proteins UL24 and UL76. This family consists of various herpes virus proteins; the gene 20 product, U49 protein, UL24 and UL76 proteins and BXRF1. The UL24 gene (product of the 24th ORF) is not essential for virus replication, and mutants with lesions in UL24 show a reduced ability to replicate in tissue culture and have reduced thymidine kinase activity, as the UL24 gene overlaps with thymidine kinase. The family of proteins is involved in viral production, latency, and reactivation. Protein UL76 presents as globular aggresomes in the nuclei of transiently transfected cells. Bioinformatic analyses predict that UL76 has a propensity for aggregation and targets cellular proteins implicated in protein folding and ubiquitin-proteasome systems. UL76 interacts with the VWA domain of S5a, the 26S proteasome non-ATPase regulatory subunit 4 (or PSMD4, or Rpn10), forming a complex in the late phase of infection.¡€0€ª€0€ €CDD¡€ €±Ô¢€0€0€ €‚bpfam01648, ACPS, 4'-phosphopantetheinyl transferase superfamily. Members of this family transfers the 4'-phosphopantetheine (4'-PP) moiety from coenzyme A (CoA) to the invariant serine of pfam00550. This post-translational modification renders holo-ACP capable of acyl group activation via thioesterification of the cysteamine thiol of 4'-PP. This superfamily consists of two subtypes: The ACPS type and the Sfp type. The structure of the Sfp type is known, which shows the active site accommodates a magnesium ion. The most highly conserved regions of the alignment are involved in binding the magnesium ion.¡€0€ª€0€ €CDD¡€ €±Õ¢€0€0€ €kpfam01649, Ribosomal_S20p, Ribosomal protein S20. Bacterial ribosomal protein S20 interacts with 16S rRNA.¡€0€ª€0€ €CDD¡€ €±Ö¢€0€0€ €‚&pfam01650, Peptidase_C13, Peptidase C13 family. Members of this family are asparaginyl peptidases. The blood fluke parasite Schistosoma mansoni has at least five Clan CA cysteine peptidases in its digestive tract including cathepsins B (2 isoforms), C, F and L. All have been recombinantly expressed as active enzymes, albeit in various stages of activation. In addition, a Clan CD peptidase, termed asparaginyl endopeptidase or 'legumain' has been identified. This has formerly been characterized as a 'haemoglobinase', but this term is probably incorrect. Two cDNAs have been described for Schistosoma mansoni legumain; one encodes an active enzyme whereas the active site cysteine residue encoded by the second cDNA is substituted by an asparagine residue. Both forms have been recombinantly expressed.¡€0€ª€0€ €CDD¡€ €±×¢€0€0€ €3pfam01652, IF4E, Eukaryotic initiation factor 4E. ¡€0€ª€0€ €CDD¡€ €±Ø¢€0€0€ €‚êpfam01653, DNA_ligase_aden, NAD-dependent DNA ligase adenylation domain. DNA ligases catalyze the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilising either ATP or NAD(+) as a cofactor. This domain is the catalytic adenylation domain. The NAD+ group is covalently attached to this domain at the lysine in the KXDG motif of this domain. This enzyme- adenylate intermediate is an important feature of the proposed catalytic mechanism.¡€0€ª€0€ €CDD¡€ €±Ù¢€0€0€ €‚¾pfam01654, Cyt_bd_oxida_I, Cytochrome bd terminal oxidase subunit I. This family are the alternative oxidases found in many bacteria which oxidise ubiquinol and reduce oxygen as part of the electron transport chain. This family is the subunit I of the oxidase E. coli has two copies of the oxidase, bo and bd', both of which are represented here In some nitrogen fixing bacteria, e.g. Klebsiella pneumoniae this oxidase is responsible for removing oxygen in microaerobic conditions, making the oxidase required for nitrogen fixation. This subunit binds a single b-haem, through ligands at His186 and Met393 (using SW:P11026 numbering). In addition His19 is a ligand for the haem b found in subunit II.¡€0€ª€0€ €CDD¡€ €±Ú¢€0€0€ €pfam01655, Ribosomal_L32e, Ribosomal protein L32. This family includes ribosomal protein L32 from eukaryotes and archaebacteria.¡€0€ª€0€ €CDD¡€ €±Û¢€0€0€ €‚ˆpfam01656, CbiA, CobQ/CobB/MinD/ParA nucleotide binding domain. This family consists of various cobyrinic acid a,c-diamide synthases. These include CbiA and CbiP from S.typhimurium, and CobQ from R. capsulatus. These amidases catalyze amidations to various side chains of hydrogenobyrinic acid or cobyrinic acid a,c-diamide in the biosynthesis of cobalamin (vitamin B12) from uroporphyrinogen III. Vitamin B12 is an important cofactor and an essential nutrient for many plants and animals and is primarily produced by bacteria. The family also contains dethiobiotin synthetases as well as the plasmid partitioning proteins of the MinD/ParA family.¡€0€ª€0€ €CDD¡€ €±Ü¢€0€0€ €‚Kpfam01657, Stress-antifung, Salt stress response/antifungal. This domain is often found in association with the kinase domains pfam00069 or pfam07714. In many proteins it is duplicated. It contains six conserved cysteines which are involved in disulphide bridges. It has a role in salt stress response and has antifungal activity.¡€0€ª€0€ €CDD¡€ €±Ý¢€0€0€ €‚Dpfam01658, Inos-1-P_synth, Myo-inositol-1-phosphate synthase. This is a family of myo-inositol-1-phosphate synthases. Inositol-1-phosphate catalyzes the conversion of glucose-6- phosphate to inositol-1-phosphate, which is then dephosphorylated to inositol. Inositol phosphates play an important role in signal transduction.¡€0€ª€0€ €CDD¡€ €±Þ¢€0€0€ €‚«pfam01659, Luteo_Vpg, Luteovirus putative VPg genome linked protein. This family consists of several putative genome linked proteins. The genomic RNA of luteoviruses are linked to virally encoded genome proteins (VPg). Open reading frame 4 is thought to encode the VPg in Soybean dwarf luteovirus. Luteoviruses have isometric capsids that contain a positive stand ssRNA genome, they have no DNA stage during their replication.¡€0€ª€0€ €CDD¡€ €Ex¢€0€0€ €‚:pfam01660, Vmethyltransf, Viral methyltransferase. This RNA methyltransferase domain is found in a wide range of ssRNA viruses, including Hordei-, Tobra-, Tobamo-, Bromo-, Clostero- and Caliciviruses. This methyltransferase is involved in mRNA capping. Capping of mRNA enhances its stability. This usually occurs in the nucleus. Therefore, many viruses that replicate in the cytoplasm encode their own. This is a specific guanine-7-methyltransferase domain involved in viral mRNA cap0 synthesis. Specificity for guanine 7 position is shown by NMR in and in vivo role in cap synthesis. Based on secondary structure prediction, the basic fold is believed to be similar to the common AdoMet-dependent methyltransferase fold. A curious feature of this methyltransferase domain is that it together with flanking sequences seems to have guanylyltransferase activity coupled to the methyltransferase activity. The domain is found throughout the so-called Alphavirus superfamily, (including alphaviruses and several other groups). It forms the defining, unique feature of this superfamily.¡€0€ª€0€ €CDD¡€ €±ß¢€0€0€ €‚§pfam01661, Macro, Macro domain. This domain is an ADP-ribose binding module. It is found in a number of otherwise unrelated proteins. It is found at the C-terminus of the macro-H2A histone protein. This domain is found in the non-structural proteins of several types of ssRNA viruses such as NSP3 from alphaviruses. This domain is also found on its own in a family of proteins from bacteria, archaebacteria and eukaryotes.¡€0€ª€0€ €CDD¡€ €±à¢€0€0€ €‚ppfam01663, Phosphodiest, Type I phosphodiesterase / nucleotide pyrophosphatase. This family consists of phosphodiesterases, including human plasma-cell membrane glycoprotein PC-1 / alkaline phosphodiesterase i / nucleotide pyrophosphatase (nppase). These enzymes catalyze the cleavage of phosphodiester and phosphosulfate bonds in NAD, deoxynucleotides and nucleotide sugars. Also in this family is ATX an autotaxin, tumor cell motility-stimulating protein which exhibits type I phosphodiesterases activity. The alignment encompasses the active site. Also present with in this family is 60-kDa Ca2+-ATPase form F. odoratum.¡€0€ª€0€ €CDD¡€ €±á¢€0€0€ €‚Ppfam01664, Reo_sigma1, Reovirus viral attachment protein sigma 1. This family consists of the reovirus sigma 1 hemagglutinin, cell attachment protein. This glycoprotein is a minor capsid protein and also determines the serotype-specific humoral immune response. Sigma 1 consist of a fibrous tail and a globular head. The head has important roles in the cell attachment function of sigma 1 and determinant of the type-specific humoral immune response. Reovirus is part of the orthoreovirus group of retroviruses with, a dsRNA genome. Also present in this family is bacteriophage SF6 Lysozyme.¡€0€ª€0€ €CDD¡€ €±â¢€0€0€ €‚Rpfam01665, Rota_NSP3, Rotavirus non-structural protein NSP3. This family consist of rotaviral non-structural RNA binding protein 34 (NS34 or NSP3). The NSP3 protein has been shown to bind viral RNA. The NSP3 protein consists of 3 conserved functional domains; a basic region which binds ssRNA, a region containing heptapeptide repeats mediating oligomerization and a leucine zipper motif. NSP3 may play a central role in replication and assembly of genomic RNA structures. Rotaviruses have a dsRNA genome and are a major cause cause of acute gastroenteritis in the young of many species. The rotavirus non-structural protein NSP3 is a sequence-specific RNA binding protein that binds the nonpolyadenylated 3' end of the rotavirus mRNAs. NSP3 also interacts with the translation initiation factor eIF4GI and competes with the poly(A) binding protein.¡€0€ª€0€ €CDD¡€ €E}¢€0€0€ €Àpfam01666, DX, DX module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 6 conserved cysteines that probably form three disulphide bridges.¡€0€ª€0€ €CDD¡€ €±ã¢€0€0€ €3pfam01667, Ribosomal_S27e, Ribosomal protein S27. ¡€0€ª€0€ €CDD¡€ €E¢€0€0€ € pfam01668, SmpB, SmpB protein. ¡€0€ª€0€ €CDD¡€ €±ä¢€0€0€ €.pfam01669, Myelin_MBP, Myelin basic protein. ¡€0€ª€0€ €CDD¡€ €±å¢€0€0€ €:pfam01670, Glyco_hydro_12, Glycosyl hydrolase family 12. ¡€0€ª€0€ €CDD¡€ €±æ¢€0€0€ €‚ƒpfam01671, ASFV_360, African swine fever virus multigene family 360 protein. The multigene family 360 protein are found within the African swine fever virus (ASF) genome which consist of dsDNA and has similar structural features to the poxyviruses. The biological function of this family is not known. Although African swine fever virus Protein MGF 360-9L is a major structural protein.¡€0€ª€0€ €CDD¡€ €Eƒ¢€0€0€ €åpfam01672, Plasmid_parti, Putative plasmid partition protein. This family consists of conserved hypothetical proteins from Borrelia burgdorferi the lyme disease spirochaete, some of which are putative plasmid partition proteins.¡€0€ª€0€ €CDD¡€ €±ç¢€0€0€ €‚bpfam01673, Herpes_env, Herpesvirus putative major envelope glycoprotein. This family consists of probable major envelope glycoproteins from members of the herpesviridae including herpes simplex virus, human cytomegalovirus and varicella-zoster virus. Members of the herpesviridae have a dsDNA genome and do not have a RNA stage during there replication.¡€0€ª€0€ €CDD¡€ €E…¢€0€0€ €‚Dpfam01674, Lipase_2, Lipase (class 2). This family consists of hypothetical C. elegans proteins and lipases. Lipases or triacylglycerol acylhydrolases hydrolyse ester bonds in triacylglycerol giving diacylglycerol, monoacylglycerol, glycerol and free fatty acids. Lipase EstA is a extracellular lipase from B. subtilis 168.¡€0€ª€0€ €CDD¡€ €Ó¥¢€0€0€ €‚6pfam01676, Metalloenzyme, Metalloenzyme superfamily. This family includes phosphopentomutase and 2,3-bisphosphoglycerate-independent phosphoglycerate mutase. This family is also related to pfam00245. The alignment contains the most conserved residues that are probably involved in metal binding and catalysis.¡€0€ª€0€ €CDD¡€ €±è¢€0€0€ €‚Ñpfam01677, Herpes_UL7, Herpesvirus UL7 like. This family consists of various functionally undefined proteins from the herpesviridae and UL7 from bovine herpes virus. UL7 is not essential for virus replication in cell culture, and is found localized in the cytoplasm of infected cells accumulated around the nucleus but could not be detected in purified virions. Members of the herpesviridae have a dsDNA genome and do not have a RNA stage during there replication.¡€0€ª€0€ €CDD¡€ €E‡¢€0€0€ € pfam01678, DAP_epimerase, Diaminopimelate epimerase. Diaminopimelate epimerase contains two domains of the same alpha/beta fold, both contained in this family.¡€0€ª€0€ €CDD¡€ €±é¢€0€0€ €‚’pfam01679, Pmp3, Proteolipid membrane potential modulator. Pmp3 is an evolutionarily conserved proteolipid in the plasma membrane which, in S. pombe, is transcriptionally regulated by the Spc1 stress MAPK (mitogen-activated protein kinases) pathway. It functions to modulate the membrane potential, particularly to resist high cellular cation concentration. In eukaryotic organisms, stress-activated mitogen-activated protein kinases play crucial roles in transmitting environmental signals that will regulate gene expression for allowing the cell to adapt to cellular stress. Pmp3-like proteins are highly conserved in bacteria, yeast, nematode and plants.¡€0€ª€0€ €CDD¡€ €±ê¢€0€0€ €¾pfam01680, SOR_SNZ, SOR/SNZ family. Members of this family are enzymes involved in a new pathway of pyridoxine/pyridoxal 5-phosphate biosynthesis. This family was formerly known as UPF0019.¡€0€ª€0€ €CDD¡€ €±ë¢€0€0€ €‚dpfam01681, C6, C6 domain. This domain of unknown function is found in a hypothetical C. elegans protein. It is presumed to be an extracellular domain. The C6 domain contains six conserved cysteine residues in most copies of the domain. However some copies of the domain are missing cysteine residues 1 and 3 suggesting that these form a disulphide bridge.¡€0€ª€0€ €CDD¡€ €±ì¢€0€0€ €‚2pfam01682, DB, DB module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 12 conserved cysteines that probably form six disulphide bridges. This domain is found associated with ig pfam00047 and fn3 pfam00041 domains, as well as in some lipases pfam00657.¡€0€ª€0€ €CDD¡€ €±í¢€0€0€ €þpfam01683, EB, EB module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges. This domain is found associated with kunitz domains pfam00014.¡€0€ª€0€ €CDD¡€ €±î¢€0€0€ €‚³pfam01684, ET, ET module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 8-10 conserved cysteines that probably form 4-5 disulphide bridges. By inspection of the conservation of cysteines it looks like cysteines 1,2,3,4,9 and 10 are always present and that sometimes the pair 5 and 8 or the pair 6 and 7 are missing. This suggests that cysteines 5/8 and 6/7 make disulphide bridges.¡€0€ª€0€ €CDD¡€ €±ï¢€0€0€ €‚pfam01686, Adeno_Penton_B, Adenovirus penton base protein. This family consists of various adenovirus penton base proteins, from both the Mastadenoviradae having mammalian hosts and the Aviadenoviradae having avian hosts. The penton base is a major structural protein forming part of the penton which consists of a base and a fibre, the pentons hold a morphologically prominent position at the vertex capsomer in the adenovirus particle. In mammalian adenovirus there is only one tail on each base where as in avian adenovirus there are two.¡€0€ª€0€ €CDD¡€ €E¢€0€0€ €‚œpfam01687, Flavokinase, Riboflavin kinase. This family represents the C-terminal region of the bifunctional riboflavin biosynthesis protein known as RibC in Bacillus subtilis. The RibC protein from Bacillus subtilis has both flavokinase and flavin adenine dinucleotide synthetase (FAD-synthetase) activities. RibC plays an essential role in the flavin metabolism. This domain is thought to have kinase activity.¡€0€ª€0€ €CDD¡€ €±ð¢€0€0€ €‚Ãpfam01688, Herpes_gI, Alphaherpesvirus glycoprotein I. This family consists of glycoprotein I form various members of the alphaherpesvirinae these include herpesvirus, varicella-zoster virus and pseudorabies virus. Glycoprotein I (gI) is important during natural infection, mutants lacking gI produce smaller lesions at the site of infection and show reduced neuronal spread. gI forms a heterodimeric complex with gE; this complex displays Fc receptor activity (binds to the Fc region of immunoglobulin). Glycoproteins are also important in the production of virus-neutralising antibodies and cell mediated immunity. The alphaherpesvirinae have a dsDNA gnome and have no RNA stage during viral replication.¡€0€ª€0€ €CDD¡€ €E‘¢€0€0€ €‚žpfam01690, PLRV_ORF5, Potato leaf roll virus readthrough protein. This family consists mainly of the potato leaf roll virus readthrough protein. This is generated via a readthrough of open reading frame 3 a coat protein allowing transcription of open reading frame 5 to give an extended coat protein with a large c-terminal addition or read through domain. The readthrough protein is thought to play a role in the circulative aphid transmission of potato leaf roll virus. Also in the family is open reading frame 6 from beet western yellows virus and potato leaf roll virus both luteovirus and an unknown protein from cucurbit aphid-borne yellows virus a closterovirus.¡€0€ª€0€ €CDD¡€ €E’¢€0€0€ €‚›pfam01691, Adeno_E1B_19K, Adenovirus E1B 19K protein / small t-antigen. This family consists of adenovirus E1B 19K protein or small t-antigen. The E1B 19K protein inhibits E1A induced apoptosis and hence prolongs the viability of the host cell. It can also inhibit apoptosis mediated by tumor necrosis factor alpha and Fas antigen. E1B 19K blocks apoptosis by interacting with and inhibiting the p53-inducible and death- promoting Bax protein. The E1B region of adenovirus encodes two proteins E1B 19K the small t-antigen as found in this family and E1B 55K the large t-antigen which is not found in this family; both of these proteins inhibit E1A induced apoptosis.¡€0€ª€0€ €CDD¡€ €E“¢€0€0€ €‚3pfam01692, Paramyxo_C, Paramyxovirus non-structural protein C. This family consist of the C proteins (C', C, Y1, Y2) found in Paramyxovirinae; human parainfluenza, and sendai virus. The C proteins effect viral RNA synthesis having both a positive and negative effect during the course of infection. Paramyxovirus have a negative strand ssRNA genome of 15.3kb form which six mRNAs are transcribed, five of these are monocistronic. The P/C mRNA is polycistronic and has two overlapping open reading frames P and C, C encodes the nested C proteins C', C, Y1 and Y2.¡€0€ª€0€ €CDD¡€ €E”¢€0€0€ €‚Øpfam01693, Cauli_VI, Caulimovirus viroplasmin. This family consists of various caulimovirus viroplasmin proteins. The viroplasmin protein is encoded by gene VI and is the main component of viral inclusion bodies or viroplasms. Inclusions are the site of viral assembly, DNA synthesis and accumulation. Two domains exist within gene VI corresponding approximately to the 5' third and middle third of gene VI, these influence systemic infection in a light-dependent manner.¡€0€ª€0€ €CDD¡€ €±ñ¢€0€0€ €‚%pfam01694, Rhomboid, Rhomboid family. This family contains integral membrane proteins that are related to Drosophila rhomboid protein. Members of this family are found in bacteria and eukaryotes. Rhomboid promotes the cleavage of the membrane-anchored TGF-alpha-like growth factor Spitz, allowing it to activate the Drosophila EGF receptor. Analysis has shown that Rhomboid-1 is an intramembrane serine protease (EC:3.4.21.105). Parasite-encoded rhomboid enzymes are also important for invasion of host cells by Toxoplasma and the malaria parasite.¡€0€ª€0€ €CDD¡€ €±ò¢€0€0€ €ýpfam01695, IstB_IS21, IstB-like ATP binding protein. This protein contains an ATP/GTP binding P-loop motif. It is found associated with IS21 family insertion sequences. The function of this protein is unknown, but it may perform a transposase function.¡€0€ª€0€ €CDD¡€ €±ó¢€0€0€ €‚pfam01696, Adeno_E1B_55K, Adenovirus EB1 55K protein / large t-antigen. This family consists of adenovirus E1B 55K protein or large t-antigen. E1B 55K binds p53 the tumor suppressor protein converting it from a transcriptional activator which responds to damaged DNA in to an unregulated repressor of genes with a p53 binding site. This protects the virus against p53 induced host antiviral responses and prevents apoptosis as induced by the adenovirus E1A protein. The E1B region of adenovirus encodes two proteins E1B 55K the large t-antigen as found in this family and E1B 19K pfam01691 the small t-antigen which is not found in this family; both of these proteins inhibit E1A induced apoptosis. This family shows distant similarities to the pectate lyase superfamily.¡€0€ª€0€ €CDD¡€ €±ô¢€0€0€ €‚zpfam01697, Glyco_transf_92, Glycosyltransferase family 92. Members of this family act as galactosyltransferases, belonging to glycosyltransferase family 92. The aligned region contains several conserved cysteine residues and several charged residues that may be catalytic residues. This is supported by the inclusion of this family in the GT-A glycosyl transferase superfamily.¡€0€ª€0€ €CDD¡€ €±õ¢€0€0€ €‚)pfam01698, FLO_LFY, Floricaula / Leafy protein. This family consists of various plant development proteins which are homologs of floricaula (FLO) and Leafy (LFY) proteins which are floral meristem identity proteins. Mutations in the sequences of these proteins affect flower and leaf development.¡€0€ª€0€ €CDD¡€ €±ö¢€0€0€ €‚üpfam01699, Na_Ca_ex, Sodium/calcium exchanger protein. This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3.¡€0€ª€0€ €CDD¡€ €±÷¢€0€0€ €‚Ipfam01700, Orbi_VP3, Orbivirus VP3 (T2) protein. The orbivirus VP3 protein is part of the virus core and makes a 'subcore' shell made up of 120 copies of the 100K protein. VP3 particles can also bind RNA and are fundamental in the early stages of viral core formation. Also found in the family is structural core protein VP2 from broadhaven virus which is similar to VP3 in bluetongue virus. Orbivirus are part of the larger reoviridae which have a dsRNA genome of 10-12 linear segments; orbivirus found in this family include bluetongue virus and epizootic hemorrhagic disease virus.¡€0€ª€0€ €CDD¡€ €Eœ¢€0€0€ €‚pfam01701, PSI_PsaJ, Photosystem I reaction centre subunit IX / PsaJ. This family consists of the photosystem I reaction centre subunit IX or PsaJ from various organisms including Synechocystis sp. (strain pcc 6803), Pinus thunbergii (green pine) and Zea mays (maize). PsaJ is a small 4.4kDa, chloroplastal encoded, hydrophobic subunit of the photosystem I reaction complex its function is not yet fully understood. PsaJ can be cross-linked to PsaF and has a single predicted transmembrane domain it has a proposed role in maintaining PsaF in the correct orientation to allow for fast electron transfer from soluble donor proteins to P700+.¡€0€ª€0€ €CDD¡€ €±ø¢€0€0€ €‚­pfam01702, TGT, Queuine tRNA-ribosyltransferase. This is a family of queuine tRNA-ribosyltransferases EC:2.4.2.29, also known as tRNA-guanine transglycosylase and guanine insertion enzyme. Queuine tRNA-ribosyltransferase modifies tRNAs for asparagine, aspartic acid, histidine and tyrosine with queuine. It catalyzes the exchange of guanine-34 at the wobble position with 7-aminomethyl-7-deazaguanine, and the addition of a cyclopentenediol moiety to 7-aminomethyl-7-deazaguanine-34 tRNA; giving a hypermodified base queuine in the wobble position. The aligned region contains a zinc binding motif C-x-C-x2-C-x29-H, and important tRNA and 7-aminomethyl-7deazaguanine binding residues.¡€0€ª€0€ €CDD¡€ €±ù¢€0€0€ €‚êpfam01704, UDPGP, UTP--glucose-1-phosphate uridylyltransferase. This family consists of UTP--glucose-1-phosphate uridylyltransferases, EC:2.7.7.9. Also known as UDP-glucose pyrophosphorylase (UDPGP) and Glucose-1-phosphate uridylyltransferase. UTP--glucose-1-phosphate uridylyltransferase catalyzes the interconversion of MgUTP + glucose-1-phosphate and UDP-glucose + MgPPi. UDP-glucose is an important intermediate in mammalian carbohydrate interconversion involved in various metabolic roles depending on tissue type. In Dictyostelium (slime mold) mutants in this enzyme abort the development cycle. Also within the family is UDP-N-acetylglucosamine or AGX1 and two hypothetical proteins from Borrelia burgdorferi the lyme disease spirochaete.¡€0€ª€0€ €CDD¡€ €±ú¢€0€0€ €Àpfam01705, CX, CX module. This domain has no known function. It is found in several C. elegans proteins. The domain contains 6 conserved cysteines that probably form three disulphide bridges.¡€0€ª€0€ €CDD¡€ €±û¢€0€0€ €·pfam01706, FliG_C, FliG C-terminal domain. FliG is a component of the flageller rotor, present in about 25 copies per flagellum. This domain functions specifically in motor rotation.¡€0€ª€0€ €CDD¡€ €±ü¢€0€0€ €/pfam01707, Peptidase_C9, Peptidase family C9. ¡€0€ª€0€ €CDD¡€ €E¢¢€0€0€ €špfam01708, Gemini_mov, Geminivirus putative movement protein. This family consists of putative movement proteins from Maize streak and wheat dwarf virus.¡€0€ª€0€ €CDD¡€ €±ý¢€0€0€ €‚pfam01709, Transcrip_reg, Transcriptional regulator. This is a family of transcriptional regulators. In mammals, it activates the transcription of mitochondrially-encoded COX1. In bacteria, it negatively regulates the quorum-sensing response regulator by binding to its promoter region.¡€0€ª€0€ €CDD¡€ €±þ¢€0€0€ €‚7pfam01710, HTH_Tnp_IS630, Transposase. Transposase proteins are necessary for efficient DNA transposition. This family includes insertion sequences from Synechocystis PCC 6803 three of which are characterized as homologous to bacterial IS5- and IS4- and to several members of the IS630-Tc1-mariner superfamily.¡€0€ª€0€ €CDD¡€ €E¥¢€0€0€ €‚Žpfam01712, dNK, Deoxynucleoside kinase. This family consists of various deoxynucleoside kinases cytidine EC:2.7.1.74, guanosine EC:2.7.1.113, adenosine EC:2.7.1.76 and thymidine kinase EC:2.7.1.21 (which also phosphorylates deoxyuridine and deoxycytosine.) These enzymes catalyze the production of deoxynucleotide 5'-monophosphate from a deoxynucleoside. Using ATP and yielding ADP in the process.¡€0€ª€0€ €CDD¡€ €E¦¢€0€0€ €‚¢€0€0€ €‚ƒpfam01814, Hemerythrin, Hemerythrin HHE cation binding domain. Iteration of the HHE family found it to be related to Hemerythrin. It also demonstrated that what has been described as a single domain in fact consists of two cation binding domains. Members of this family occur all across nature and are involved in a variety of processes. For instance, in Nereis diversicolor hemerythrin binds Cadmium so as to protect the organism from toxicity. However Hemerythrin is classically described as Oxygen-binding through two attached Fe2+ ions. And the bacterial NorA is a regulator of response to NO, which suggests yet another set-up for its metal ligands. In Staphylococcus aureus the iron-sulfur cluster repair protein ScdA has been noted to be important when the organism switches to living in environments with low oxygen concentrations; perhaps this protein acts as an oxygen store or scavenger.¡€0€ª€0€ €CDD¡€ €²?¢€0€0€ €pfam01815, Rop, Rop protein. ¡€0€ª€0€ €CDD¡€ €²@¢€0€0€ €Àpfam01816, LRV, Leucine rich repeat variant. The function of this repeat is unknown. It has an unusual structure of two helices. One is an alpha helix, the other is the much rarer 3-10 helix.¡€0€ª€0€ €CDD¡€ €Ô¢€0€0€ €‚ pfam01817, CM_2, Chorismate mutase type II. Chorismate mutase EC:5.4.99.5 catalyzes the conversion of chorismate to prephenate in the pathway of tyrosine and phenylalanine biosynthesis. This enzyme is negatively regulated by tyrosine, tryptophan and phenylalanine.¡€0€ª€0€ €CDD¡€ €²A¢€0€0€ €‚Ypfam01818, Translat_reg, Bacteriophage translational regulator. The translational regulator protein regA is encoded by the T4 bacteriophage and binds to a region of messenger RNA (mRNA) that includes the initiator codon. RegA is unusual in that it represses the translation of about 35 early T4 mRNAs but does not affect nearly 200 other mRNAs.¡€0€ª€0€ €CDD¡€ €F¢€0€0€ €‚wpfam01819, Levi_coat, Levivirus coat protein. The Levivirus coat protein forms the bacteriophage coat that encapsidates the viral RNA. 180 copies of this protein form the virion shell. The MS2 bacteriophage coat protein controls two distinct processes: sequence-specific RNA encapsidation and repression of replicase translation-by binding to an RNA stem-loop structure of 19 nucleotides containing the initiation codon of the replicase gene. The binding of a coat protein dimer to this hairpin shuts off synthesis of the viral replicase, switching the viral replication cycle to virion assembly rather than continued replication.¡€0€ª€0€ €CDD¡€ €F¢€0€0€ €‚zpfam01820, Dala_Dala_lig_N, D-ala D-ala ligase N-terminus. This family represents the N-terminal region of the D-alanine--D-alanine ligase enzyme EC:6.3.2.4 which is thought to be involved in substrate binding. D-Alanine is one of the central molecules of the cross-linking step of peptidoglycan assembly. There are three enzymes involved in the D-alanine branch of peptidoglycan biosynthesis: the pyridoxal phosphate-dependent D-alanine racemase (Alr), the ATP-dependent D-alanine:D-alanine ligase (Ddl), and the ATP-dependent D-alanine:D-alanine-adding enzyme (MurF). This domain is structurally related to the PreATP-grasp domain.¡€0€ª€0€ €CDD¡€ €²B¢€0€0€ €‚(pfam01821, ANATO, Anaphylotoxin-like domain. C3a, C4a and C5a anaphylatoxins are protein fragments generated enzymatically in serum during activation of complement molecules C3, C4, and C5. They induce smooth muscle contraction. These fragments are homologous to a three-fold repeat in fibulins.¡€0€ª€0€ €CDD¡€ €²C¢€0€0€ €Qpfam01822, WSC, WSC domain. This domain may be involved in carbohydrate binding.¡€0€ª€0€ €CDD¡€ €²D¢€0€0€ €‚ppfam01823, MACPF, MAC/Perforin domain. The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerization of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerizes into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggests that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold.¡€0€ª€0€ €CDD¡€ €²E¢€0€0€ €\pfam01824, MatK_N, MatK/TrnK amino terminal region. The function of this region is unknown.¡€0€ª€0€ €CDD¡€ €F¢€0€0€ €‚¼pfam01825, GPS, GPCR proteolysis site, GPS, motif. The GPS motif is found in GPCRs, and is the site for auto-proteolysis, so is thus named, GPS. The GPS motif is a conserved sequence of ~40 amino acids containing canonical cysteine and tryptophan residues, and is the most highly conserved part of the domain. In most, if not all, cell-adhesion GPCRs these undergo autoproteolysis in the GPS between a conserved aliphatic residue (usually a leucine) and a threonine, serine, or cysteine residue. In higher eukaryotes this motif is found embedded in the C-terminal beta-stranded part of a GAIN domain - GPCR-Autoproteolysis INducing (GAIN). The GAIN-GPS domain adopts a fold in which the GPS motif, at the C-terminus, forms five beta-strands that are tightly integrated into the overall GAIN domain. The GPS motif, evolutionarily conserved from tetrahymena to mammals, is the only extracellular domain shared by all human cell-adhesion GPCRs and PKD proteins, and is the locus of multiple human disease mutations. The GAIN-GPS domain is both necessary and sufficient functionally for autoproteolysis, suggesting an autoproteolytic mechanism whereby the overall GAIN domain fine-tunes the chemical environment in the GPS to catalyze peptide bond hydrolysis. In the cell-adhesion GPCRs and PKD proteins, the GPS motif is always located at the end of their long N-terminal extracellular regions, immediately before the first transmembrane helix of the respective protein.¡€0€ª€0€ €CDD¡€ €²F¢€0€0€ €‚Jpfam01826, TIL, Trypsin Inhibitor like cysteine rich domain. This family contains trypsin inhibitors as well as a domain found in many extracellular proteins. The domain typically contains ten cysteine residues that form five disulphide bonds. The cysteine residues that form the disulphide bonds are 1-7, 2-6, 3-5, 4-10 and 8-9.¡€0€ª€0€ €CDD¡€ €²G¢€0€0€ €‚pfam01827, FTH, FTH domain. This presumed domain is likely to be a protein-protein interaction module. It is found in many proteins from C. elegans. The domain is found associated with the F-box pfam00646. This domain is named FTH after FOG-2 homology domain.¡€0€ª€0€ €CDD¡€ €²H¢€0€0€ €/pfam01828, Peptidase_A4, Peptidase A4 family. ¡€0€ª€0€ €CDD¡€ €²I¢€0€0€ €/pfam01829, Peptidase_A6, Peptidase A6 family. ¡€0€ª€0€ €CDD¡€ €²J¢€0€0€ €/pfam01830, Peptidase_C7, Peptidase C7 family. ¡€0€ª€0€ €CDD¡€ €F ¢€0€0€ €1pfam01831, Peptidase_C16, Peptidase C16 family. ¡€0€ª€0€ €CDD¡€ €F ¢€0€0€ €‚pfam01832, Glucosaminidase, Mannosyl-glycoprotein endo-beta-N-acetylglucosaminidase. This family includes Mannosyl-glycoprotein endo-beta-N-acetylglucosaminidase EC:3.2.1.96. As well as the flageller protein J that has been shown to hydrolyse peptidoglycan.¡€0€ª€0€ €CDD¡€ €²K¢€0€0€ €‚Wpfam01833, TIG, IPT/TIG domain. This family consists of a domain that has an immunoglobulin like fold. These domains are found in cell surface receptors such as Met and Ron as well as in intracellular transcription factors where it is involved in DNA binding. CAUTION: This family does not currently recognize a significant number of members.¡€0€ª€0€ €CDD¡€ €²L¢€0€0€ €.pfam01834, XRCC1_N, XRCC1 N terminal domain. ¡€0€ª€0€ €CDD¡€ €²M¢€0€0€ €_pfam01835, A2M_N, MG2 domain. This is the MG2 (macroglobulin) domain of alpha-2-macroglobulin.¡€0€ª€0€ €CDD¡€ €²N¢€0€0€ €‚Ûpfam01837, HcyBio, Homocysteine biosynthesis enzyme, sulfur-incorporation. This presumed domain is about is about 360 residues long. The function of this domain is unknown. It is found in some proteins that have two C-terminal CBS pfam00571 domains. There are also proteins that contain two inserted Fe4S domains near the C-terminal end of the domain. The Methanothermobacter thermautotrophicus gene MTH_855 product has been misannotated as an inosine monophosphate dehydrogenase based on the similarity to the CBS domains. Based on genetic analyses in the methanogen Methanosarcina acetivorans, this family is a key component of the metabolic network for sulfide assimilation and trafficking in methanogens. It is essential to a novel, O-acetylhomoserine sulfhydrylase-independent pathway for homocysteine biosynthesis, and may catalyze sulfur incorporation into the side chain of an as yet unidentified amino acid precursor. The DUF39-CBS and DUF39-ferredoxin architectures repeatedly occur together in the genomes of methanogenic Archaea, suggesting they may be of diverged function. This is consistent with a phylogenetic reconstruction of the DUF39 family, which clearly distinguishes the CBS-associated and ferredoxin-associated DUF39s.¡€0€ª€0€ €CDD¡€ €²O¢€0€0€ €‚êpfam01839, FG-GAP, FG-GAP repeat. This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats.¡€0€ª€0€ €CDD¡€ €²P¢€0€0€ €‚_pfam01840, TCL1_MTCP1, TCL1/MTCP1 family. Two related oncogenes, TCL-1 and MTCP-1, are overexpressed in T cell prolymphocytic leukaemias as a result of chromosomal rearrangements that involve the translocation of one T cell receptor gene to either chromosome 14q32 or Xq28. This family contains two repeated motifs that form a single globular domain.¡€0€ª€0€ €CDD¡€ €²Q¢€0€0€ €‚Ôpfam01841, Transglut_core, Transglutaminase-like superfamily. This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologs of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease.¡€0€ª€0€ €CDD¡€ €²R¢€0€0€ €‚!pfam01842, ACT, ACT domain. This family of domains generally have a regulatory role. ACT domains are linked to a wide range of metabolic enzymes that are regulated by amino acid concentration. Pairs of ACT domains bind specifically to a particular amino acid leading to regulation of the linked enzyme. The ACT domain is found in: D-3-phosphoglycerate dehydrogenase EC:1.1.1.95, which is inhibited by serine. Aspartokinase EC:2.7.2.4, which is regulated by lysine. Acetolactate synthase small regulatory subunit, which is inhibited by valine. Phenylalanine-4-hydroxylase EC:1.14.16.1, which is regulated by phenylalanine. Prephenate dehydrogenase EC:4.2.1.51. formyltetrahydrofolate deformylase EC:3.5.1.10, which is activated by methionine and inhibited by glycine. GTP pyrophosphokinase EC:2.7.6.5.¡€0€ª€0€ €CDD¡€ €²S¢€0€0€ €Bpfam01843, DIL, DIL domain. The DIL domain has no known function.¡€0€ª€0€ €CDD¡€ €²T¢€0€0€ €#pfam01844, HNH, HNH endonuclease. ¡€0€ª€0€ €CDD¡€ €F¢€0€0€ € pfam01845, CcdB, CcdB protein. ¡€0€ª€0€ €CDD¡€ €²U¢€0€0€ €‚ pfam01846, FF, FF domain. This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions.¡€0€ª€0€ €CDD¡€ €²V¢€0€0€ €Ðpfam01847, VHL, VHL beta domain. VHL forms a ternary complex with the elonginB and elonginC proteins. This complex binds Cul2, which then is involved in regulation of vascular endothelial growth factor mRNA.¡€0€ª€0€ €CDD¡€ €²W¢€0€0€ €%pfam01848, HOK_GEF, Hok/gef family. ¡€0€ª€0€ €CDD¡€ €²X¢€0€0€ €pfam01849, NAC, NAC domain. ¡€0€ª€0€ €CDD¡€ €²Y¢€0€0€ €pfam01850, PIN, PIN domain. ¡€0€ª€0€ €CDD¡€ €F¢€0€0€ €1pfam01851, PC_rep, Proteasome/cyclosome repeat. ¡€0€ª€0€ €CDD¡€ €F¢€0€0€ €!pfam01852, START, START domain. ¡€0€ª€0€ €CDD¡€ €F ¢€0€0€ €}pfam01853, MOZ_SAS, MOZ/SAS family. This region of these proteins has been suggested to be homologous to acetyltransferases.¡€0€ª€0€ €CDD¡€ €²Z¢€0€0€ €‚Øpfam01855, POR_N, Pyruvate flavodoxin/ferredoxin oxidoreductase, thiamine diP-bdg. This family includes the N terminal structural domain of the pyruvate ferredoxin oxidoreductase. This domain binds thiamine diphosphate, and along with domains II and IV, is involved in inter subunit contacts. The family also includes pyruvate flavodoxin oxidoreductase as encoded by the nifJ gene in cyanobacterium which is required for growth on molecular nitrogen when iron is limited.¡€0€ª€0€ €CDD¡€ €²[¢€0€0€ €ãpfam01856, HP_OMP, Helicobacter outer membrane protein. This family seems confined to Helicobacter. It is predicted to be an outer membrane protein based on its pattern of alternating hydrophobic amino acids similar to porins.¡€0€ª€0€ €CDD¡€ €F#¢€0€0€ €‚Ipfam01857, RB_B, Retinoblastoma-associated protein B domain. The crystal structure of the Rb pocket bound to a nine-residue E7 peptide containing the LxCxE motif, shared by other Rb-binding viral and cellular proteins, shows that the LxCxE peptide binds a highly conserved groove on the B domain. The B domain has a cyclin fold.¡€0€ª€0€ €CDD¡€ €²\¢€0€0€ €kpfam01858, RB_A, Retinoblastoma-associated protein A domain. This domain has the cyclin fold as predicted.¡€0€ª€0€ €CDD¡€ €²]¢€0€0€ €¦pfam01861, DUF43, Protein of unknown function DUF43. This family includes archaebacterial proteins of unknown function. All the members are 350-400 amino acids long.¡€0€ª€0€ €CDD¡€ €F&¢€0€0€ €‚Þpfam01862, PvlArgDC, Pyruvoyl-dependent arginine decarboxylase (PvlArgDC). Methanococcus jannaschii contains homologs of most genes required for spermidine polyamine biosynthesis. Yet genomes from neither this organism nor any other euryarchaeon have orthologues of the pyridoxal 5'-phosphate- dependent ornithine or arginine decarboxylase genes, required to produce putrescine. Instead,these organisms have a new class of arginine decarboxylase (PvlArgDC) formed by the self-cleavage of a proenzyme into a 5-kDa subunit and a 12-kDa subunit that contains a reactive pyruvoyl group. Although this extremely thermostable enzyme has no significant sequence similarity to previously characterized proteins, conserved active site residues are similar to those of the pyruvoyl-dependent histidine decarboxylase enzyme, and its subunits form a similar (alpha-beta)(3) complex. homologs of PvlArgDC are found in several bacterial genomes, including those of Chlamydia spp., which have no agmatine ureohydrolase enzyme to convert agmatine (decarboxylated arginine) into putrescine. In these intracellular pathogens, PvlArgDC may function analogously to pyruvoyl-dependent histidine decarboxylase; the cells are proposed to import arginine and export agmatine, increasing the pH and affecting the host cell's metabolism. Phylogenetic analysis of Pvl- ArgDC proteins suggests that this gene has been recruited from the euryarchaeal polyamine biosynthetic pathway to function as a degradative enzyme in bacteria.¡€0€ª€0€ €CDD¡€ €²^¢€0€0€ €‚”pfam01863, DUF45, Protein of unknown function DUF45. This protein has no known function. Members are found in some archaebacteria, as well as Helicobacter pylori. The proteins are 190-240 amino acids long, with the C terminus being the most conserved region, containing three conserved histidines. This motif is similar to that found in Zinc proteases, suggesting that this family may also be proteases.¡€0€ª€0€ €CDD¡€ €²_¢€0€0€ €‚spfam01864, CarS-like, CDP-archaeol synthase. CDP-archaeol synthase functions in the archaeal lipid biosynthetic pathway. It catalyzes the transfer of the nucleotide to its specific archaeal lipid substrate, leading to the formation of a CDP-activated precursor (CDP-archaeol) to which polar head groups are attached. Bacterial members of this family are uncharacterized.¡€0€ª€0€ €CDD¡€ €F)¢€0€0€ €‚rpfam01865, PhoU_div, Protein of unknown function DUF47. This family includes prokaryotic proteins of unknown function, as well as a protein annotated as the pit accessory protein from Sinorhizobium meliloti. However, the function of this protein is also unknown (Pit stands for Phosphate transport). It is probably distantly related to pfam01895 (personal obs:Yeats C).¡€0€ª€0€ €CDD¡€ €F*¢€0€0€ €‚Ípfam01866, Diphthamide_syn, Putative diphthamide synthesis protein. Diphthamide_syn, diphthamide synthase, catalyzes the last amidation step of diphthamide biosynthesis using ammonium and ATP. Human DPH1 is a candidate tumor suppressor gene. DPH2 from yeast, which confers resistance to diphtheria toxin has been found to be involved in diphthamide synthesis. Diphtheria toxin inhibits eukaryotic protein synthesis by ADP-ribosylating diphthamide, a post-translationally modified histidine residue present in EF2. Diphthamide synthase is evolutionarily conserved in eukaryotes. Diphthamide is a post-translationally modified histidine residue found on archaeal and eukaryotic translation elongation factor 2 (eEF-2).¡€0€ª€0€ €CDD¡€ €²`¢€0€0€ €‚Apfam01867, Cas_Cas1, CRISPR associated protein Cas1. Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. This family of proteins corresponds to Cas1, a CRISPR-associated protein. Cas1 may be involved in linking DNA segments to CRISPR.¡€0€ª€0€ €CDD¡€ €²a¢€0€0€ €‚pfam01868, UPF0086, Domain of unknown function UPF0086. This family consists of several archaeal and eukaryotic proteins. The archaeal proteins are found to be expressed within ribosomal operons and several of the sequences are described as ribonuclease P protein subunit p29 proteins.¡€0€ª€0€ €CDD¡€ €²b¢€0€0€ €‚Äpfam01869, BcrAD_BadFG, BadF/BadG/BcrA/BcrD ATPase family. This family includes the BadF and BadG proteins that are two subunits of Benzoyl-CoA reductase, that may be involved in ATP hydrolysis. The family also includes an activase subunit from the enzyme 2-hydroxyglutaryl-CoA dehydratase. Aquifex aeolicus aq_278 contains two copies of this region suggesting that the family may structurally dimerize. This family appears to be related to pfam00370.¡€0€ª€0€ €CDD¡€ €F.¢€0€0€ €‚.pfam01870, Hjc, Archaeal holliday junction resolvase (hjc). This family of archaebacterial proteins are holliday junction resolvases (hjc gene). The Holliday junction is an essential intermediate of homologous recombination. This protein is the archaeal equivalent of RuvC but is not sequence similar.¡€0€ª€0€ €CDD¡€ €²c¢€0€0€ €‚hpfam01871, AMMECR1, AMMECR1. This family consists of several AMMECR1 as well as several uncharacterized proteins. The contiguous gene deletion syndrome AMME is characterized by Alport syndrome, midface hypoplasia, mental retardation and elliptocytosis and is caused by a deletion in Xq22.3, comprising several genes including COL4A5, FACL4 and AMMECR1. This family contains sequences from several eukaryotic species as well as archaebacteria and it has been suggested that the AMMECR1 protein may have a basic cellular function, potentially in either the transcription, replication, repair or translation machinery.¡€0€ª€0€ €CDD¡€ €²d¢€0€0€ €‚Ppfam01872, RibD_C, RibD C-terminal domain. The function of this domain is not known, but it is thought to be involved in riboflavin biosynthesis. This domain is found in the C terminus of RibD/RibG, in combination with pfam00383, as well as in isolation in some archaebacterial proteins. This family appears to be related to pfam00186.¡€0€ª€0€ €CDD¡€ €²e¢€0€0€ €‚ pfam01873, eIF-5_eIF-2B, Domain found in IF2B/IF5. This family includes the N terminus of eIF-5, and the C terminus of eIF-2 beta. This region corresponds to the whole of the archaebacterial eIF-2 beta homolog. The region contains a putative zinc binding C4 finger.¡€0€ª€0€ €CDD¡€ €²f¢€0€0€ €ìpfam01874, CitG, ATP:dephospho-CoA triphosphoribosyl transferase. The citG gene is found in a gene cluster with citrate lyase subunits. The function of the CitG protein was elucidated as ATP:dephospho-CoA triphosphoribosyl transferase.¡€0€ª€0€ €CDD¡€ €²g¢€0€0€ €‚tpfam01875, Memo, Memo-like protein. This family contains members from all branches of life. The molecular function of this protein is unknown, but Memo (mediator of ErbB2-driven cell motility) a human protein is included in this family. It has been suggested that Memo controls cell migration by relaying extracellular chemotactic signals to the microtubule cytoskeleton.¡€0€ª€0€ €CDD¡€ €F4¢€0€0€ €~pfam01876, RNase_P_p30, RNase P subunit p30. This protein is part of the RNase P complex that is involved in tRNA maturation.¡€0€ª€0€ €CDD¡€ €²h¢€0€0€ €‚³pfam01877, RNA_binding, RNA binding. PH1010 is composed of five alpha-helices (1-5) and eight beta-strands (1-8) with the following topology: beta-1, alpha-1, beta-2, beta-3, alpha-2, alpha-3, beta-4, beta-5, alpha-4, beta-6, alpha-5, beta-7, beta-8. The first six beta-strands (1-6) form a slightly twisted antiparallel beta-sheet and face five alpha-helices on one side. The last two beta-strands form an antiparallel beta-sheet in the C-terminus. PH1010 forms a characteristic homodimer structure in the crystal. Dimerisation of the molecule is crucial for function. The structure resembles that of some ribosomal proteins such as the 50S ribosomal protein L5. Although the structure resembles that of the RRM-type RNA-binding domain of the ribosomal L5 protein, the residues involved in RNA-binding in the L5 protein are not conserved in this family. Despite this, these proteins bind to double-stranded RNA in a non-sequence specific manner.¡€0€ª€0€ €CDD¡€ €²i¢€0€0€ €Õpfam01878, EVE, EVE domain. This domain was formerly known as DUF55. Crystal structures have shown that this domain is part of the PUA superfamily. This domain has been named EVE and is thought to be RNA-binding.¡€0€ª€0€ €CDD¡€ €²j¢€0€0€ €‚‡pfam01880, Desulfoferrodox, Desulfoferrodoxin. Desulfoferrodoxins contains two types of iron: an Fe-S4 site very similar to that found in desulforedoxin from Desulfovibrio gigas and an octahedral coordinated high-spin ferrous site most probably with nitrogen/oxygen-containing ligands. Due to this rather unusual combination of active centers, this novel protein is named desulfoferrodoxin.¡€0€ª€0€ €CDD¡€ €²k¢€0€0€ €‚pfam01881, Cas_Cas6, CRISPR associated protein Cas6. This group of families is one of several protein families that are always found associated with prokaryotic CRISPRs, themselves a family of clustered regularly interspaced short palindromic repeats, DNA repeats found in nearly half of all bacterial and archaeal genomes. These DNA repeat regions have a remarkably regular structure: unique sequences of constant size, called spacers, sit between each pair of repeats. It has been shown that the CRISPRs are virus-derived sequences acquired by the host to enable them to resist viral infection. The Cas proteins from the host use the CRISPRs to mediate an antiviral response. After transcription of the CRISPR, a complex of Cas proteins termed Cascade cleaves a CRISPR RNA precursor in each repeat and retains the cleavage products containing the virus-derived sequence. Assisted by the helicase Cas3, these mature CRISPR RNAs then serve as small guide RNAs that enable Cascade to interfere with virus proliferation. Cas5 contains an endonuclease motif, whose inactivation leads to loss of resistance, even in the presence of phage-derived spacers.¡€0€ª€0€ €CDD¡€ €²l¢€0€0€ €‚pfam01882, DUF58, Protein of unknown function DUF58. This family of prokaryotic proteins have no known function. Caldicellulosiruptor saccharolyticus PepX, a protein of unknown function in the family, has been misannotated as alpha-dextrin 6-glucanohydrolase.¡€0€ª€0€ €CDD¡€ €²m¢€0€0€ €‚ûpfam01883, DUF59, Domain of unknown function DUF59. This family has an alpha/beta topology, with 13 conserved hydrophobic residues at its core and a putative active site containing a highly conserved cysteine. Members of this family are involved in a range of physiological functions. The family includes PaaJ (PhaH) from Pseudomonas putida. PaaJ forms a complex with PaaG (PhaF), PaaI (PhaG) and PaaK (PhaI), which hydroxylates phenylacetic acid to 2-hydroxyphenylacetic acid. It also includes PaaD from Escherichia coli, a member of a multicomponent oxygenase involved in phenylacetyl-CoA hydroxylation. It is found near the N-terminus of the chloroplast scaffold protein HCF101, involved in the assembly of [4Fe-4S] clusters and their transfer to apoproteins.¡€0€ª€0€ €CDD¡€ €²n¢€0€0€ €pfam01884, PcrB, PcrB family. This family contains proteins that are related to PcrB. The function of these proteins is unknown.¡€0€ª€0€ €CDD¡€ €F<¢€0€0€ €‚çpfam01885, PTS_2-RNA, RNA 2'-phosphotransferase, Tpt1 / KptA family. Tpt1 catalyzes the last step of tRNA splicing in yeast. It transfers the splice junction 2'-phosphate from ligated tRNA to NAD, to produce ADP-ribose 1"-2"-cyclic phosphate. This is presumed to be followed by a transesterification step to release the RNA. The first step of this reaction is similar to that catalyzed by some bacterial toxins. E. coli KptA and mouse Tpt1 are likely to use the same reaction mechanism.¡€0€ª€0€ €CDD¡€ €²o¢€0€0€ €}pfam01886, DUF61, Protein of unknown function DUF61. Protein found in Archaebacteria. These proteins have no known function.¡€0€ª€0€ €CDD¡€ €²p¢€0€0€ €‚öpfam01887, SAM_adeno_trans, S-adenosyl-l-methionine hydroxide adenosyltransferase. This is a family of proteins, previously known as DUF62, found in archaebacteria and bacteria. The structure of proteins in this family is similar to that of a bacterial fluorinating enzyme. S-adenosyl-l-methionine hydroxide adenosyltransferases utilizes a rigorously conserved amino acid side chain triad (Asp-Arg-His) which may have a role in activating water to hydroxide ion. This family used to be known as DUF62.¡€0€ª€0€ €CDD¡€ €²q¢€0€0€ €‚pfam01888, CbiD, CbiD. CbiD is essential for cobalamin biosynthesis in both S. typhimurium and B. megaterium, no functional role has been ascribed to the protein. The CbiD protein has a putative S-AdoMet binding site. It is possible that CbiD might have the same role as CobF in undertaking the C-1 methylation and deacylation reactions required during the ring contraction process.¡€0€ª€0€ €CDD¡€ €²r¢€0€0€ €¨pfam01889, DUF63, Membrane protein of unknown function DUF63. Proteins found in Archaebacteria of unknown function. These proteins are probably transmembrane proteins.¡€0€ª€0€ €CDD¡€ €²s¢€0€0€ €‚Äpfam01890, CbiG_C, Cobalamin synthesis G C-terminus. Members of this family are involved in cobalamin synthesis. The protein encoded by Synechocystis sp.cbiH represents a fusion between cbiH and cbiG. As other multi-functional proteins involved in cobalamin biosynthesis catalyze adjacent steps in the pathway, including CysG, CobL (CbiET), CobIJ and CobA-HemD, it is therefore possible that CbiG catalyzes a reaction step adjacent to CbiH. In the anaerobic pathway such a step could be the formation of a gamma lactone, which is thought to help to mediate the anaerobic ring contraction process. Within the cobalamin synthesis pathway CbiG catalyzes the both the opening of the lactone ring and the extrusion of the two-carbon fragment of cobalt-precorrin-5A from C-20 and its associated methyl group (deacylation) to give cobalt-precorrin-5B. This family is the C-terminal region, and the mid- and N-termival parts are conserved independently in other families.¡€0€ª€0€ €CDD¡€ €²t¢€0€0€ €‚wpfam01891, CbiM, Cobalt uptake substrate-specific transmembrane region. This family of proteins forms part of the cobalt-transport complex in prokaryotes, CbiMNQO. CbiMNQO and NikMNQO are the most widespread groups of microbial transporters for cobalt and nickel ions and are unusual uptake systems as they consist of eg two transmembrane components (CbiM and CbiQ), a small membrane-bound component (CbiN) and an ATP-binding protein (CbiO) but no extracytoplasmic solute-binding protein. Similar components constitute the nickel transporters with some variability in the small membrane-bound component, either NikN or NikL, which are not similar to CbiN at the sequence level. CbiM is the substrate-specific component of the complex and is a seven-transmembrane protein. The CbiMNQO and NikMNQO systems form part of the coenzyme B12 biosynthesis pathway. The NikM protein is pfam10670.¡€0€ª€0€ €CDD¡€ €²u¢€0€0€ €ppfam01893, UPF0058, Uncharacterized protein family UPF0058. This archaebacterial protein has no known function.¡€0€ª€0€ €CDD¡€ €²v¢€0€0€ €¾pfam01894, UPF0047, Uncharacterized protein family UPF0047. This family has no known function. The alignment contains a conserved aspartate and histidine that may be functionally important.¡€0€ª€0€ €CDD¡€ €²w¢€0€0€ €‚^pfam01895, PhoU, PhoU domain. This family contains phosphate regulatory proteins including PhoU. PhoU proteins are known to play a role in the regulation of phosphate uptake. The PhoU domain is composed of a three helix bundle. The PhoU protein contains two copies of this domain. The domain binds to an iron cluster via its conserved E/DXXXD motif.¡€0€ª€0€ €CDD¡€ €²x¢€0€0€ €‚…pfam01896, DNA_primase_S, DNA primase small subunit. DNA primase synthesizes the RNA primers for the Okazaki fragments in lagging strand DNA synthesis. DNA primase is a heterodimer of large and small subunits. This family also includes baculovirus late expression factor 1 or LEF-1 proteins. Baculovirus LEF-1 is a DNA primase enzyme. The family also contains many bacterial DNA primases.¡€0€ª€0€ €CDD¡€ €²y¢€0€0€ €dpfam01899, MNHE, Na+/H+ ion antiporter subunit. Subunit of a Na+/H+ Prokaryotic antiporter complex.¡€0€ª€0€ €CDD¡€ €²z¢€0€0€ €‚pfam01900, RNase_P_Rpp14, Rpp14/Pop5 family. tRNA processing enzyme ribonuclease P (RNase P) consists of an RNA molecule associated with at least eight protein subunits, hPop1, Rpp14, Rpp20, Rpp25, Rpp29, Rpp30, Rpp38, and Rpp40. This protein is known as Pop5 in eukaryotes.¡€0€ª€0€ €CDD¡€ €²{¢€0€0€ €öpfam01901, O_anti_polymase, Putative O-antigen polymerase. Archaebacterial proteins of unknown function. Members of this family may be transmembrane proteins. These are potentially O-antigen assembly enzymes, with up to 11 transmembrane regions.¡€0€ª€0€ €CDD¡€ €²|¢€0€0€ €‚ípfam01902, Diphthami_syn_2, Diphthamide synthase. Diphthamide_syn, diphthamide synthase, catalyzes the last amidation step of diphthamide biosynthesis using ammonium and ATP. Diphthamide synthase is evolutionarily conserved in eukaryotes. Diphthamide is a post-translationally modified histidine residue found on archaeal and eukaryotic translation elongation factor 2 (eEF-2). In some members of this family this domain is associated with pfam01042. The enzyme classification is EC:6.3.1.14.¡€0€ª€0€ €CDD¡€ €FK¢€0€0€ €‚"pfam01903, CbiX, CbiX. The function of CbiX is uncertain, however it is found in cobalamin biosynthesis operons and so may have a related function. Some CbiX proteins contain a striking histidine-rich region at their C-terminus, which suggests that it might be involved in metal chelation.¡€0€ª€0€ €CDD¡€ €²}¢€0€0€ €]pfam01904, DUF72, Protein of unknown function DUF72. The function of this family is unknown.¡€0€ª€0€ €CDD¡€ €²~¢€0€0€ €‚õpfam01905, DevR, CRISPR-associated negative auto-regulator DevR/Csa2. This group of families is one of several protein families that are always found associated with prokaryotic CRISPRs, themselves a family of clustered regularly interspaced short palindromic repeats, DNA repeats found in nearly half of all bacterial and archaeal genomes. These DNA repeat regions have a remarkably regular structure: unique sequences of constant size, called spacers, sit between each pair of repeats. It has been shown that the CRISPRs are virus-derived sequences acquired by the host to enable them to resist viral infection. The Cas proteins from the host use the CRISPRs to mediate an antiviral response. After transcription of the CRISPR, a complex of Cas proteins termed Cascade cleaves a CRISPR RNA precursor in each repeat and retains the cleavage products containing the virus-derived sequence. Assisted by the helicase Cas3, these mature CRISPR RNAs then serve as small guide RNAs that enable Cascade to interfere with virus proliferation. Cas5 contains an endonuclease motif, whose inactivation leads to loss of resistance, even in the presence of phage-derived spacers. This family used to be known as DUF73. DevR appears to be negative auto-regulator within the system.¡€0€ª€0€ €CDD¡€ €²¢€0€0€ €þpfam01906, YbjQ_1, Putative heavy-metal-binding. From comparative structural analysis, this family is likely to be a heavy-metal binding domain. The domain oligomerises as a pentamer. The domain is about 100 amino acids long and is found in prokaryotes.¡€0€ª€0€ €CDD¡€ €²€¢€0€0€ €ípfam01907, Ribosomal_L37e, Ribosomal protein L37e. This family includes ribosomal protein L37 from eukaryotes and archaebacteria. The family contains many conserved cysteines and histidines suggesting that this protein may bind to zinc.¡€0€ª€0€ €CDD¡€ €²¢€0€0€ €‚‹pfam01909, NTP_transf_2, Nucleotidyltransferase domain. Members of this family belong to a large family of nucleotidyltransferases. This family includes kanamycin nucleotidyltransferase (KNTase) which is a plasmid-coded enzyme responsible for some types of bacterial resistance to aminoglycosides. KNTase in-activates antibiotics by catalyzing the addition of a nucleotidyl group onto the drug.¡€0€ª€0€ €CDD¡€ €²‚¢€0€0€ €‚òpfam01910, Thiamine_BP, Thiamine-binding protein. The crystal structure of two of these members shows that this domain has a ferredoxin like fold and is likely to exists as at least homodimers. Sulphate ions are are located at the dimer interfaces, which are thought to confer additional stability. Although the function of this domain remains to be identified, its structure suggests a role in protein-protein interactions possibly regulated by the binding of small-molecule ligands. Solution of the structure of the hyperthermophilic anaerobic Thermotoga maritima sequence, UniProtKB:Q9WYV6, shows that this has a beta-alpha-beta-beta-alpha-beta ferredoxin-like fold and assembles as a homotetramer. It was possible to identify a pocket in each monmer that bound an unidentified ligand. It was also found that it bound charged thiamine though not hydroxymethyl pyrimidine. It is proposed that it is transporting charged thiamine around the cytoplasm. Under oxidative conditions this bacterium is under stress, and the transcriiptional unit within which this protein is expressed is up-regulated in these conditions, suggesting that the chelation of cytoplasmic thaimine is part of the response mechanism to such oxidatvie stress, which is mediated by this family.¡€0€ª€0€ €CDD¡€ €²ƒ¢€