0€0€ €xpfam09374, PG_binding_3, Predicted Peptidoglycan domain. This family contains a potential peptidoglycan binding domain.¡€0€ª€0€ €CDD¡€ €^þ¢€0€0€ €‚pfam09375, Peptidase_M75, Imelysin. The imelysin peptidase was first identified in Pseudomonas aeruginosa. The active site residues have not been identified. However, His201 and Glu204 are completely conserved in the family and occur in an HXXE motif that is also found in family M14.¡€0€ª€0€ €CDD¡€ €ÅÀ¢€0€0€ €‚pfam09376, NurA, NurA domain. This family includes NurA a nuclease exhibiting both single-stranded endonuclease activity and 5'-3' exonuclease activity on single-stranded and double-stranded DNA from the hyperthermophilic archaeon Sulfolobus acidocaldarius.¡€0€ª€0€ €CDD¡€ €ÅÁ¢€0€0€ €‚êpfam09377, SBDS_C, SBDS protein C-terminal domain. This family is highly conserved in species ranging from archaea to vertebrates and plants. The family contains several Shwachman-Bodian-Diamond syndrome (SBDS) proteins from both mouse and humans. Shwachman-Diamond syndrome is an autosomal recessive disorder with clinical features that include pancreatic exocrine insufficiency, haematological dysfunction and skeletal abnormalities. Members of this family play a role in RNA metabolism.¡€0€ª€0€ €CDD¡€ €Å¢€0€0€ €‚§pfam09378, HAS-barrel, HAS barrel domain. The HAS barrel is named after HerA-ATP Synthase. In ATP synthases, this domain is implicated in the assembly of the catalytic toroid and docking of accessory subunits, such as the subunit of the ATP synthase complex. Similar roles in docking of the functional partner, the NurA nuclease, and assembly of the HerA toroid complex appear likely for the HAS-barrel of the HerA family.¡€0€ª€0€ €CDD¡€ €Åâ€0€0€ €~pfam09379, FERM_N, FERM N-terminal domain. This domain is the N-terminal ubiquitin-like structural domain of the FERM domain.¡€0€ª€0€ €CDD¡€ €_¢€0€0€ €4pfam09380, FERM_C, FERM C-terminal PH-like domain. ¡€0€ª€0€ €CDD¡€ €ÅÄ¢€0€0€ €‚ëpfam09381, Porin_OmpG, Outer membrane protein G (OmpG). Porins are channel proteins in the outer membrane of gram negative bacteria which mediate the uptake of molecules required for growth and survival. Escherichia coli OmpG forms a 14 stranded beta-barrel and in contrast to most porins, appears to function as a monomer. The central pore of OmpG is wider than other E. coli porins and it is speculated that it may form a non-specific channel for the transport of larger oligosaccharides.¡€0€ª€0€ €CDD¡€ €_¢€0€0€ €âpfam09382, RQC, RQC domain. This DNA-binding domain is found in the RecQ helicase among others and has a helix-turn-helix structure. The RQC domain, found only in RecQ family enzymes, is a high affinity G4 DNA binding domain.¡€0€ª€0€ €CDD¡€ €ÅÅ¢€0€0€ €‚Cpfam09383, NIL, NIL domain. This domain is found at the C-terminus of ABC transporter proteins involved in D-methionine transport as well as a number of ferredoxin-like proteins. This domain is likely to act as a substrate binding domain. The domain has been named after a conserved sequence in some members of the family.¡€0€ª€0€ €CDD¡€ €ÅÆ¢€0€0€ €‚Ðpfam09384, UTP15_C, UTP15 C terminal. U3 snoRNA is ubiquitous in eukaryotes and is required for nucleolar processing of pre-18S ribosomal RNA. It is a component of the ribosomal small subunit (SSU) processome. UTP15 is needed for optimal pre-ribosomal RNA transcription by RNA polymerase I, together with a subset of U3 proteins required for transcription (t-UTPs). This entry represents the C terminal of UTP15, and is found adjacent to WD40 repeats (pfam00400).¡€0€ª€0€ €CDD¡€ €ÅÇ¢€0€0€ €|pfam09385, HisK_N, Histidine kinase N terminal. This domain is found at the N terminal of sensor histidine kinase proteins.¡€0€ª€0€ €CDD¡€ €_ ¢€0€0€ €‚ pfam09386, ParD, Antitoxin ParD. ParD is a plasmid anti-toxin than forms a ribbon-helix-helix DNA binding structure. It stabilizes plasmids by inhibiting ParE toxicity in cells that express ParD and ParE. ParD forms a dimer and also regulates its own promoter (parDE).¡€0€ª€0€ €CDD¡€ €_ ¢€0€0€ €‚Zpfam09387, MRP, Mitochondrial RNA binding protein MRP. MRP1 and MRP2 are mitochondrial RNA binding proteins that form a heteromeric complex. The MRP1/MRP2 heterotetrameric complex binds to guide RNAs and stabilizes them in an unfolded conformation suitable for RNA-RNA hybridisation. Each MRP subunit adopts a 'whirly' transcription factor fold.¡€0€ª€0€ €CDD¡€ €ÅÈ¢€0€0€ €‚_pfam09388, SpoOE-like, Spo0E like sporulation regulatory protein. Spore formation is an extreme response to starvation and can also be a component of disease transmission. Sporulation is controlled by an expanded two-component system where starvation signals result in sensor kinase activation and phosphorylation of the master sporulation response regulator Spo0A. Phosphatases such as Spo0E dephosphorylate Spo0A thereby inhibiting sporulation. This is a family of Spo0E-like phosphatases. The structure of a Bacillus anthracis member of this family has revealed an anti-parallel alpha-helical structure.¡€0€ª€0€ €CDD¡€ €ÅÉ¢€0€0€ €¡pfam09390, DUF1999, Protein of unknown function (DUF1999). This family contains a putative Fe-S binding reductase whose structure adopts an alpha and beta fold.¡€0€ª€0€ €CDD¡€ €ÅÊ¢€0€0€ €Ípfam09391, DUF2000, Protein of unknown function (DUF2000). This is a family of proteins of unknown function. The structure of one of the proteins in this family has been shown to adopt an alpha beta fold.¡€0€ª€0€ €CDD¡€ €ÅË¢€0€0€ €‚Àpfam09392, T3SS_needle_F, Type III secretion needle MxiH, YscF, SsaG, EprI, PscF, EscF. Type III secretion systems are essential virulence determinants for many gram-negative bacterial pathogens. MxiH is an extracellular alpha helical needle that is required for translocation of effector proteins into host cells. Once inside, the effector proteins subvert normal cell function to aid infection. The needle protein F, polymerizes to form a shaft.¡€0€ª€0€ €CDD¡€ €ÅÌ¢€0€0€ €ßpfam09393, DUF2001, Phage tail tube protein. This is a family of phage tail tube proteins including protein XkdM from phage-like element PBSX protein whose structure adopts a beta barrel flanked with alpha helical regions.¡€0€ª€0€ €CDD¡€ €ÅÍ¢€0€0€ €“pfam09394, Inhibitor_I42, Chagasin family peptidase inhibitor I42. Chagasin is a cysteine peptidase inhibitor which forms a beta barrel structure.¡€0€ª€0€ €CDD¡€ €Å΢€0€0€ €‚•pfam09396, Thrombin_light, Thrombin light chain. Thrombin is an enzyme that cleaves bonds after Arg and Lys, converts fibrinogen to fibrin and activates factors V, VII, VIII. Prothrombin is activated on the surface of a phospholipid membrane where factor Xa removes the activation peptide and cleaves the remaining part into light and heavy chains. This domain corresponds to the light chain of thrombin.¡€0€ª€0€ €CDD¡€ €ÅÏ¢€0€0€ €æpfam09397, Ftsk_gamma, Ftsk gamma domain. This domain directs oriented DNA translocation and forms a winged helix structure. Mutated proteins with substitutions in the FtsK gamma DNA-recognition helix are impaired in DNA binding.¡€0€ª€0€ €CDD¡€ €ÅТ€0€0€ €‚Ipfam09398, FOP_dimer, FOP N terminal dimerisation domain. Fibroblast growth factor receptor 1 (FGFR1) oncogene partner (FOP) is a centrosomal protein that is involved in anchoring microtubules to subcellular structures. This domain includes a Lis-homology motif. It forms an alpha helical bundle and is involved in dimerisation.¡€0€ª€0€ €CDD¡€ €ÅÑ¢€0€0€ €‚Spfam09399, SARS_lipid_bind, SARS lipid binding protein. This is a family of proteins found in SARS coronavirus. The protein has a novel fold which forms a dimeric tent-like beta structure with an amphipathic surface, and a central hydrophobic cavity that binds lipid molecules. This cavity is likely to be involved in membrane attachment.¡€0€ª€0€ €CDD¡€ €_¢€0€0€ €Úpfam09400, DUF2002, Protein of unknown function (DUF2002). This is a family of putative cytoplasmic proteins. The structure of these proteins form an antiparallel beta and sheet and contain some alpha helical regions.¡€0€ª€0€ €CDD¡€ €_¢€0€0€ €‚¤pfam09401, NSP10, RNA synthesis protein NSP10. Non-structural protein 10 (NSP10) is involved in RNA synthesis. it is synthesized as a polyprotein whose cleavage generates many non-structural proteins. NSP10 contains two zinc binding motifs and forms two anti-parallel helices which are stacked against an irregular beta sheet. A cluster of basic residues on the protein surface suggests a nucleic acid-binding function.¡€0€ª€0€ €CDD¡€ €_¢€0€0€ €‚¤pfam09402, MSC, Man1-Src1p-C-terminal domain. MAN1 is an integral protein of the inner nuclear membrane which binds to chromatin associated proteins and plays a role in nuclear organisation. The C terminal nucleoplasmic region forms a DNA binding winged helix and binds to Smad. This C-terminal tail is also found in S. cerevisiae and is thought to consist of three conserved helices followed by two downstream strands.¡€0€ª€0€ €CDD¡€ €ÅÒ¢€0€0€ €tpfam09403, FadA, Adhesion protein FadA. FadA (Fusobacterium adhesin A) is an adhesin which forms two alpha helices.¡€0€ª€0€ €CDD¡€ €_¢€0€0€ €®pfam09404, DUF2003, Eukaryotic protein of unknown function (DUF2003). This is a family of proteins of unknown function which adopt an alpha helical and beta sheet structure.¡€0€ª€0€ €CDD¡€ €_¢€0€0€ €‚ðpfam09405, Btz, CASC3/Barentsz eIF4AIII binding. This domain is found on CASC3 (cancer susceptibility candidate gene 3 protein) which is also known as Barentsz (Btz). CASC3 is a component of the EJC (exon junction complex) which is a complex that is involved in post-transcriptional regulation of mRNA in metazoa. The complex is formed by the association of four proteins (eIF4AIII, Barentsz, Mago, and Y14), mRNA, and ATP. This domain wraps around eIF4AIII and stacks against the 5' nucleotide.¡€0€ª€0€ €CDD¡€ €ÅÓ¢€0€0€ €Épfam09406, DUF2004, Protein of unknown function (DUF2004). This is a family of proteins with unknown function. The structure of one of the proteins in this family has revealed a novel alpha-beta fold.¡€0€ª€0€ €CDD¡€ €ÅÔ¢€0€0€ €‚Ëpfam09407, AbiEi_1, AbiEi antitoxin C-terminal domain. AbiEi_1 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338.¡€0€ª€0€ €CDD¡€ €ÅÕ¢€0€0€ €‚pfam09408, Spike_rec_bind, Spike receptor binding domain. Spike is an envelope glycoprotein which aids viral entry into the host cell. This domain corresponds is the immunogenic receptor binding domain of the protein which binds to angiotensin-converting enzyme 2 (ACE2).¡€0€ª€0€ €CDD¡€ €ÅÖ¢€0€0€ €épfam09409, PUB, PUB domain. The PUB (also known as PUG) domain is found in peptide N-glycanase where it functions as a AAA ATPase binding domain. This domain is also found on other proteins linked to the ubiquitin-proteasome system.¡€0€ª€0€ €CDD¡€ €Å×¢€0€0€ €¤pfam09411, PagL, Lipid A 3-O-deacylase (PagL). PagL is an outer membrane protein with lipid A 3-O-deacylase activity. It forms an 8 stranded beta barrel structure.¡€0€ª€0€ €CDD¡€ €ÅØ¢€0€0€ €‚ pfam09412, XendoU, Endoribonuclease XendoU. This is a family of endoribonucleases involved in RNA biosynthesis which has been named XendoU in Xenopus laevis. XendoU is a U-specific metal dependent enzyme that produces products with a 2'-3' cyclic phosphate termini.¡€0€ª€0€ €CDD¡€ €ÅÙ¢€0€0€ €—pfam09413, DUF2007, Putative prokaryotic signal transducing protein. This is a family of putative prokaryotic signal transducing proteins of Pii-type.¡€0€ª€0€ €CDD¡€ €ÅÚ¢€0€0€ €¤pfam09414, RNA_ligase, RNA ligase. This is a family of RNA ligases. The enzyme repairs RNA strand breaks in nicked DNA:RNA and RNA:RNA but not in DNA:DNA duplexes.¡€0€ª€0€ €CDD¡€ €ÅÛ¢€0€0€ €‚8pfam09415, CENP-X, CENP-S associating Centromere protein X. The centromere, essential for faithful chromosome segregation during mitosis, has a network of constitutive centromere-associated (CCAN) proteins associating with it during mitosis. So far in vertebrates at least 15 centromere proteins have been identified, which are divided into several subclasses based on functional and biochemical analyses. These provide a platform for the formation of a functional kinetochore during mitosis. CENP-S is one that does not associate with the CENP-H-containing complex but rather interacts with CENP-X to form a stable assembly of outer kinetochore proteins that functions downstream of other components of the CCAN. This complex may directly allow efficient and stable formation of the outer kinetochore on the CCAN platform.¡€0€ª€0€ €CDD¡€ €ÅÜ¢€0€0€ €‚Opfam09416, UPF1_Zn_bind, RNA helicase (UPF2 interacting domain). UPF1 is an essential RNA helicase that detects mRNAs containing premature stop codons and triggers their degradation. This domain contains 3 zinc binding motifs and forms interactions with another protein (UPF2) that is also involved nonsense-mediated mRNA decay (NMD).¡€0€ª€0€ €CDD¡€ €ÅÝ¢€0€0€ €zpfam09418, DUF2009, Protein of unknown function (DUF2009). This is a eukaryotic family of proteins with unknown function.¡€0€ª€0€ €CDD¡€ €ÅÞ¢€0€0€ €‚Lpfam09419, PGP_phosphatase, Mitochondrial PGP phosphatase. This is a family of proteins that acts as a mitochondrial phosphatase in cardiolipin biosynthesis. Cardiolipin is a unique dimeric phosphoglycerolipid predominantly present in mitochondrial membranes. The inverted phosphatase motif includes the highly conserved DKD triad.¡€0€ª€0€ €CDD¡€ €_&¢€0€0€ €ipfam09420, Nop16, Ribosome biogenesis protein Nop16. Nop16 is a protein involved in ribosome biogenesis.¡€0€ª€0€ €CDD¡€ €Åߢ€0€0€ €‚ pfam09421, FRQ, Frequency clock protein. The frequency clock protein, is the central component of the frq-based circadian negative feedback loop, regulates various aspects of the circadian clock in Neurospora crassa. This protein has been shown to interact with itself via a coiled-coil.¡€0€ª€0€ €CDD¡€ €Åࢀ0€0€ €—pfam09422, WTX, WTX protein. The WTX protein is found to be inactivated in one third of Wilms tumors. The WTX protein is functionally uncharacterized.¡€0€ª€0€ €CDD¡€ €Åᢀ0€0€ €)pfam09423, PhoD, PhoD-like phosphatase. ¡€0€ª€0€ €CDD¡€ €Å⢀0€0€ €hpfam09424, YqeY, Yqey-like protein. The function of this domain found in the YqeY protein is uncertain.¡€0€ª€0€ €CDD¡€ €_+¢€0€0€ €‚*pfam09425, CCT_2, Divergent CCT motif. This short motif is found in a number of plant proteins. It appears to be related to the N-terminal half of the CCT motif. The CCT motif is about 45 amino acids long and contains a putative nuclear localization signal within the second half of the CCT motif.¡€0€ª€0€ €CDD¡€ €Å㢀0€0€ €‚pfam09426, Nyv1_N, Vacuolar R-SNARE Nyv1 N terminal. This domain corresponds to the N terminal domain of vacuolar R-SNARE Nyv1 which adopts a longin fold. In yeast it has been shown that this domain is sufficient to direct the transport of Nyv1 to limiting membrane of the vacuole.¡€0€ª€0€ €CDD¡€ €Å䢀0€0€ €Ópfam09427, DUF2014, Domain of unknown function (DUF2014). This domain is found at the C terminal of a family of ER membrane bound transcription factors called sterol regulatory element binding proteins (SREBP).¡€0€ª€0€ €CDD¡€ €Å墀0€0€ €pfam09428, DUF2011, Fungal protein of unknown function (DUF2011). This is a family of fungal proteins whose function is unknown.¡€0€ª€0€ €CDD¡€ €Å梀0€0€ €°pfam09429, Wbp11, WW domain binding protein 11. The WW domain is a small protein module with a triple-stranded beta-sheet fold. This is a family of WW domain binding proteins.¡€0€ª€0€ €CDD¡€ €Å碀0€0€ €tpfam09430, DUF2012, Protein of unknown function (DUF2012). This is a eukaryotic family of uncharacterized proteins.¡€0€ª€0€ €CDD¡€ €Å袀0€0€ €‡pfam09431, DUF2013, Protein of unknown function (DUF2013). This region is found at the C terminal of a group of cytoskeletal proteins.¡€0€ª€0€ €CDD¡€ €Å颀0€0€ €³pfam09432, THP2, Tho complex subunit THP2. The THO complex plays a role in coupling transcription elongation to mRNA export. It is composed of subunits THP2, HPR1, THO2 and MFT1.¡€0€ª€0€ €CDD¡€ €_3¢€0€0€ €wpfam09435, DUF2015, Fungal protein of unknown function (DUF2015). This is a fungal family of uncharacterized proteins.¡€0€ª€0€ €CDD¡€ €Åꢀ0€0€ €‚~pfam09436, DUF2016, Domain of unknown function (DUF2016). A predicted alpha+beta domain that is usually fused N-terminal to the JAB metallopeptidase. This protein in turn is found in conserved gene neighborhoods that include genes encoding the bacterial homologs of the ubiquitin modification system such as the E1, E2 and Ub proteins. The domain is also known as the JAB-N domain.¡€0€ª€0€ €CDD¡€ €Å뢀0€0€ €3pfam09437, Pombe_5TM, Pombe specific 5TM protein. ¡€0€ª€0€ €CDD¡€ €ÌØ¢€0€0€ €Äpfam09438, DUF2017, Domain of unknown function (DUF2017). This is an alpha-helical domain found in gene neighborhoods that contain genes encoding ubiquitin, cysteine synthases and JAB peptidases.¡€0€ª€0€ €CDD¡€ €Å좀0€0€ €épfam09439, SRPRB, Signal recognition particle receptor beta subunit. The beta subunit of the signal recognition particle receptor (SRP) is a transmembrane GTPase which anchors the alpha subunit to the endoplasmic reticulum membrane.¡€0€ª€0€ €CDD¡€ €Åí¢€0€0€ €„pfam09440, eIF3_N, eIF3 subunit 6 N terminal domain. This is the N terminal domain of subunit 6 translation initiation factor eIF3.¡€0€ª€0€ €CDD¡€ €Å0€0€ €Ópfam09441, Abp2, ARS binding protein 2. This DNA-binding protein binds to the autonomously replicating sequence (ARS) binding element. It may play a role in regulating the cell cycle response to stress signals.¡€0€ª€0€ €CDD¡€ €Å0€0€ €‚˜pfam09442, DUF2018, Domain of unknown function (DUF2018). Acid-adaptive protein possibly of physiological significance when H.pylori colonises the human stomach, which adopts a unique four alpha-helical triangular conformations. The biologically active form is thought to be a tetramer. The protein is expressed along with six other proteins, some of which are related to iron storage and haem biosynthesis.¡€0€ª€0€ €CDD¡€ €Åð¢€0€0€ €‚–pfam09443, CFC, Cripto_Frl-1_Cryptic (CFC). CFC domain is one half of the membrane protein Cripto, a protein overexpressed in many tumors and structurally similar to the C-terminal extracellular portions of Jagged 1 and Jagged 2. CFC is approx 40-residues long, compacted by three internal disulphide bridges, and binds Alk4 via a hydrophobic patch. CFC is structurally homologous to the VWFC-like domain.¡€0€ª€0€ €CDD¡€ €Åñ¢€0€0€ €’pfam09444, MRC1, MRC1-like domain. This putative domain is found to be the most conserved region in mediator of replication checkpoint protein 1.¡€0€ª€0€ €CDD¡€ €Åò¢€0€0€ €ípfam09445, Methyltransf_15, RNA cap guanine-N2 methyltransferase. RNA cap guanine-N2 methyltransferases such as Schizosaccharomyces pombe Tgs1 and Giardia lamblia Tgs2 catalyze methylation of the exocyclic N2 amine of 7-methylguanosine.¡€0€ª€0€ €CDD¡€ €Åó¢€0€0€ €øpfam09446, VMA21, VMA21-like domain. This presumed short domain appears to contain two potential transmembrane helices. VMA21 is localized in the ER where it is needed as an accessory factor for assembly of the V0 component of the vacuolar ATPase.¡€0€ª€0€ €CDD¡€ €Åô¢€0€0€ €cpfam09447, Cnl2_NKP2, Cnl2/NKP2 family protein. This family includes the Cnl2 kinetochore protein.¡€0€ª€0€ €CDD¡€ €Åõ¢€0€0€ €‚¤pfam09448, MmlI, Methylmuconolactone methyl-isomerase. MmlI is a short, approx 115 residue, protein of two alpha helices and four beta strands. It is involved in the catabolism of methyl-substituted aromatics via a modified oxo-adipate pathway in bacteria. The enzyme appears to be monomeric in some species and tetrameric in others. The known structure shows two copies of the protein form a dimeric alpha beta barrel.¡€0€ª€0€ €CDD¡€ €Åö¢€0€0€ €ipfam09449, DUF2020, Domain of unknown function (DUF2020). Protein of unknown function found in bacteria.¡€0€ª€0€ €CDD¡€ €Å÷¢€0€0€ €ipfam09450, DUF2019, Domain of unknown function (DUF2019). Protein of unknown function found in bacteria.¡€0€ª€0€ €CDD¡€ €Åø¢€0€0€ €1pfam09451, ATG27, Autophagy-related protein 27. ¡€0€ª€0€ €CDD¡€ €Åù¢€0€0€ €‚‡pfam09452, Mvb12, ESCRT-I subunit Mvb12. The endosomal sorting complex required for transport (ESCRT) complexes play a critical role in receptor down-regulation and retroviral budding. A new component of the ESCRT-I complex was identified, multivesicular body sorting factor of 12 kD (Mvb12), which binds to the coiled-coil domain of the ESCRT-I subunit vacuolar protein sorting 23 (Vps23).¡€0€ª€0€ €CDD¡€ €_B¢€0€0€ €‚Ûpfam09453, HIRA_B, HIRA B motif. The HirA B (Histone regulatory homolog A binding) motif is the essential binding interface between HIRA pfam07569 and ASF1a, of approx. 40 residues. It forms an antiparallel beta-hairpin that binds perpendicular to the strands of the beta-sandwich of ASF1a N-terminal core domain, via beta-sheet, salt bridge and van der Waals interactions. The two histone chaperone proteins, HIRA and ASF1a, form a heterodimer with histones H3 and H4. HIRA is the human orthologue of Hir proteins known to silence histone gene expression and create transcriptionally silent heterochromatin in yeast, flies, plants and humans. The yeast CAF1B proteins which bind H3 also carry this motif at their very C-terminus.¡€0€ª€0€ €CDD¡€ €Åú¢€0€0€ €‚zpfam09454, Vps23_core, Vps23 core domain. ESCRT complexes form the main machinery driving protein sorting from endosomes to lysosomes. The core domain of the Vps23 subunit of the heterotrimeric ESCRT-I complex is a helical hairpin sandwiched in a fan-like formation between two other helical hairpins from Vps28 (pfam03997) and Vps37. Vps23 gives ESCRT-I complex its stability.¡€0€ª€0€ €CDD¡€ €Åû¢€0€0€ €‚pfam09455, Cas_DxTHG, CRISPR-associated (Cas) DxTHG family. CRISPR is a term for Clustered Regularly Interspaced Short Palidromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. The family describes Cas proteins of about 400 residues that include the motif [VIL]-D-x-[ST]-H-[GS]. The CRISPR and associated proteins are thought to be involved in the evolution of host resistance. The exact molecular function of this family is currently unknown.¡€0€ª€0€ €CDD¡€ €Åü¢€0€0€ €Æpfam09456, RcsC, RcsC Alpha-Beta-Loop (ABL). This domain is found in the C-terminus of the phospho-relay kinase RcsC between pfam00512 and pfam00072, and forms a discrete alpha/beta/loop structure.¡€0€ª€0€ €CDD¡€ €Åý¢€0€0€ €‚¤pfam09457, RBD-FIP, FIP domain. The FIP domain is the Rab11-binding domain (RBD) at the C-terminus of a family of Rab11-interacting proteins (FIPs). The Rab proteins constitute the largest family of small GTPases (>60 members in mammals). Among them Rab11 is a well characterized regulator of endocytic and recycling pathways. Rab11 associates with a broad range of post-Golgi organelles, including recycling endosomes.¡€0€ª€0€ €CDD¡€ €Åþ¢€0€0€ €‚'pfam09458, H_lectin, H-type lectin domain. The H-type lectin domain is a unit of six beta chains, combined into a homo-hexamer. It is involved in self/non-self recognition of cells, through binding with carbohydrates. It is sometimes found in association with the F5_F8_type_C domain pfam00754.¡€0€ª€0€ €CDD¡€ €Åÿ¢€0€0€ €‚Spfam09459, EB_dh, Ethylbenzene dehydrogenase. Eythylbenzene dehydrogenase is a heterotrimer of three subunits that catalyzes the anaerobic degradation of hydrocarbons. The alpha subunit contains the catalytic centre as a Molybdenum cofactor-complex. This removes an electron-pair from the hydrocarbon and passes it along an electron transport system involving iron-sulphur complexes held in the beta subunit and a Haem b molecule contained in the gamma subunit. The electron-pair is then subsequently passed to an as yet unknown receiver. The enzyme is found in a variety of different bacteria.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚zpfam09460, Saf-Nte_pilin, Saf-pilin pilus formation protein. This domain consists of the adjacent Saf-Nte and Saf-pilin chains of the pilus-forming complex. Pilus assembly in Gram-negative bacteria involves a Donor-strand exchange mechanism between the C- and the N-termini of this domain. The C-terminal subunit forms an incomplete Ig-fold which is then complemented by the 10-18 residue N-terminus of another, incoming, pilus subunit which is not involved in the Ig-fold. The N-terminus sequences contain a motif of alternating hydrophobic residues that occupy the P2 to P5 binding pockets in the groove of the first pilus subunit.¡€0€ª€0€ €CDD¡€ €_I¢€0€0€ €‚Êpfam09461, PcF, Phytotoxin PcF protein. PcF is a 52 residue protein factor of two alpha helices, containing a 4-hydroxyproline and three cysteine bridges. The presence of the hydroxyproline is unique in relation to other fungal phytotoxic proteins. The protein has a high content of acidic side-chains implying a lack of binding with lipid-rich components of membranes and appears to be an extracellular phytotoxin that causes leaf necrosis in strawberries.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚pfam09462, Mus7, Mus7/MMS22 family. This family includes a conserved region from the Mus7 protein. Mus7 is involved in the repair of replication-associated DNA damage in the fission yeast Schizosaccharomyces pombe. Mus7 functions in the same pathway as Mus81, a subunit of the Mus81-Eme1 structure-specific endonuclease, which has been implicated in the repair of the replication-associated DNA damage. The MMS22 proteins are involved in repairing double-stranded DNA breaks created by the cleavage reaction of topoisomerase II.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €^pfam09463, Opy2, Opy2 protein. Opy2p acts as a membrane anchor in the HOG signalling pathway.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚=pfam09465, LBR_tudor, Lamin-B receptor of TUDOR domain. The Lamin-B receptor, found on the TUDOR domain pfam00567, is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner Nuclear Envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with Importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the NE.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €Úpfam09466, Yqai, Hypothetical protein Yqai. This hypothetical protein is expressed in bacteria, particularly Bacillus subtilis. It forms a homo-dimer, with each monomer containing an alpha helix and four beta strands.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €Þpfam09467, Yopt, Hypothetical protein Yopt. This hypothetical protein is expressed in bacteria, particularly Bacillus subtilis. It forms homo-dimers, with each monomer consisting of one alpha helix and three beta strands.¡€0€ª€0€ €CDD¡€ €_O¢€0€0€ €‚´pfam09468, RNase_H2-Ydr279, Ydr279p protein family (RNase H2 complex component). RNases H are enzymes that specifically hydrolyse RNA when annealed to a complementary DNA and are present in all living organisms. In yeast RNase H2 is composed of a complex of three proteins (Rnh2Ap, Ydr279p and Ylr154p), this family represents the homologs of Ydr279p. It is not known whether non yeast proteins in this family fulfil the same function.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚pfam09469, Cobl, Cordon-bleu ubiquitin-like domain. The Cordon-bleu protein domain is highly conserved among vertebrates. The sequence contains three repeated lysine, arginine, and proline-rich regions, the KKRAP motif. The exact function of the protein is unknown but it is thought to be involved in mid-brain neural tube closure. It is expressed specifically in the node. This domain has a ubiquitin-like fold.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚Upfam09470, Telethonin, Telethonin protein. Telethonin is a 167-residue protein which complexes with the large muscle protein, titin. The very N-terminus of titin, composed of two immunoglobulin-like (Ig) domains, referred to as Z1 and Z2, interacts with the N-terminal region (residues 1-53) of telethonin, mediating the antiparallel assembly of two Z1Z2 domains. The C-terminus of the telethonin appears to induce dimerisation of this 2:1 titin/telethonin structure which thus forms a complex necessary for myofibril assembly and maintenance of the intact Z-disk of skeletal and cardiac muscles.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €üpfam09471, Peptidase_M64, IgA Peptidase M64. This is a family of highly selective metallo-endopeptidases. The primary structure of the Clostridium ramosum IgA proteinase shows no significant overall similarity to any other known metallo-endopeptidase.¡€0€ª€0€ €CDD¡€ €Æ ¢€0€0€ €‚pfam09472, MtrF, Tetrahydromethanopterin S-methyltransferase, F subunit (MtrF). Many archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This domain is mostly found in MtrF, where it covers the entire length of the protein. This polypeptide is one of eight subunits of the N5-methyltetrahydromethanopterin: coenzyme M methyltransferase complex found in methanogenic archaea. This is a membrane-associated enzyme complex that uses methyl-transfer reactions to drive a sodium-ion pump. MtrF itself is involved in the transfer of the methyl group from N5-methyltetrahydromethanopterin to coenzyme M. Subsequently, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. In some organisms this domain is found at the C terminal region of what appears to be a fusion of the MtrA and MtrF proteins. The function of these proteins is unknown, though it is likely that they are involved in C1 metabolism.¡€0€ª€0€ €CDD¡€ €Æ ¢€0€0€ €‚âpfam09474, Type_III_YscX, Type III secretion system YscX (type_III_YscX). Members of this family are encoded within bacterial type III secretion gene clusters. Among all species with type III secretion, those with this protein are found among those that target animal rather than plant cells. The member of this family in Yersinia was shown by mutation to be required for type III secretion of Yops effector proteins and therefore is believed to be part of the secretion machinery.¡€0€ª€0€ €CDD¡€ €_U¢€0€0€ €‚Ðpfam09475, Dot_icm_IcmQ, Dot/Icm secretion system protein (dot_icm_IcmQ). Proteins in this entry are the IcmQ component of Dot/Icm secretion systems, as found in the obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii. While this system resembles type IV secretion systems and has been called a form of type IV, the literature now seems to favour calling this the Dot/Icm system. This protein was shown to be essential for translocation.¡€0€ª€0€ €CDD¡€ €_V¢€0€0€ €‚5pfam09476, Pilus_CpaD, Pilus biogenesis CpaD protein (pilus_cpaD). Proteins in this entry consist of a pilus biogenesis protein, CpaD, from Caulobacter, and homologs in other bacteria, including three in the root nodule bacterium Bradyrhizobium japonicum. The molecular function of the homologs is not known.¡€0€ª€0€ €CDD¡€ €Æ ¢€0€0€ €‚pfam09477, Type_III_YscG, Bacterial type II secretion system chaperone protein (type_III_yscG). YscG is a molecular chaperone for YscE, where both are part of the type III secretion system that in Yersinia is designated Ysc (Yersinia secretion). The secretion system delivers effector proteins, designated Yops (Yersinia outer proteins), in Yersinia. This entry consists of YscG from Yersinia and functionally equivalent type III secretion proteins in other species: e.g. AscG in Aeromonas and LscG in Photorhabdus luminescens.¡€0€ª€0€ €CDD¡€ €_X¢€0€0€ €¸pfam09478, CBM49, Carbohydrate binding domain CBM49. This domain is found at the C terminal of cellulases and in vitro binding studies have shown it to binds to crystalline cellulose.¡€0€ª€0€ €CDD¡€ €_Y¢€0€0€ €‚üpfam09479, Flg_new, Listeria-Bacteroides repeat domain (List_Bact_rpt). This model describes a conserved core region of about 43 residues, which occurs in at least two families of tandem repeats. These include 78-residue repeats which occur from 2 to 15 times in some proteins of Bacteroides forsythus ATCC 43037, and 70-residue repeats found in families of internalins of Listeria species. Single copies are found in proteins of Fibrobacter succinogenes, Geobacter sulfurreducens, and a few other bacteria.¡€0€ª€0€ €CDD¡€ €Æ ¢€0€0€ €‚ˆpfam09480, PrgH, Type III secretion system protein PrgH-EprH (PrgH). In Salmonella, the gene encoding this protein is part of a four-gene operon PrgHIJK, while in other organisms it is found in type III secretion operons. PrgH has been shown to be required for type III secretion and is a structural component of the needle complex, which is the core component of type III secretion systems.¡€0€ª€0€ €CDD¡€ €Æ ¢€0€0€ €‚Çpfam09481, CRISPR_Cse1, CRISPR-associated protein Cse1 (CRISPR_cse1). Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry, represented by CT1972 from Chlorobaculum tepidum, is found in the CRISPR/Cas subtype Ecoli regions of many bacteria (most of which are mesophiles), and not in Archaea. It is designated Cse1.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚Vpfam09482, OrgA_MxiK, Bacterial type III secretion apparatus protein (OrgA_MxiK). This protein is encoded by genes which are found in type III secretion operons, and has been shown to be essential for the invasion phenotype in Salmonella and a component of the secretion apparatus. The protein is known as OrgA in Salmonella due to its oxygen-dependent expression pattern in which low-oxygen levels up-regulate the gene. In Shigella the gene is called MxiK and has been shown to be essential for the proper assembly of the needle complex, which is the core component of type III secretion systems.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚dpfam09483, HpaP, Type III secretion protein (HpaP). This entry represents proteins encoded by genes which are always found in type III secretion operons, although their function in the processes of secretion and virulence is unclear. Hpa stands for Hrp-associated gene, where Hrp stands for hypersensitivity response and virulence. see also PMID:18584024.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚ pfam09484, Cas_TM1802, CRISPR-associated protein TM1802 (cas_TM1802). Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This minor cas protein is found in at least five prokaryotic genomes: Methanosarcina mazei, Sulfurihydrogenibium azorense, Thermotoga maritima, Carboxydothermus hydrogenoformans, and Dictyoglomus thermophilum, the first of which is archaeal while the rest are bacterial.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚åpfam09485, CRISPR_Cse2, CRISPR-associated protein Cse2 (CRISPR_cse2). Clusters of short DNA repeats with non-homologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family of proteins, represented by CT1973 from Chlorobaculum tepidum, is encoded by genes found in the CRISPR/Cas subtype Ecoli regions of many bacteria (most of which are mesophiles), and not in Archaea. It is designated Cse2.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €îpfam09486, HrpB7, Bacterial type III secretion protein (HrpB7). This entry represents proteins encoded by genes which are found in type III secretion operons in a narrow range of species including Xanthomonas, Burkholderia and Ralstonia.¡€0€ª€0€ €CDD¡€ €_a¢€0€0€ €îpfam09487, HrpB2, Bacterial type III secretion protein (HrpB2). This entry represents proteins encoded by genes which are found in type III secretion operons in a narrow group of species including Xanthomonas, Burkholderia and Ralstonia.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚Apfam09488, Osmo_MPGsynth, Mannosyl-3-phosphoglycerate synthase (osmo_MPGsynth). This family consists of examples of mannosyl-3-phosphoglycerate synthase (MPGS), which together with mannosyl-3-phosphoglycerate phosphatase (MPGP) EC:2.4.1.217, comprises a two-step pathway for mannosylglycerate biosynthesis. Mannosylglycerate is a compatible solute that tends to be restricted to extreme thermophiles of archaea and bacteria. Note that in Rhodothermus marinus, this pathway is one of two; the other is condensation of GDP-mannose with D-glycerate by mannosylglycerate synthase.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚œpfam09489, CbtB, Probable cobalt transporter subunit (CbtB). This entry represents a family of proteins which have been proposed to act as cobalt transporters acting in concert with vitamin B12 biosynthesis systems. Evidence for this assignment includes 1) prediction of a single transmembrane segment and a C-terminal histidine-rich motif likely to be a metal-binding site, 2) positional gene linkage with known B12 biosynthesis genes, 3) upstream proximity of B12 transcriptional regulatory sites, 4) the absence of other known cobalt import systems and 5) the obligate co-localization with a protein (CbtA) predicted to have five additional transmembrane segments.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚—pfam09490, CbtA, Probable cobalt transporter subunit (CbtA). This entry represents a family of proteins which have been proposed to act as cobalt transporters acting in concert with vitamin B12 biosynthesis systems. Evidence for this assignment includes 1) prediction of five transmembrane segments, 2) positional gene linkage with known B12 biosynthesis genes, 3) upstream proximity of B12 transcriptional regulatory sites, 4) the absence of other known cobalt import systems and 5) the obligate co-localization with a small protein (CbtB) having a single additional transmembrane segment and a C-terminal histidine-rich motif likely to be a metal-binding site.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €épfam09491, RE_AlwI, AlwI restriction endonuclease. This family includes the AlwI (recognizes GGATC), Bsp6I (recognizes GC^NGC), BstNBI (recognizes GASTC), PleI(recognizes GAGTC) and MlyI (recognizes GAGTC) restriction endonucleases.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €Épfam09492, Pec_lyase, Pectic acid lyase. Members of this family are isozymes of pectate lyase (EC:4.2.2.2), also called polygalacturonic transeliminase and alpha-1,4-D-endopolygalacturonic acid lyase.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚¹pfam09493, DUF2389, Tryptophan-rich protein (DUF2389). Members of this family are small hypothetical proteins of 60 to 100 residues from Cyanobacteria and some Proteobacteria. Prochlorococcus marinus strains have two members, other species one only. Interestingly, of the eight most conserved residues, four are aromatic and three are invariant tryptophans. It appears all species that encode this protein can synthesize tryptophan de novo.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚«pfam09494, Slx4, Slx4 endonuclease. The Slx4 protein is a heteromeric structure-specific endonuclease found from fungi to mammals. Slx4 with Slx1 acts as a nuclease on branched DNA substrates, particularly simple-Y, 5'-flap, or replication fork structures by cleaving the strand bearing the 5' non-homologous arm at the branch junction and thus generating ligatable nicked products from 5'-flap or replication fork substrates.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €ýpfam09495, DUF2462, Protein of unknown function (DUF2462). This protein is highly conserved, but its function is unknown. It can be isolated from HeLa cell nucleoli and is found to be homologous with Leydig cell tumor protein whose function is unknown.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚ pfam09496, CENP-O, Cenp-O kinetochore centromere component. This eukaryotic protein is a component of the inner kinetochore subcomplex of the centromere. It has been shown to be involved in chromosome segregation via regulation of the spindle in both yeast and human.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚•pfam09497, Med12, Transcription mediator complex subunit Med12. Med12 is a negative regulator of the Gli3-dependent sonic hedgehog signalling pathway via its interaction with Gli3 within the RNA polymerase II transcriptional Mediator. A complex is formed between Med12, Med13, CDK8 and CycC which is responsible for suppression of transcription. This subunit forms part of the Kinase section of Mediator.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚€pfam09498, DUF2388, Protein of unknown function (DUF2388). This family consists of small hypothetical proteins, about 100 amino acids in length. The family includes five members (three in tandem) in Pseudomonas aeruginosa PAO1 and in Pseudomonas putida (strain KT2440), four in Pseudomonas syringae DC3000, and single members in several other Proteobacteria. The function is unknown.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €¯pfam09499, RE_ApaLI, ApaLI-like restriction endonuclease. This family includes R.ApaLI and R.XbaI restriction endonucleases. ApaLI recognizes and cleaves the sequence GTGCAC.¡€0€ª€0€ €CDD¡€ €_n¢€0€0€ €‚”pfam09500, YiiD_C, Putative thioesterase (yiiD_Cterm). This entry consists of a broadly distributed uncharacterized domain often found as a standalone protein. The member from Shewanella oneidensis is described from crystallography work as a putative thioesterase because it belongs to the HotDog clan of enzymes. About half of the members of this family are fused to an Acetyltransf_1 domain pfam00583.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚pfam09501, Bac_small_YrzI, Probable sporulation protein (Bac_small_yrzI). Members of this family are very small proteins, about 47 residues each, in the genus Bacillus. Single members are found in Bacillus subtilis and Bacillus halodurans, while arrays of six members in tandem are found in Bacillus cereus and Bacillus anthracis. An EIxxE motif present in most members of this family resembles cleavage sites by the germination protease GPR in a number of small acid-soluble spore proteins (SASP). A role in sporulation is possible.¡€0€ª€0€ €CDD¡€ €Æ ¢€0€0€ €îpfam09502, HrpB4, Bacterial type III secretion protein (HrpB4). This entry represents proteins encoded by genes which are found in type III secretion operons in a narrow range of species including Xanthomonas, Burkholderia and Ralstonia.¡€0€ª€0€ €CDD¡€ €_q¢€0€0€ €pfam09504, RE_Bsp6I, Bsp6I restriction endonuclease. This family includes the Bsp6I (recognizes and cleaves GC^NGC) restriction endonucleases.¡€0€ª€0€ €CDD¡€ €Æ!¢€0€0€ €‚Hpfam09505, Dimeth_Pyl, Dimethylamine methyltransferase (Dimeth_PyL). This family consists of dimethylamine methyltransferases from the genus Methanosarcina. It is found in three nearly identical copies in each of Methanosarcina acetivorans, Methanosarcina barkeri, and Methanosarcina mazei. It is one of a suite of three non-homologous enzymes with a critical UAG-encoded pyrrolysine residue in these species (along with trimethylamine methyltransferase and monomethylamine methyltransferase). It demethylates dimethylamine, leaving monomethylamine, and methylates the prosthetic group of the small corrinoid protein MtbC. The methyl group is then transferred by methylcorrinoid:coenzyme M methyltransferase to coenzyme M. Note that the pyrrolysine residue is variously translated as K or X, or as a stop codon that truncates the sequence.¡€0€ª€0€ €CDD¡€ €Æ"¢€0€0€ €‚øpfam09506, Salt_tol_Pase, Glucosylglycerol-phosphate phosphatase (Salt_tol_Pase). Proteins in this family are glucosylglycerol-phosphate phosphatases, with the gene symbol stpA (Salt Tolerance Protein A). A motif characteristic of acid phosphatases is found, but otherwise this family shows little sequence similarity to other phosphatases. This enzyme acts on the glucosylglycerol phosphate, product of glucosylglycerol phosphate synthase and immediate precursor of the osmoprotectant glucosylglycerol.¡€0€ª€0€ €CDD¡€ €Æ#¢€0€0€ €‚úpfam09507, CDC27, DNA polymerase subunit Cdc27. This protein forms the C subunit of DNA polymerase delta. It carries the essential residues for binding to the Pol1 subunit of polymerase alpha, from residues 293-332, which are characterized by the motif D--G--VT, referred to as the DPIM motif. The first 160 residues of the protein form the minimal domain for binding to the B subunit, Cdc1, of polymerase delta, the final 10 C-terminal residues, 362-372, being the DNA sliding clamp, PCNA, binding motif.¡€0€ª€0€ €CDD¡€ €Æ$¢€0€0€ €‚,pfam09508, Lact_bio_phlase, Lacto-N-biose phosphorylase. The gene which codes for this protein in gut-bacteria is located in a novel putative operon for galactose metabolism. The protein appears to be a carbohydrate-processing phosphorolytic enzyme (EC:2.4.1.211), unlike either glycoside hydrolases or glycoside lyase. Intestinal colonisation by bifidobacteria is important for human health, especially in pediatrics, because colonisation seems to prevent infection by some pathogenic bacteria that cause diarrhoea or other illnesses. The operon seems to be involved in intestinal colonisation by bifidobacteria mediated by metabolism of mucin sugars. In addition, it may also resolve the question of the nature of the bifidus factor in human milk as the lacto-N-biose structure found in milk oligosaccharides.¡€0€ª€0€ €CDD¡€ €Æ%¢€0€0€ €‚hpfam09509, Hypoth_Ymh, Protein of unknown function (Hypoth_ymh). This entry consists of a relatively rare prokaryotic protein family (about 8 occurrences per 200 genomes). Genes for members of this family appear to be associated variously with phage and plasmid regions, restriction system loci, transposons, and housekeeping genes. Their function is unknown.¡€0€ª€0€ €CDD¡€ €Æ&¢€0€0€ €‚pfam09510, Rtt102p, Rtt102p-like transcription regulator protein. This protein is found in fungi. The family includes Rtt102p, a transcription regulator protein which appears to be integrally associated with both the Swi-Snf and the RSC chromatin remodelling complexes,.¡€0€ª€0€ €CDD¡€ €Æ'¢€0€0€ €‚pfam09511, RNA_lig_T4_1, RNA ligase. Members of this family include T4 phage proteins with ATP-dependent RNA ligase activity. Host defense to phage may include cleavage and inactivation of specific tRNA molecules; members of this family act to reverse this RNA damage. The enzyme is adenylated, transiently, on a Lys residue in a motif KXDGSL. This family also includes fungal tRNA ligases that have adenylyltransferase activity. tRNA ligases are enzymes required for the splicing of precursor tRNA molecules containing introns.i.¡€0€ª€0€ €CDD¡€ €Æ(¢€0€0€ €‚'pfam09512, ThiW, Thiamine-precursor transporter protein (ThiW). Levels of thiamine pyrophosphate (TPP) or thiamine regulate transcription or translation of a number of thiamine biosynthesis, salvage, or transport genes in a wide range of prokaryotes. The mechanism involves direct binding, with no protein involved, to a structural element called THI found in the untranslated upstream region of thiamine metabolism gene operons. This element is called a riboswitch and is seen also for other metabolites such as FMN and glycine. This protein family consists of proteins identified in operons controlled by the THI riboswitch and designated ThiW. The hydrophobic nature of this protein and reconstructed metabolic background suggests that this protein acts in transport of a thiazole precursor of thiamine.¡€0€ª€0€ €CDD¡€ €Æ)¢€0€0€ €‚µpfam09514, SSXRD, SSXRD motif. SSX1 can repress transcription, and this has been attributed to a putative Kruppel associated box (KRAB) repression domain at the N-terminus. However, from the analysis of these deletion constructs further repression activity was found at the C-terminus of SSX1. Which has been called the SSXRD (SSX Repression Domain). The potent repression exerted by full-length SSX1 appears to localize to this region.¡€0€ª€0€ €CDD¡€ €Æ*¢€0€0€ €‚µpfam09515, Thia_YuaJ, Thiamine transporter protein (Thia_YuaJ). Members of this protein family have been assigned as thiamine transporters by a phylogenetic analysis of families of genes regulated by the THI element, a broadly conserved RNA secondary structure element through which thiamine pyrophosphate (TPP) levels can regulate transcription of many genes related to thiamine transport, salvage, and de novo biosynthesis. Species with this protein always lack the ThiBPQ ABC transporter. In some species (e.g. Streptococcus mutans and Streptococcus pyogenes), yuaJ is the only THI-regulated gene. Evidence from Bacillus cereus indicates thiamine uptake is coupled to proton translocation.¡€0€ª€0€ €CDD¡€ €Æ+¢€0€0€ €pfam09516, RE_CfrBI, CfrBI restriction endonuclease. This family includes the CfrBI (recognizes and cleaves C^CWWGG) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €Æ,¢€0€0€ €–pfam09517, RE_Eco29kI, Eco29kI restriction endonuclease. This family includes the Eco29kI (recognizes and cleaves CCGC^GG ) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €Æ-¢€0€0€ €•pfam09518, RE_HindIII, HindIII restriction endonuclease. This family includes the HindIII (recognizes and cleaves A^AGCTT) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €_~¢€0€0€ €¦pfam09519, RE_HindVP, HindVP restriction endonuclease. This family includes the HindVP (recognizes GRCGYC bu the cleavage site is unknown) restriction endonucleases.¡€0€ª€0€ €CDD¡€ €Æ.¢€0€0€ €Õpfam09520, RE_TdeIII, Type II restriction endonuclease, TdeIII. This family includes many TdeIII restriction endonucleases that recognize and cleave at GGNCC sites. TdeIII cleave unmethylated double-stranded DNA.¡€0€ª€0€ €CDD¡€ €Æ/¢€0€0€ €pfam09521, RE_NgoPII, NgoPII restriction endonuclease. This family includes the NgoPII (recognizes and cleaves GG^CC) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €Æ0¢€0€0€ €8pfam09522, RE_R_Pab1, R.Pab1 restriction endonuclease. ¡€0€ª€0€ €CDD¡€ €_‚¢€0€0€ €‚=pfam09523, DUF2390, Protein of unknown function (DUF2390). Members of this family are bacterial hypothetical proteins, about 160 amino acids in length, found in various proteobacteria, including members of the genera Pseudomonas and Vibrio. The C-terminal region is poorly conserved and is not included in the model.¡€0€ª€0€ €CDD¡€ €Æ1¢€0€0€ €‚pfam09524, Phg_2220_C, Conserved phage C-terminus (Phg_2220_C). This entry represents the conserved C-terminal domain of a family of proteins found exclusively in bacteriophage and in bacterial prophage regions. The functions of this domain and the proteins containing it are unknown.¡€0€ª€0€ €CDD¡€ €Æ2¢€0€0€ €‚pfam09526, DUF2387, Probable metal-binding protein (DUF2387). Members of this family are small proteins, about 70 residues in length, with a basic triplet near the N-terminus and a probable metal-binding motif CPXCX(18)CXXC. Members are found in various proteobacteria.¡€0€ª€0€ €CDD¡€ €Æ3¢€0€0€ €‚ pfam09527, ATPase_gene1, Putative F0F1-ATPase subunit Ca2+/Mg2+ transporter. This model represents a protein found encoded in F1F0-ATPase operons in several genomes, including Methanosarcina barkeri (archaeal) and Chlorobium tepidum (bacterial). It is a small protein (about 100 amino acids) with long hydrophobic stretches and is presumed to be a subunit of the enzyme. It carries two transmembrane helices and is a magnesium or calcium uniporter. The atp operon of alkaliphilic Bacillus pseudofirmus OF4, as in most prokaryotes, contains the eight structural genes for the F-ATPase (ATP synthase), which are preceded by an atpI gene that encodes a membrane protein with 2 TMSs. A tenth gene, atpZ, has been found in this operon, which is upstream of and overlapping with atpI.¡€0€ª€0€ €CDD¡€ €Æ4¢€0€0€ €‚Zpfam09528, Ehrlichia_rpt, Ehrlichia tandem repeat (Ehrlichia_rpt). This entry represents 30 amino acid tandem repeat, found in a variable number of copies in an immunodominant outer membrane protein of Ehrlichia chaffeensis, a tick-borne obligate intracellular pathogen. These short tandem-repeats elicit a strong antibody response in the hosts.¡€0€ª€0€ €CDD¡€ €Æ5¢€0€0€ €‚Apfam09529, Intg_mem_TP0381, Integral membrane protein (intg_mem_TP0381). This entry represents a family of hydrophobic proteins with seven predicted transmembrane alpha helices. Members are found in Bacillus subtilis (ywaF), TP0381 from Treponema pallidum (TP0381), Streptococcus pyogenes, Rhodococcus erythropolis, etc.¡€0€ª€0€ €CDD¡€ €Æ6¢€0€0€ €‚hpfam09531, Ndc1_Nup, Nucleoporin protein Ndc1-Nup. Ndc1 is a nucleoporin protein that is a component of the Nuclear Pore Complex, and, in fungi, also of the Spindle Pole Body. It consists of six transmembrane segments, three lumenal loops, both concentrated at the N-terminus and cytoplasmic domains largely at the C-terminus, all of which are well conserved.¡€0€ª€0€ €CDD¡€ €Æ7¢€0€0€ €‚˜pfam09532, FDF, FDF domain. The FDF domain, so called because of the conserved FDF at its N termini, is an entirely alpha-helical domain with multiple exposed hydrophilic loops. It is found at the C terminus of Scd6p-like SM domains. It is also found with other divergent Sm domains and in proteins such as Dcp3p and FLJ21128, where it is found N terminal to the YjeF-N domain, a novel Rossmann fold domain.¡€0€ª€0€ €CDD¡€ €Æ8¢€0€0€ €‚pfam09533, DUF2380, Predicted lipoprotein of unknown function (DUF2380). This family consists of at least 9 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. One appears truncated toward the N-terminus; the others are predicted lipoproteins. The function is unknown.¡€0€ª€0€ €CDD¡€ €_Š¢€0€0€ €‚Hpfam09534, Trp_oprn_chp, Tryptophan-associated transmembrane protein (Trp_oprn_chp). Members of this family are predicted transmembrane proteins with four membrane-spanning helices. Members are found in the Actinobacteria (Mycobacterium, Corynebacterium, Streptomyces), always associated with genes for tryptophan biosynthesis.¡€0€ª€0€ €CDD¡€ €Æ9¢€0€0€ €‚Spfam09535, Gmx_para_CXXCG, Protein of unknown function (Gmx_para_CXXCG). This entry consists of at least 10 paralogous proteins from Myxococcus xanthus and that lack detectable sequence similarity to any other protein family. An imperfectly conserved CXXCG motif, a probable binding site, appears twice in the multiple sequence alignment.¡€0€ª€0€ €CDD¡€ €Æ:¢€0€0€ €‚pfam09536, DUF2378, Protein of unknown function (DUF2378). This family consists of a set of at least 17 paralogous proteins in Myxococcus xanthus DK 1622 and and 12 in Stigmatella aurantiaca DW4/3-1. Members are about 200 amino acids in length. The function is unknown.¡€0€ª€0€ €CDD¡€ €Æ;¢€0€0€ €ïpfam09537, DUF2383, Domain of unknown function (DUF2383). Members of this protein family are found mostly in the Proteobacteria, although one member is found in the the marine planctomycete Pirellula sp. strain 1. The function is unknown.¡€0€ª€0€ €CDD¡€ €Æ<¢€0€0€ €‚’pfam09538, FYDLN_acid, Protein of unknown function (FYDLN_acid). Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown.¡€0€ª€0€ €CDD¡€ €Æ=¢€0€0€ €‚npfam09539, DUF2385, Protein of unknown function (DUF2385). Members of this uncharacterized protein family are found in a number of alphaproteobacteria, including root nodule bacteria, Brucella suis, Caulobacter crescentus, and Rhodopseudomonas palustris. Conserved residues include two well-separated cysteines, suggesting a disulfide bond. The function is unknown.¡€0€ª€0€ €CDD¡€ €Æ>¢€0€0€ €àpfam09543, DUF2379, Protein of unknown function (DUF2379). This family consists of at least 7 paralogs in Myxococcus xanthus and 6 in Stigmatella aurantiaca, both members of the Deltaproteobacteria. The function is unknown.¡€0€ª€0€ €CDD¡€ €Æ?¢€0€0€ €¼pfam09544, DUF2381, Protein of unknown function (DUF2381). This family consists of at least 8 paralogs in Myxococcus xanthus, a member of the Deltaproteobacteria. The function is unknown.¡€0€ª€0€ €CDD¡€ €Æ@¢€0€0€ €Œpfam09545, RE_AccI, AccI restriction endonuclease. This family includes the AccI (recognizes and cleaves GT^MKAC) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €_“¢€0€0€ €‚pfam09546, Spore_III_AE, Stage III sporulation protein AE (spore_III_AE). This represents the stage III sporulation protein AE, which is encoded in a spore formation operon spoIIIAABCDEFGH under the control of sigma G. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species.¡€0€ª€0€ €CDD¡€ €ÆA¢€0€0€ €‚rpfam09547, Spore_IV_A, Stage IV sporulation protein A (spore_IV_A). SpoIVA is designated stage IV sporulation protein A. It acts in the mother cell compartment and plays a role in spore coat morphogenesis. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species.¡€0€ª€0€ €CDD¡€ €ÆB¢€0€0€ €‚ˆpfam09548, Spore_III_AB, Stage III sporulation protein AB (spore_III_AB). SpoIIIAB represents the stage III sporulation protein AB, which is encoded in a spore formation operon: spoIIIAABCDEFGH that is under sigma G regulation. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species.¡€0€ª€0€ €CDD¡€ €ÆC¢€0€0€ €›pfam09549, RE_Bpu10I, Bpu10I restriction endonuclease. This family includes the Bpu10I (recognizes and cleaves CCTNAGC (-5/-2)) restriction endonucleases.¡€0€ª€0€ €CDD¡€ €ÆD¢€0€0€ €Øpfam09550, Phage_TAC_6, Phage tail assembly chaperone protein, TAC. This is a family of phage tail assembly chaperone proteins largely derived from the Rhodobacter species viral agent GTA (gene transfer agent) gp10.¡€0€ª€0€ €CDD¡€ €ÆE¢€0€0€ €‚ pfam09551, Spore_II_R, Stage II sporulation protein R (spore_II_R). SpoIIR is designated stage II sporulation protein R. A comparative genome analysis of all sequenced genomes of Firmicutes shows that the proteins are strictly conserved among the sub-set of endospore-forming species. SpoIIR is a signalling protein that links the activation of sigma E to the transcriptional activity of sigma F during sporulation.¡€0€ª€0€ €CDD¡€ €ÆF¢€0€0€ €•pfam09552, RE_BstXI, BstXI restriction endonuclease. This family includes the BstXI (recognizes and cleaves CCANNNNN^NTGG) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €_š¢€0€0€ €¬pfam09553, RE_Eco47II, Eco47II restriction endonuclease. This family includes the Eco47II (which recognizes GGNCC, but the cleavage site unknown) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €ÆG¢€0€0€ €pfam09554, RE_HaeII, HaeII restriction endonuclease. This family includes the HaeII (recognizes and cleaves RGCGC^Y) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €_œ¢€0€0€ €pfam09556, RE_HaeIII, HaeIII restriction endonuclease. This family includes the HaeIII (recognizes and cleaves GG^CC) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €ÆH¢€0€0€ €‚Ëpfam09557, DUF2382, Domain of unknown function (DUF2382). This entry describes an uncharacterized domain, sometimes found in association with a PRC-barrel domain pfam05239 which is also found in rRNA processing protein RimM and in a photosynthetic reaction centre complex protein). This domain is found in proteins from Bacillus subtilis, Deinococcus radiodurans, Nostoc sp. PCC 7120, Myxococcus xanthus, and several other species. The function is not known.¡€0€ª€0€ €CDD¡€ €ÆI¢€0€0€ €‚¦pfam09558, DUF2375, Protein of unknown function (DUF2375). Two members of this family are found in Colwellia psychrerythraea (strain 34H / ATCC BAA-681) and one each in various other species of Colwellia and Shewanella. One member from C. psychrerythraea is of special interest because it is preceded by the same cis-regulatory site as a number of genes that have the PEP-CTERM domain described by PEP_anchor (IPR013424).¡€0€ª€0€ €CDD¡€ €_Ÿ¢€0€0€ €‚wpfam09559, Cas6, Cas6 Crispr. The Cas6 Crispr family of proteins averaging 140 residues are characterized by having a GhGxxxxxGhG motif, where h indicates a hydrophobic residue, at the C-terminus. The CRISPR-Cas system is possibly a mechanism of defense against invading pathogens and plasmids that functions analogously to the RNA interference (RNAi) systems in eukaryotes.¡€0€ª€0€ €CDD¡€ €ÆJ¢€0€0€ €‚|pfam09560, Spore_YunB, Sporulation protein YunB (Spo_YunB). Spo_YunB is the sporulation protein YunB. In Bacillus subtilis its expression is controlled by sigmaE.The gene YunB seems to code for a protein involved, at least indirectly, in the pathway leading to the activation of sigmaK. Inactivation of YunB delays sigmaK activation and results in reduced sporulation efficiency.¡€0€ª€0€ €CDD¡€ €ÆK¢€0€0€ €pfam09561, RE_HpaII, HpaII restriction endonuclease. This family includes the HpaII (recognizes and cleaves C^CGG) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €ÆL¢€0€0€ €Žpfam09562, RE_LlaMI, LlaMI restriction endonuclease. This family includes the LlaMI (recognizes and cleaves CC^NGG) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €ÆM¢€0€0€ €‚pfam09563, RE_LlaJI, LlaJI restriction endonuclease. This family includes the LlaJI (recognizes GACGC) restriction endonucleases.¡€0€ª€0€ €CDD¡€ €ÆN¢€0€0€ €Ÿpfam09564, RE_NgoBV, NgoBV restriction endonuclease. This family includes the NgoBV (recognizes GGNNCC but cleavage site is unknown) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €_¤¢€0€0€ €¡pfam09565, RE_NgoFVII, NgoFVII restriction endonuclease. This family includes the NgoFVII (recognizes GCSGC but cleavage site unknown) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €ÆO¢€0€0€ €Œpfam09566, RE_SacI, SacI restriction endonuclease. This family includes the SacI (recognizes and cleaves GAGCT^C) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €ÆP¢€0€0€ €pfam09567, RE_MamI, MamI restriction endonuclease. This family includes the MamI (recognizes and cleaves GATNN^NNATC) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €ÆQ¢€0€0€ €—pfam09568, RE_MjaI, MjaI restriction endonuclease. This family includes the MjaI (recognizes CTAG but cleavage site unknown) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €ÆR¢€0€0€ €Œpfam09569, RE_ScaI, ScaI restriction endonuclease. This family includes the ScaI (recognizes and cleaves AGT^ACT) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €ÆS¢€0€0€ €‹pfam09570, RE_SinI, SinI restriction endonuclease. This family includes the SinI (recognizes and cleaves G^GWCC) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €ÆT¢€0€0€ €pfam09571, RE_XcyI, XcyI restriction endonuclease. This family includes the XcyI (recognizes and cleaves C^CCGGG) restriction endonucleases.¡€0€ª€0€ €CDD¡€ €_§¢€0€0€ €™pfam09572, RE_XamI, XamI restriction endonuclease. This family includes the XamI (recognizes GTCGAC but cleavage site unknown) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €ÆU¢€0€0€ €Špfam09573, RE_TaqI, TaqI restriction endonuclease. This family includes the TaqI (recognizes and cleaves T^CGA) restriction endonuclease.¡€0€ª€0€ €CDD¡€ €_©¢€0€0€ €‚€pfam09574, DUF2374, Protein of unknown function (Duf2374). This very small protein (about 46 amino acids) consists largely of a single predicted membrane-spanning region. It is found in Photobacterium profundum SS9 and in three species of Vibrio, always near periplasmic nitrate reductase genes, but far from the periplasmic nitrate reductase genes in Aeromonas hydrophila ATCC 7966.¡€0€ª€0€ €CDD¡€ €_ª¢€0€0€ €‚£pfam09575, Spore_SspJ, Small spore protein J (Spore_SspJ). Spore_SspJ represents a group of small acid-soluble proteins (SASP) from Bacillus sp., which are present in spores but not in growing cells. The sspJ gene is transcribed in the forespore compartment by RNA polymerase with the forespore-specific sigmaG. Loss of SspJ causes a slight decrease in the rate of spore outgrowth in an otherwise wild-type background.¡€0€ª€0€ €CDD¡€ €_«¢€0€0€ €‚Ápfam09577, Spore_YpjB, Sporulation protein YpjB (SpoYpjB). These proteins are found in the endospore-forming bacteria which include Bacillus species. In Bacillus subtilis, ypjB was found to be part of the sigma-E regulon. Sigma-E is a sporulation sigma factor that regulates expression in the mother cell compartment. Null mutants of ypjB show a sporulation defect, but this gene is not, however, a part of the endospore formation minimal gene set.¡€0€ª€0€ €CDD¡€ €ÆV¢€0€0€ €‚Üpfam09578, Spore_YabQ, Spore cortex protein YabQ (Spore_YabQ). This protein is predicted to span the membrane several times. It is only found in genomes of species that perform sporulation, such as Bacillus subtilis, Clostridium tetani, and other members of the Firmicutes (low-GC Gram-positive bacteria). Mutation of this sigmaE-dependent gene blocks development of the spore cortex. The length of the C-terminal region, which includes some hydrophobic regions, is variable.¡€0€ª€0€ €CDD¡€ €ÆW¢€0€0€ €‚'pfam09579, Spore_YtfJ, Sporulation protein YtfJ (Spore_YtfJ). Proteins in this family are encoded by bacterial genomes if, and only if, the species is capable of endospore formation. YtfJ was confirmed in spores of B. subtilis; it appears to be expressed in the forespore under control of SigF.¡€0€ª€0€ €CDD¡€ €ÆX¢€0€0€ €‚¬pfam09580, Spore_YhcN_YlaJ, Sporulation lipoprotein YhcN/YlaJ (Spore_YhcN_YlaJ). This entry contains YhcN and YlaJ, which are predicted lipoproteins that have been detected as spore proteins but not vegetative proteins in Bacillus subtilis. Both appear to be expressed under control of the RNA polymerase sigma-G factor. The YlaJ-like members of this family have a low-complexity, strongly acidic, 40-residue C-terminal domain.¡€0€ª€0€ €CDD¡€ €ÆY¢€0€0€ €‚dpfam09581, Spore_III_AF, Stage III sporulation protein AF (Spore_III_AF). This family represents the stage III sporulation protein AF (Spore_III_AF) of the bacterial endospore formation program, which exists in some but not all members of the Firmicutes (formerly called low-GC Gram-positives). The C-terminal region of these proteins is poorly conserved.¡€0€ª€0€ €CDD¡€ €ÆZ¢€0€0€ €‚ˆpfam09582, AnfO_nitrog, Iron only nitrogenase protein AnfO (AnfO_nitrog). Proteins in this entry include Anf1 from Rhodobacter capsulatus (Rhodopseudomonas capsulata) and AnfO from Azotobacter vinelandii. They are found exclusively in species which contain the iron-only nitrogenase, and are encoded immediately downstream of the structural genes for the nitrogenase enzyme in these species.¡€0€ª€0€ €CDD¡€ €Æ[¢€0€0€ €‚½pfam09583, Phageshock_PspG, Phage shock protein G (Phageshock_PspG). This protein was previously designated as YjbO in Escherichia coli. It is found only in genomes that have the phage shock operon (psp), but it is only rarely encoded near other psp genes. The psp regulon is upregulated in response to a number of stress conditions, including ethanol, expression of the filamentous phage secretin protein IV and other secretins and heat shock.¡€0€ª€0€ €CDD¡€ €Æ\¢€0€0€ €‚¨pfam09584, Phageshock_PspD, Phage shock protein PspD (Phageshock_PspD). Members of this family are phage shock protein PspD, found in a minority of bacteria that carry the defining genes of the phage shock regulon (pspA, pspB, pspC, and pspF). It is found in Escherichia coli, Yersinia pestis, and closely related species, where it is part of the phage shock operon. It is known to be expressed but its function is unknown.¡€0€ª€0€ €CDD¡€ €Æ]¢€0€0€ €‚zpfam09585, Lin0512_fam, Conserved hypothetical protein (Lin0512_fam). This family consists of few members, broadly distributed. It occurs so far in several Firmicutes (twice in Oceanobacillus), one Cyanobacterium, one alpha Proteobacterium, and (with a long prefix) in plants. The function is unknown. The alignment includes a well conserved motif GxGxDxHG near the N-terminus.¡€0€ª€0€ €CDD¡€ €Æ^¢€0€0€ €‚;pfam09586, YfhO, Bacterial membrane protein YfhO. This protein is a conserved membrane protein. The yfhO gene is transcribed in Difco sporulation medium and the transcription is affected by the YvrGHb two-component system. Some members of this family have been annotated as glycosyl transferases of the PMT family.¡€0€ª€0€ €CDD¡€ €Æ_¢€0€0€ €‚Lpfam09587, PGA_cap, Bacterial capsule synthesis protein PGA_cap. This protein is a putative poly-gamma-glutamate capsule biosynthesis protein found in bacteria. Poly-gamma-glutamate is a natural polymer that may be involved in virulence and may help bacteria survive in high salt concentrations. It is a surface-associated protein.¡€0€ª€0€ €CDD¡€ €Æ`¢€0€0€ €‚­pfam09588, YqaJ, YqaJ-like viral recombinase domain. This protein family is found in many different bacterial species but is of viral origin. The protein forms an oligomer and functions as a processive alkaline exonuclease that digests linear double-stranded DNA in a Mg(2+)-dependent reaction, It has a preference for 5'-phosphorylated DNA ends. It thus forms part of the two-component SynExo viral recombinase functional unit.¡€0€ª€0€ €CDD¡€ €Æa¢€0€0€ €‚—pfam09589, HrpA_pilin, HrpA pilus formation protein. HrpA is an essential component of the type III secretion system (TTSS) which pathogens use to inject virulence factors directly into their host cells, and to cause disease. The TTSS has an Hrp pilus appendage for channelling effector proteins through the plant cell wall and this pilus elongates by the addition of HrpA pilin subunits at the distal end.¡€0€ª€0€ €CDD¡€ €Æb¢€0€0€ €Ûpfam09590, Env-gp36, Lentivirus surface glycoprotein. This protein is found in feline immunodeficiency retrovirus. It represents the surface glycoprotein which is found in the polyprotein C-terminal to the Env protein.¡€0€ª€0€ €CDD¡€ €K¢€0€0€ €’pfam09591, DUF2463, Protein of unknown function (DUF2463). This protein is found in eukaryotic, parasitic microsporidia. Its function is unknown.¡€0€ª€0€ €CDD¡€ €_¹¢€0€0€ €ªpfam09592, DUF2031, Protein of unknown function (DUF2031). This protein is expressed in Plasmodium; its function is unknown. It may be the product of gene family pyst-b.¡€0€ª€0€ €CDD¡€ €_º¢€0€0€ €‚Dpfam09593, Pathogen_betaC1, Beta-satellite pathogenicity beta C1 protein. Cotton leaf-curl disease - CLCuD - is of major economic importance in cotton-growing areas of the far-east. The infectious agent appears to be a single-stranded DNA molecule of approx 1350 nucleotides in length, which, when inoculated with the Begomovirus into cotton, induces symptoms typical of CLCuD. This molecule requires the Begomovirus for replication and encapsidation. DNA beta encodes a single protein, betaC1. The intracellular distribution of betaC1 is consistent with the hypothesis that it has a role in transporting the DNA A of Begomovirus from the nuclear site of replication to the plasmodesmatal exit sites of the infected cell. The DNA beta-encoded protein, betaC1, is the determinant of both pathogenicity and suppression of gene silencing.¡€0€ª€0€ €CDD¡€ €K!¢€0€0€ €¯pfam09594, DUF2029, Protein of unknown function (DUF2029). This is a putative transmembrane protein from bacteria. It is likely to be conserved between Mycobacterium species.¡€0€ª€0€ €CDD¡€ €Æc¢€0€0€ €Ápfam09595, Metaviral_G, Metaviral_G glycoprotein. This is a viral attachment glycoprotein from region G of metaviruses. It is high in serine and threonine suggesting it is highly glycosylated.¡€0€ª€0€ €CDD¡€ €Æd¢€0€0€ €‚0pfam09596, MamL-1, MamL-1 domain. The MamL-1 domain is a polypeptide of up to 70 residues, numbers 15-67 of which adopt an elongated kinked helix that wraps around ANK and CSL forming one of the complexes in the build-up of the Notch transcriptional complex for recruiting general transcription factors.¡€0€ª€0€ €CDD¡€ €Æe¢€0€0€ €ˆpfam09597, IGR, IGR protein motif. This domain is found in fungal proteins and contains a conserved IGR motif. Its function is unknown.¡€0€ª€0€ €CDD¡€ €Æf¢€0€0€ €‚zpfam09598, Stm1_N, Stm1. This region is found at the N terminal of the Stm1 protein. Stm1 is a G4 quadraplex and purine motif triplex nucleic acid-binding protein. It has been implicated in many biological processes including apoptosis and telomere biosynthesis. Stm1 is known to interact with CDC13, and is known to associate with ribosomes and nuclear telomere cap complexes.¡€0€ª€0€ €CDD¡€ €Æg¢€0€0€ €‚ pfam09599, IpaC_SipC, Salmonella-Shigella invasin protein C (IpaC_SipC). This entry represents a family of proteins associated with bacterial type III secretion systems, which are injection machines for virulence factors into host cell cytoplasm. Characterized members of this protein family are known to be secreted and are described as invasins, including IpaC from Shigella flexneri and SipC from Salmonella typhimurium. Members may be referred to as invasins, pathogenicity island effectors, and cell invasion proteins.¡€0€ª€0€ €CDD¡€ €_¿¢€0€0€ €‚]pfam09600, Cyd_oper_YbgE, Cyd operon protein YbgE (Cyd_oper_YbgE). This entry describes a small protein of unknown function, about 100 amino acids in length, essentially always found in an operon with CydAB, subunits of the cytochrome d terminal oxidase. It appears to be an integral membrane protein. It is found so far only in the Proteobacteria.¡€0€ª€0€ €CDD¡€ €Æh¢€0€0€ €‚pfam09601, DUF2459, Protein of unknown function (DUF2459). This conserved hypothetical protein of unknown function is found in several Proteobacteria. Its function is unknown and its genome context is not well-conserved. It is found amid urease genes in at least one species.¡€0€ª€0€ €CDD¡€ €Æi¢€0€0€ €‚+pfam09602, PhaP_Bmeg, Polyhydroxyalkanoic acid inclusion protein (PhaP_Bmeg). This entry describes a protein found in polyhydroxyalkanoic acid (PHA) gene regions and incorporated into PHA inclusions in Bacillus cereus and Bacillus megaterium. The role of the protein may include amino acid storage.¡€0€ª€0€ €CDD¡€ €_¢€0€0€ €‚Fpfam09603, Fib_succ_major, Fibrobacter succinogenes major domain (Fib_succ_major). This domain of about 175 to 200 amino acids is found, in from one to five copies, in over 50 proteins in Fibrobacter succinogenes S85, an obligate anaerobe of the rumen. Many members of this family have an apparent lipoprotein signal sequence. Conserved cysteine residues, suggestive of disulfide bond formation, are also consistent with an extracytoplasmic location for this domain. This domain can also be found in small numbers of proteins in Chlorobium tepidum and Bacteroides thetaiotaomicron.¡€0€ª€0€ €CDD¡€ €Æj¢€0€0€ €‚bpfam09604, Potass_KdpF, F subunit of K+-transporting ATPase (Potass_KdpF). This entry describes a very small integral membrane peptide KdpF, a subunit of the K(+)-translocating Kdp complex. It is found upstream of the KdpA subunit (IPR004623). Because of its very small size and highly hydrophobic character, it is sometimes missed in genome annotation.¡€0€ª€0€ €CDD¡€ €Æk¢€0€0€ €‚†pfam09605, Trep_Strep, Hypothetical bacterial integral membrane protein (Trep_Strep). This family consists of strongly hydrophobic proteins about 190 amino acids in length with a strongly basic motif near the C-terminus. It is found in rather few species, but in paralogous families of 12 members in the oral pathogenic spirochaete Treponema denticola and 2 in Streptococcus pneumoniae R6.¡€0€ª€0€ €CDD¡€ €Æl¢€0€0€ €‚pfam09606, Med15, ARC105 or Med15 subunit of Mediator complex non-fungal. The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.¡€0€ª€0€ €CDD¡€ €Æm¢€0€0€ €‚œpfam09607, BrkDBD, Brinker DNA-binding domain. This DNA-binding domain is the first approx. 100 residues of the N-terminal end of Brinker. The structure of this domain in complex with DNA consists of four alpha-helices that contain a helix-turn-helix DNA recognition motif specific for GC-rich DNA. The Brinker nuclear repressor is a major element of the Drosophila Decapentaplegic morphogen signalling pathway.¡€0€ª€0€ €CDD¡€ €_Æ¢€0€0€ €õpfam09608, Alph_Pro_TM, Putative transmembrane protein (Alph_Pro_TM). This family consists of predicted transmembrane proteins of about 270 amino acids. Members are found, so far, only among the Alphaproteobacteria and only once in each genome.¡€0€ª€0€ €CDD¡€ €Æn¢€0€0€ €‚;pfam09609, Cas_GSU0054, CRISPR-associated protein, GSU0054 family (Cas_GSU0054). This entry represents a rare CRISPR-associated protein. So far, members are found in Geobacter sulfurreducens and in two unpublished genomes: Gemmata obscuriglobus and Actinomyces naeslundii. CRISPR-associated proteins typically are found near CRISPR repeats and other CRISPR-associated proteins, have low levels of sequence identify, have sequence relationships that suggest lateral transfer, and show some sequence similarity to DNA-active proteins such as helicases and repair proteins.¡€0€ª€0€ €CDD¡€ €Æo¢€0€0€ €‚pfam09610, Myco_arth_vir_N, Mycoplasma virulence signal region (Myco_arth_vir_N). This entry represents the N-terminal region of a family of large, virulence-associated proteins in Mycoplasma arthritidis and smaller proteins in Mycoplasma capricolum. It includes a probable signal sequence or signal anchor, which, in most instances, has four consecutive Lys residues before the hydrophobic stretch.¡€0€ª€0€ €CDD¡€ €Æp¢€0€0€ €‚Æpfam09611, Cas_Csy1, CRISPR-associated protein (Cas_Csy1). CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This entry, typified by YPO2465 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy1, for CRISPR/Cas Subtype Ypest protein 1.¡€0€ª€0€ €CDD¡€ €Æq¢€0€0€ €‚Æpfam09612, HtrL_YibB, Bacterial protein of unknown function (HtrL_YibB). The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. homologs are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein.¡€0€ª€0€ €CDD¡€ €Ær¢€0€0€ €ïpfam09613, HrpB1_HrpK, Bacterial type III secretion protein (HrpB1_HrpK). This family of proteins is encoded by genes found within type III secretion operons in a limited range of species including Xanthomonas, Ralstonia and Burkholderia.¡€0€ª€0€ €CDD¡€ €Æs¢€0€0€ €‚Æpfam09614, Cas_Csy2, CRISPR-associated protein (Cas_Csy2). CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This entry, typified by YPO2464 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy2, for CRISPR/Cas Subtype Ypest protein 2.¡€0€ª€0€ €CDD¡€ €Æt¢€0€0€ €‚Æpfam09615, Cas_Csy3, CRISPR-associated protein (Cas_Csy3). CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This entry, typified by YPO2463 of Yersinia pestis, is a CRISPR-associated (Cas) entry strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy3, for CRISPR/Cas Subtype Ypest protein 3.¡€0€ª€0€ €CDD¡€ €Æu¢€0€0€ €‚epfam09617, Cas_GSU0053, CRISPR-associated protein GSU0053 (Cas_GSU0053). This entry is found in CRISPR-associated (cas) proteins in the genomes of Geobacter sulfurreducens PCA and Desulfotalea psychrophila LSv54 (both Desulfobacterales from the Deltaproteobacteria), Gemmata obscuriglobus (a Planctomycete), and Actinomyces naeslundii MG1 (Actinobacteria).¡€0€ª€0€ €CDD¡€ €Æv¢€0€0€ €‚Ðpfam09618, Cas_Csy4, CRISPR-associated protein (Cas_Csy4). CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a widespread family of prokaryotic direct repeats with spacers of unique sequence between consecutive repeats. This protein family, typified by YPO2462 of Yersinia pestis, is a CRISPR-associated (Cas) family strictly associated with the Ypest subtype of CRISPR/Cas locus. It is designated Csy4, for CRISPR/Cas Subtype Ypest protein 4.¡€0€ª€0€ €CDD¡€ €Æw¢€0€0€ €‚upfam09619, YscW, Type III secretion system lipoprotein chaperone (YscW). This entry is encoded within type III secretion operons. The protein has been characterized as a chaperone for the outer membrane pore component YscC. YscW is a lipoprotein which is itself localized to the outer membrane and, it is believed, facilitates the oligomerization and localization of YscC.¡€0€ª€0€ €CDD¡€ €Æx¢€0€0€ €‚¶pfam09620, Cas_csx3, CRISPR-associated protein (Cas_csx3). This entry is encoded in CRISPR-associated (cas) gene clusters, near CRISPR repeats, in the genomes of several different thermophiles: Archaeoglobus fulgidus (archaeal), Aquifex aeolicus (Aquificae), Dictyoglomus thermophilum (Dictyoglomi), and a thermophilic Synechococcus (Cyanobacteria). It is not yet assigned to a specific CRISPR/cas subtype (hence the x designation csx3).¡€0€ª€0€ €CDD¡€ €Æy¢€0€0€ €Ýpfam09621, LcrR, Type III secretion system regulator (LcrR). This family of proteins are encoded within type III secretion operons and have been characterized in Yersinia as a regulator of the Low-Calcium Response (LCR).¡€0€ª€0€ €CDD¡€ €_Ò¢€0€0€ €‚%pfam09622, DUF2391, Putative integral membrane protein (DUF2391). This entry is found in Nostoc sp. PCC 7120, Agrobacterium tumefaciens, Rhizobium meliloti, and Gloeobacter violaceus in a conserved two-gene neighborhood. Proteins containing this entry appear to span the membrane seven times.¡€0€ª€0€ €CDD¡€ €Æz¢€0€0€ €‚(pfam09623, Cas_NE0113, CRISPR-associated protein NE0113 (Cas_NE0113). Members of this minor CRISPR-associated (Cas) protein family are encoded in cas gene clusters in Vibrio vulnificus YJ016, Nitrosomonas europaea ATCC 19718, Mannheimia succiniciproducens MBEL55E, and Verrucomicrobium spinosum.¡€0€ª€0€ €CDD¡€ €Æ{¢€0€0€ €‚epfam09624, DUF2393, Protein of unknown function (DUF2393). The function of this protein is unknown. It is always found as part of a two-gene operon with IPR013416, a protein that appears to span the membrane seven times. It has so far been found in the bacteria Nostoc sp. PCC 7120, Agrobacterium tumefaciens, Rhizobium meliloti, and Gloeobacter violaceus.¡€0€ª€0€ €CDD¡€ €Æ|¢€0€0€ €‚épfam09625, VP9, VP9 protein. VP9 is a protein containing a ferredoxin fold. Two dimers come together to form one asymmetric unit which possesses a DNA recognition fold and specific metal binding sites possibly for zinc. It is postulated that being a non-structural protein VP9 is involved in the transcriptional regulation of the White spot syndrome virus, WSSV, from which it comes. WSSV is the major viral pathogen in shrimp aquaculture. VP9 is found N-terminal to the pfam07056 domain.¡€0€ª€0€ €CDD¡€ €Æ}¢€0€0€ €‚²pfam09626, DHC, Dihaem cytochrome c. Dihaem cytochrome c (DHC) is a soluble c-type cytochrome that folds into two distinct domains, each binding a single haem group and connected by a small linker region. Despite little sequence similarity, the N-terminal domain (residues 12-75) is a class I type cytochrome c, that binds one of the haems, but the domain surrounding the other haem is structurally unique. DHC binds electrostatically to an oxygen-binding protein, sphaeroides haem protein (SHP), as a component of a conserved electron transfer pathway. DHC acts as the physiological electron donor for SHP during phototrophic growth. In certain species DHC is found upstream of pfam01292.¡€0€ª€0€ €CDD¡€ €Æ~¢€0€0€ €‚8pfam09627, PrgU, PrgU-like protein. This hypothetical protein of 125 residues is expressed in bacteria but is thought to be plasmid in origin. It forms a six beta-strand barrel with three accompanying alpha helices and is probably a homo-dimer in the cell. It may be involved in pheromone-inducible conjugation.¡€0€ª€0€ €CDD¡€ €_Ø¢€0€0€ €Ëpfam09628, YvfG, YvfG protein. Yvfg is a hypothetical protein of 71 residues expressed in some bacteria. The monomer consists of two parallel alpha helices, and the protein crystallizes as a homo-dimer.¡€0€ª€0€ €CDD¡€ €å墀0€0€ €ñpfam09629, YorP, YorP protein. YorP is a 71 residue protein found in bacteria. As it is also found in a bacteriophage it might be of viral origin. The structure is of an alpha helix between two of five beta strands. The function is unknown.¡€0€ª€0€ €CDD¡€ €_Ù¢€0€0€ €ñpfam09630, DUF2024, Domain of unknown function (DUF2024). This protein of 86 residues is expressed in bacteria. It consists of four alpha helices and two beta strands. Its function is unknown. One UniProt entry gives the gene name as Traf5.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚ípfam09631, Sen15, Sen15 protein. The Sen15 subunit of the tRNA intron-splicing endonuclease is one of the two structural subunits of this hetero-tetrameric enzyme. Residues 36-157 of this subunit possess a novel homodimeric fold. Each monomer consists of three alpha-helices and a mixed antiparallel/parallel beta-sheet. Two monomers of Sen15 fold with two monomers of Sen34, one of the two catalytic subunits, to form an alpha2-beta2 tetramer as part of the functional endonuclease assembly.¡€0€ª€0€ €CDD¡€ €Æ€¢€0€0€ €‚)pfam09632, Rac1, Rac1-binding domain. The Rac1-binding domain is the C-terminal portion of YpkA from Yersinia. It is an all-helical molecule consisting of two distinct subdomains connected by a linker. the N-terminal end, residues 434-615, consists of six helices organised into two three-helix bundles packed against each other. This region is involved with binding to GTPases. The C-terminal end, residues 705-732. is a novel and elongated fold consisting of four helices clustered into two pairs, and this fold carries the helix implicated in actin activation. Rac1-binding domain mimics host guanidine nucleotide dissociation inhibitors (GDIs) of the Rho GTPases, thereby inhibiting nucleotide exchange in Rac1 and causing cytoskeletal disruption in the host. It is usually found downstream of pfam00069.¡€0€ª€0€ €CDD¡€ €_Ü¢€0€0€ €Öpfam09633, DUF2023, Protein of unknown function (DUF2023). This protein of approx.120 residues consists of three beta strands and five alpha helices, thought to fold into a homo-dimer. It is expressed in bacteria.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚pfam09634, DUF2025, Protein of unknown function (DUF2025). This protein is produced from gene PA1123 in Pseudomonas. It contains three alpha helices and six beta strands and is thought to be monomeric. It appears to be present in the biofilm layer and may be a lipoprotein.¡€0€ª€0€ €CDD¡€ €Æ‚¢€0€0€ €‚ðpfam09635, MetRS-N, MetRS-N binding domain. The MetRS-N domain binds an Arc1-P domain in a tetrameric complex resembling a classical GST homo-dimer. Domain-swapping between symmetrically related MetRS-N and Arc1p-N domains generates a 2:2 tetramer held together by van der Waals forces. This domain is necessary for formation of the aminoacyl-tRNA synthetase complex necessary for tRNA nuclear export and shuttling as part of the translational apparatus. The domain is associated with pfam09334.¡€0€ª€0€ €CDD¡€ €å袀0€0€ €åpfam09636, XkdW, XkdW protein. This protein of approx. 100 residues contains two alpha helices and two beta strands and is probably monomeric. It is expressed in bacteria but is probably viral in origin. Its function is unknown.¡€0€ª€0€ €CDD¡€ €å颀0€0€ €‚(pfam09637, Med18, Med18 protein. Med18 is one subunit of Mediator, a head-module multiprotein complex, that stimulates basal RNA polymerase II (Pol II) transcription. Med18 consists of an eight-stranded beta-barrel with a central pore and three flanking helices. It complexes with Med8 and Med20 proteins by forming a heterodimer of two-fold symmetry with Med20 and binding the C-terminal alpha-helix region of Med8 across the top of its barrel. This complex creates a multipartite TBP-binding site that can be modulated by transcriptional activators.¡€0€ª€0€ €CDD¡€ €ƃ¢€0€0€ €Æpfam09638, Ph1570, Ph1570 protein. This is a hypothetical protein from Pyroccous horikoshii of unknown function. It contains six alpha helices and eight beta strands and is thought to be monomeric.¡€0€ª€0€ €CDD¡€ €_ࢀ0€0€ €‚pfam09639, YjcQ, YjcQ protein. YjcQ is a protein of approx. 100 residues containing four alpha helices and three beta strands. It is expressed in bacteria and also in viruses. It appears to be under the regulation of SigD RNA polymerase which is responsible for the expression of many genes encoding cell-surface proteins related to flagellar assembly, motility, chemotaxis and autolysis in the late exponential growth phase. The exact function of YjcQ is unknown. However, it is thought to be a prophage head protein in viruses.¡€0€ª€0€ €CDD¡€ €_ᢀ0€0€ €ºpfam09640, DUF2027, Domain of unknown function (DUF2027). This protein domain is of unknown function. though putatively involved in DNA mismatch repair. It is associated with pfam01713.¡€0€ª€0€ €CDD¡€ €Æ„¢€0€0€ €‚Apfam09641, DUF2026, Protein of unknown function (DUF2026). This protein of approx. 100 residues is found in bacteria. It contains up to five alpha helices and up to seven beta strands and is probably monomeric. Its function is unknown. It is cited as a major prophage head protein, so might generally be of viral origin.¡€0€ª€0€ €CDD¡€ €_㢀0€0€ €‚pfam09642, YonK, YonK protein. YonK protein is expressed by the bacterial prophage SPbetaC. It is a 63 residue protein that associates into a homo-octamer in the form of a beta-stranded barrel with four outer helical features at points of the compass. Its function is unknown.¡€0€ª€0€ €CDD¡€ €_䢀0€0€ €™pfam09643, YopX, YopX protein. YopX is a protein that is largely helical, with three identical chains probably complexing into a twelve-chain structure.¡€0€ª€0€ €CDD¡€ €Æ…¢€0€0€ €‚pfam09644, Mg296, Mg296 protein. This protein of 129 residues is expressed in bacteria. It consists of three identical chains of five alpha helices. Two copies of each chain associate into a complex of six units of possible biological significance but of unknown function.¡€0€ª€0€ €CDD¡€ €_梀0€0€ €špfam09645, F-112, F-112 protein. F-112 protein is of 70-110 residues and is found in viruses. Its winged-helix structure suggests a DNA-binding function.¡€0€ª€0€ €CDD¡€ €_碀0€0€ €‚dpfam09646, Gp37, Gp37 protein. This protein of 154 residues consists of a unit of helices and beta sheets that crystallizes into a beautiful asymmetrical dodecameric barrel-structure, of two six-membered rings one on top of the other. It is expressed in bacteria but is of viral origin as it is found in phage BcepMu and is probably a pathogenesis factor.¡€0€ª€0€ €CDD¡€ €Ɔ¢€0€0€ €‚cpfam09648, YycI, YycH protein. This domain is exclusively found in YycI proteins in the low GC content Gram positive species. These two domains share the same structural fold with domains two and three of YycH pfam07435. Both, YycH and YycI are always found in pair on the chromosome, downstream of the essential histidine kinase YycG. Additionally, both proteins share a function in regulating the YycG kinase with which they appear to form a ternary complex. Lastly, the two proteins always contain an N-terminal transmembrane helix and are localized to the periplasmic space as shown by PhoA fusion studies.¡€0€ª€0€ €CDD¡€ €Ƈ¢€0€0€ €‚²pfam09649, CHZ, Histone chaperone domain CHZ. This domain is highly conserved from yeasts to humans and is part of the chaperone protein HIRIP3 in vertebrates which interacts with the H3.3 chaperone HIRA, implicated in histone replacement during transcription. N- and C- termini of Chz family members are relatively divergent but do contain similar acidic stretches rich in Glu/Asp residues, characteristic of all histone chaperones.¡€0€ª€0€ €CDD¡€ €ƈ¢€0€0€ €‚žpfam09650, PHA_gran_rgn, Putative polyhydroxyalkanoic acid system protein (PHA_gran_rgn). Proteins in this entry are encoded by genes involved in either polyhydroxyalkanoic acid (PHA) biosynthesis or utilisation, including proteins found at the surface of PHA granules. These proteins have so far been found in the Pseudomonadales, Xanthomonadales, and Vibrionales, all of which belong to the Gammaproteobacteria.¡€0€ª€0€ €CDD¡€ €Ɖ¢€0€0€ €‚spfam09651, Cas_APE2256, CRISPR-associated protein (Cas_APE2256). This entry represents a conserved region of about 150 amino acids found in at least five archaeal and three bacterial species. These species all contain CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats). In six of eight species, the protein is encoded the vicinity of a CRISPR/Cas locus.¡€0€ª€0€ €CDD¡€ €ÆŠ¢€0€0€ €‚„pfam09652, Cas_VVA1548, Putative CRISPR-associated protein (Cas_VVA1548). This entry represents a conserved region of about 95 amino acids found exclusively in species with CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats). In all bacterial species that contain this entry, the genes encoding the proteins are in the midst of a cluster of cas (CRISPR-associated) genes.¡€0€ª€0€ €CDD¡€ €_í¢€0€0€ €îpfam09654, DUF2396, Protein of unknown function (DUF2396). These conserved hypothetical proteins have so far been found only in the Cyanobacteria. They are about 170 amino acids long and contain a CxxCx(14)CxxH motif near the N-terminus.¡€0€ª€0€ €CDD¡€ €Æ‹¢€0€0€ €‚¢pfam09655, Nitr_red_assoc, Conserved nitrate reductase-associated protein (Nitr_red_assoc). Proteins in this entry are found in the Cyanobacteria, and are mostly encoded near nitrate reductase and molybdopterin biosynthesis genes. Molybdopterin guanine dinucleotide is a cofactor for nitrate reductase. These proteins are sometimes annotated as nitrate reductase-associated proteins, though their function is unknown.¡€0€ª€0€ €CDD¡€ €ÆŒ¢€0€0€ €úpfam09656, PGPGW, Putative transmembrane protein (PGPGW). Proteins in this entry are putative Actinobacterial proteins of about 150 amino acids in length, with three predicted transmembrane helices and an unusual motif with consensus sequence PGPGW.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚\pfam09657, Cas_Csx8, CRISPR-associated protein Csx8 (Cas_Csx8). Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry describes proteins of unknown function which are encoded in the midst of a cas gene operon.¡€0€ª€0€ €CDD¡€ €_ñ¢€0€0€ €‚4pfam09658, Cas_Csx9, CRISPR-associated protein (Cas_Csx9). Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry describes archaeal proteins encoded in cas gene regions.¡€0€ª€0€ €CDD¡€ €_ò¢€0€0€ €‚ðpfam09659, Cas_Csm6, CRISPR-associated protein (Cas_Csm6). Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins.¡€0€ª€0€ €CDD¡€ €ÆŽ¢€0€0€ €‚¤pfam09660, DUF2397, Protein of unknown function (DUF2397). Proteins in this entry are encoded within a conserved gene four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria including: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria).¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚¤pfam09661, DUF2398, Protein of unknown function (DUF2398). Proteins in this entry are encoded within a conserved gene four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria including: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Betaproteobacteria).¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚µpfam09662, Phenyl_P_gamma, Phenylphosphate carboxylase gamma subunit (Phenyl_P_gamma). Members of this protein family are the gamma subunit of phenylphosphate carboxylase. Phenol (methyl-benzene) is converted to phenylphosphate, then para-carboxylated by this four-subunit enzyme, with the release of phosphate, to 4-hydroxybenzoate. The enzyme contains neither biotin nor thiamin pyrophosphate. The gamma subunit has no known homologs.¡€0€ª€0€ €CDD¡€ €_ö¢€0€0€ €‚˜pfam09663, Amido_AtzD_TrzD, Amidohydrolase ring-opening protein (Amido_AtzD_TrzD). Members of this family are ring-opening amidohydrolases, including cyanuric acid amidohydrolase (EC:3.5.2.15) (AtzD and TrzD) and barbiturase. Note that barbiturase does not act as defined for EC:3.5.2.1 (barbiturate + water = malonate + urea) but rather catalyzes the ring opening of barbiturase acid to ureidomalonic acid.¡€0€ª€0€ €CDD¡€ €Æ‘¢€0€0€ €‚Þpfam09664, DUF2399, Protein of unknown function C-terminus (DUF2399). Proteins in this entry are encoded within a conserved gene four-gene neighborhood found sporadically in a phylogenetically broad range of bacteria including: Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), Azoarcus sp. EbN1 and Ralstonia solanacearum (Beta-proteobacteria). Just the C-terminal region is ioncluded here.¡€0€ª€0€ €CDD¡€ €Æ’¢€0€0€ €‚ pfam09665, RE_Alw26IDE, Type II restriction endonuclease (RE_Alw26IDE). Members of this entry are type II restriction endonucleases of the Alw26I/Eco31I/Esp3I family. characterized specificities of the three members are GGTCTC, CGTCTC and the shared subsequence GTCTC.¡€0€ª€0€ €CDD¡€ €_ù¢€0€0€ €÷pfam09666, Sororin, Sororin protein. Sororin is an essential, cell cycle-dependent mediator of sister chromatid cohesion. The protein is nuclear in interphase cells, dispersed from the chromatin in mitosis, and interacts with the cohesin complex.¡€0€ª€0€ €CDD¡€ €Æ“¢€0€0€ €•pfam09667, DUF2028, Domain of unknown function (DUF2028). This region of similarity is found in the vertebrate homologs of the drosophila Bobby Sox.¡€0€ª€0€ €CDD¡€ €Æ”¢€0€0€ €Ôpfam09668, Asp_protease, Aspartyl protease. This family of eukaryotic aspartyl proteases have a fold similar to retroviral proteases which implies they function proteolytically during regulated protein turnover.¡€0€ª€0€ €CDD¡€ €Æ•¢€0€0€ €‚Äpfam09669, Phage_pRha, Phage regulatory protein Rha (Phage_pRha). Members of this protein family are found in temperate phage and bacterial prophage regions. Members include the product of the rha gene of the lambdoid phage phi-80, a late operon gene. The presence of this gene interferes with infection of bacterial strains that lack integration host factor (IHF), which regulates the rha gene. It is suggested that Rha is a phage regulatory protein.¡€0€ª€0€ €CDD¡€ €Æ–¢€0€0€ €‚Œpfam09670, Cas_Cas02710, CRISPR-associated protein (Cas_Cas02710). Members of this family are found, exclusively in the vicinity of CRISPR repeats and other CRISPR-associated (cas) genes, in Methanothermobacter thermautotrophicus (Methanobacterium thermoformicicum), Thermus thermophilus (Deinococcus-Thermus), Chloroflexus aurantiacus (Chloroflexi), and Thermomicrobium roseum (Thermomicrobia).¡€0€ª€0€ €CDD¡€ €Æ—¢€0€0€ €‚pfam09671, Spore_GerQ, Spore coat protein (Spore_GerQ). Members of this protein family are the spore coat protein GerQ of endospore-forming Firmicutes (low GC Gram-positive bacteria). This protein is cross-linked by a spore coat-associated transglutaminase.¡€0€ª€0€ €CDD¡€ €_þ¢€0€0€ €‚#pfam09673, TrbC_Ftype, Type-F conjugative transfer system pilin assembly protein. This entry represents TrbC, a protein that is an essential component of the F-type conjugative pilus assembly system for the transfer of plasmid DNA. The N-terminal portion of these proteins is heterogeneous.¡€0€ª€0€ €CDD¡€ €Ƙ¢€0€0€ €‚!pfam09674, DUF2400, Protein of unknown function (DUF2400). Members of this uncharacterized protein family are found sporadically, so far only among spirochetes, epsilon and delta proteobacteria, and Bacteroides. The function is unknown and its gene neighborhoods show little conservation.¡€0€ª€0€ €CDD¡€ €Æ™¢€0€0€ €‚ìpfam09675, Chlamy_scaf, Chlamydia-phage Chp2 scaffold (Chlamy_scaf). Members of this entry are encoded by genes in chlamydia-phage such as Chp2. These viruses have around eight genes and obligately infect intracellular bacterial pathogens of the genus Chlamydia. This protein is annotated as VP3 or structural protein (as if a protein of mature viral particles), however, it is displaced from procapsids as DNA is packaged, and therefore is more correctly described as a scaffolding protein.¡€0€ª€0€ €CDD¡€ €`¢€0€0€ €‚Epfam09676, TraV, Type IV conjugative transfer system lipoprotein (TraV). This entry includes TraV, which is a component of conjugative type IV secretion system. TraV is an outer membrane lipoprotein that is believed to interact with the secretin TraK. The alignment contains three conserved cysteines in the N-terminal half.¡€0€ª€0€ €CDD¡€ €Æš¢€0€0€ €üpfam09677, TrbI_Ftype, Type-F conjugative transfer system protein (TrbI_Ftype). This entry represents TrbI, an essential component of the F-type conjugative transfer system for plasmid DNA transfer that has been shown to be localized to the periplasm.¡€0€ª€0€ €CDD¡€ €Æ›¢€0€0€ €ßpfam09678, Caa3_CtaG, Cytochrome c oxidase caa3 assembly factor (Caa3_CtaG). Members of this family are the CtaG protein required for assembly of active cytochrome c oxidase of the caa3 type, as found in Bacillus subtilis.¡€0€ª€0€ €CDD¡€ €Æœ¢€0€0€ €‚pfam09679, TraQ, Type-F conjugative transfer system pilin chaperone (TraQ). This entry represents TraQ, a protein that makes a specific interaction with pilin (TraA) to aid its transfer through the inner membrane during the process of F-type conjugative pilus assembly.¡€0€ª€0€ €CDD¡€ €`¢€0€0€ €‚·pfam09680, Tiny_TM_bacill, Protein of unknown function (Tiny_TM_bacill). This entry represents a family of hypothetical proteins, half of which are 40 residues or less in length. Members are found only in spore-forming species. A Gly-rich variable region is followed by a strongly conserved, highly hydrophobic region, predicted to form a transmembrane helix, ending with an invariant Gly. The consensus for this stretch is FALLVVFILLIIV.¡€0€ª€0€ €CDD¡€ €Æ¢€0€0€ €‚jpfam09681, Phage_rep_org_N, N-terminal phage replisome organiser (Phage_rep_org_N). This entry represents the N-terminal domain of a small family of phage proteins. The protein contains a region of low-complexity sequence that reflects DNA direct repeats able to function as an origin of phage replication. The region is N-terminal to the low-complexity region.¡€0€ª€0€ €CDD¡€ €Æž¢€0€0€ €‚Ìpfam09682, Phage_holin_6_1, Bacteriophage holin of superfamily 6 (Holin_LLH). Phage_holin_6_1 or Holin_LLH identifies a family of phage holins from a number of phage and prophage regions of Gram-positive bacteria. Like other holins, it is large for holins (about 100-160 amino acids) with stretches of hydrophobic sequence and is encoded adjacent to lytic enzymes. Holin LLH family is found in phage of Firmicutes and have an N-terminal transmembrane segment.¡€0€ª€0€ €CDD¡€ €ÆŸ¢€0€0€ €Øpfam09683, Lactococcin_972, Bacteriocin (Lactococcin_972). These sequences represent bacteriocins related to lactococcin. Members tend to be found in association with a seven transmembrane putative immunity protein.¡€0€ª€0€ €CDD¡€ €Æ ¢€0€0€ €Çpfam09684, Tail_P2_I, Phage tail protein (Tail_P2_I). These sequences represent the family of phage P2 protein I and related tail proteins from a number of temperate phage of Gram-negative bacteria.¡€0€ª€0€ €CDD¡€ €Æ¡¢€0€0€ €;pfam09685, DUF4870, Domain of unknown function (DUF4870). ¡€0€ª€0€ €CDD¡€ €Æ¢¢€0€0€ €‚ëpfam09686, Plasmid_RAQPRD, Plasmid protein of unknown function (Plasmid_RAQPRD). This entry identifies a family of proteins, which are about 100 amino acids in length, including a predicted signal sequence and a perfectly conserved motif RAQPRD towards the C terminus. Members are found in the Pseudomonas putida TOL plasmid pWW0 and in cryptic plasmid regions of Salmonella enterica subsp. enterica serovar Typhi and Pseudomonas syringae DC3000. The function of these proteins are unknown.¡€0€ª€0€ €CDD¡€ €Æ£¢€0€0€ €‚¡pfam09687, PRESAN, Plasmodium RESA N-terminal. The short, four-helical domain first identified in the Plasmodium export proteins PHISTa and PHISTc has been extended to become this six-helical PRESAN domain identified in the P. falciparum-specific RESA-type (Ring-infected erythrocyte surface antigen) proteins in association with the DnaJ domain. Overall, at least 67 proteins have been detected in P. falciparum with complete copies of the PRESAN domain. No versions of this domain were detected in other apicomplexan genera, suggesting that the domain was 'invented' after the divergence of the lineage leading to the genus Plasmodium undergoing a dramatic proliferation only in P. falciparum. A secondary structure-prediction derived from the multiple alignment of the PRESAN family reveals that it is composed of an all-helical fold with six conserved helical segments. There is some evidence it might localize to membranes.¡€0€ª€0€ €CDD¡€ €Ƥ¢€0€0€ €÷pfam09688, Wx5_PLAF3D7, Protein of unknown function (Wx5_PLAF3D7). This set of protein sequences represent a family of at least four proteins in Plasmodium falciparum (isolate 3D7). An interesting feature is five perfectly conserved Trp residues.¡€0€ª€0€ €CDD¡€ €Æ¥¢€0€0€ €Ýpfam09689, PY_rept_46, Plasmodium yoelii repeat (PY_rept_46). This repeat is found in the products of only 2 genes in Plasmodium yoelii, in each of these proteins it is repeated 9 times. It is found in no other organism.¡€0€ª€0€ €CDD¡€ €`¢€0€0€ €‚âpfam09690, PYST-C1, Plasmodium yoelii subtelomeric region (PYST-C1). This group of sequences are defined by the N-terminal domain of a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. There are no obvious homologs to these genes in any other organism. The C-terminal portions of the genes that contain this domain are divergent and some contain other yoelii-specific paralogous domains such as PYST-C2 (IPR006491).¡€0€ª€0€ €CDD¡€ €`¢€0€0€ €‚}pfam09691, T2SS_PulS_OutS, Type II secretion system pilotin lipoprotein (PulS_OutS). This family comprises lipoproteins from four gamma proteobacterial species: PulS protein of Klebsiella pneumoniae (P20440), the OutS protein of Erwinia chrysanthemi (Q01567) and Pectobacterium chrysanthemi, and the functionally uncharacterized E. coli protein EtpO. PulS and OutS have been shown to interact with and facilitate insertion of secretins into the outer membrane, suggesting a chaperone-like, or piloting function for members of this family. In the pilotin from this four-helix protein from enterohemorrhagic Escherichia coli, the straight helix alpha2, the curved helix alpha3 and the bent helix alpha4 surround the central N-terminal helix alpha1. These helices create a prominent groove, mainly formed by side chains of helices 1,2 and 3 suggesting this groove is important as a binding site.¡€0€ª€0€ €CDD¡€ €`¢€0€0€ €Âpfam09692, Arb1, Argonaute siRNA chaperone (ARC) complex subunit Arb1. Arb1 is required for histone H3 Lys9 (H3-K9) methylation, heterochromatin, assembly and siRNA generation in fission yeast.¡€0€ª€0€ €CDD¡€ €Ʀ¢€0€0€ €‚Ipfam09693, Phage_XkdX, Phage uncharacterized protein (Phage_XkdX). This entry identifies a family of small (about 50 amino acid) phage proteins, found in at least 12 different phage and prophage regions of Gram-positive bacteria. In a number of these phage, the gene for this protein is found near the holin and endolysin genes.¡€0€ª€0€ €CDD¡€ €`¢€0€0€ €‚5pfam09694, Gcw_chp, Bacterial protein of unknown function (Gcw_chp). This entry represents a conserved hypothetical protein about 240 residues in length found so far in Proteobacteria including Shewanella oneidensis and Ralstonia solanacearum, usually as part of a paralogous family. The function is unknown.¡€0€ª€0€ €CDD¡€ €`¢€0€0€ €ªpfam09695, YtfJ_HI0045, Bacterial protein of unknown function (YtfJ_HI0045). These are sequences from gamma proteobacteria that are related to the E. coli protein, YtfJ.¡€0€ª€0€ €CDD¡€ €`¢€0€0€ €²pfam09696, Ctf8, Ctf8. Ctf8 (chromosome transmissions fidelity 8) is a component of the Ctf18 RFC-like complex which is a DNA clamp loader involved in sister chromatid cohesion.¡€0€ª€0€ €CDD¡€ €Ƨ¢€0€0€ €¬pfam09697, Porph_ging, Protein of unknown function (Porph_ging). This family of proteins of unknown function is found in Porphyromonas gingivalis (Bacteroides gingivalis).¡€0€ª€0€ €CDD¡€ €ƨ¢€0€0€ €‚Epfam09698, GSu_C4xC__C2xCH, Geobacter CxxxxCH...CXXCH motif (GSu_C4xC__C2xCH). This motif occurs from three to eight times in eight different proteins of Geobacter sulfurreducens. The final CXXCH motif matches the cytochrome c family haem-binding site signature, suggesting that the sequence may be involved in haem-binding.¡€0€ª€0€ €CDD¡€ €`¢€0€0€ €‚‘pfam09699, Paired_CXXCH_1, Doubled CXXCH motif (Paired_CXXCH_1). This entry represents a domain of about 41 amino acids that contains, among other motifs, two copies of the motif CXXCH associated with haem binding. This domain is predicted to be a high molecular weight c-type cytochrome and is often found in multiple copies. Members are found mostly in species of Shewanella, Geobacter, and Vibrio.¡€0€ª€0€ €CDD¡€ €Æ©¢€0€0€ €‚œpfam09700, Cas_Cmr3, CRISPR-associated protein (Cas_Cmr3). CRISPR is a term for Clustered Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR associated) proteins. This highly divergent family, found in at least ten different archaeal and bacterial species, is represented by TM1793 from Thermotoga maritima.¡€0€ª€0€ €CDD¡€ €ƪ¢€0€0€ €‚zpfam09701, Cas_Cmr5, CRISPR-associated protein (Cas_Cmr5). CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This family, represented by TM1791.1 of Thermotoga maritima, is found in both archaeal and bacterial species.¡€0€ª€0€ €CDD¡€ €Æ«¢€0€0€ €‚Îpfam09702, Cas_Csa5, CRISPR-associated protein (Cas_Csa5). CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry represents a minor family of Cas proteins found in various species of Sulfolobus and Pyrococcus (all archaeal). It is found with two different CRISPR loci in Sulfolobus solfataricus.¡€0€ª€0€ €CDD¡€ €`¢€0€0€ €‚Ppfam09703, Cas_Csa4, CRISPR-associated protein (Cas_Csa4). CRISPR loci appear to be mobile elements with a wide host range. This entry represents a protein that tends to be found near CRISPR repeats. The species range for this species, so far, is exclusively archaeal. It is found so far in only four different species, and includes two tandem genes in Pyrococcus furiosus DSM 3638. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins.¡€0€ª€0€ €CDD¡€ €`¢€0€0€ €‚Spfam09704, Cas_Cas5d, CRISPR-associated protein (Cas_Cas5). CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This small Cas family is represented by CT1134 of Chlorobium tepidum.¡€0€ª€0€ €CDD¡€ €Ƭ¢€0€0€ €‚äpfam09706, Cas_CXXC_CXXC, CRISPR-associated protein (Cas_CXXC_CXXC). CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry describes a conserved region of about 65 amino acids from an otherwise highly divergent protein found in a minority of CRISPR-associated protein regions. This region features two motifs of CXXC.¡€0€ª€0€ €CDD¡€ €Æ­¢€0€0€ €‚>pfam09707, Cas_Cas2CT1978, CRISPR-associated protein (Cas_Cas2CT1978). This entry represents a minor branch of the Cas2 family of CRISPR-associated protein which are found in IPR003799. Cas proteins are found adjacent to a characteristic short, palindromic repeat cluster termed CRISPR, a probable mobile DNA element.¡€0€ª€0€ €CDD¡€ €Æ®¢€0€0€ €‚pfam09709, Cas_Csd1, CRISPR-associated protein (Cas_Csd1). CRISPR loci appear to be mobile elements with a wide host range. This entry represents proteins that tend to be found near CRISPR repeats. The species range, so far, is exclusively bacterial and mesophilic, although CRISPR loci are particularly common among the archaea and thermophilic bacteria. Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats). A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins.¡€0€ª€0€ €CDD¡€ €Ư¢€0€0€ €‚wpfam09710, Trep_dent_lipo, Treponema clustered lipoprotein (Trep_dent_lipo). This entry represents a family of six predicted lipoproteins from a region of about 20 tandemly arranged genes in the Treponema denticola genome. Two other neighboring genes share the lipoprotein signal peptide region but do not show more extensive homology. The function of this locus is unknown.¡€0€ª€0€ €CDD¡€ €`"¢€0€0€ €‚opfam09711, Cas_Csn2, CRISPR-associated protein (Cas_Csn2). CRISPR loci appear to be mobile elements with a wide host range. This entry represents proteins found only in CRISPR-containing species, near other CRISPR-associated proteins (cas). The species range so far for these proteins is pathogenic bacteria only. Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognisable features. This family is known as CRISPR (short for Clustered, Regularly Interspaced Short Palindromic Repeats).¡€0€ª€0€ €CDD¡€ €ư¢€0€0€ €‚¬pfam09712, PHA_synth_III_E, Poly(R)-hydroxyalkanoic acid synthase subunit (PHA_synth_III_E). This entry represents the PhaE subunit of the heterodimeric class (class III) of polymerase for poly(R)-hydroxyalkanoic acids (PHAs), carbon and energy storage polymers of many bacteria. The most common PHA is polyhydroxybutyrate but about 150 different constituent hydroxyalkanoic acids (HAs) have been identified in various species.¡€0€ª€0€ €CDD¡€ €Ʊ¢€0€0€ €‚ypfam09713, A_thal_3526, Plant protein 1589 of unknown function (A_thal_3526). This plant-specific family of proteins is defined by an uncharacterized region 57 residues in length. It is found toward the N terminus of most proteins that contain it. Examples include at least several proteins from Arabidopsis thaliana and Oryza sativa. The function of the proteins are unknown.¡€0€ª€0€ €CDD¡€ €Ʋ¢€0€0€ €ùpfam09715, Plasmod_dom_1, Plasmodium protein of unknown function (Plasmod_dom_1). These sequences represent an uncharacterized family consisting of a small number of hypothetical proteins of the malaria parasite Plasmodium falciparum (isolate 3D7).¡€0€ª€0€ €CDD¡€ €Ƴ¢€0€0€ €‚1pfam09716, ETRAMP, Malarial early transcribed membrane protein (ETRAMP). These sequences represent a family of proteins from the malaria parasite Plasmodium falciparum, several of which have been shown to be expressed specifically in the ring stage as well as the rodent parasite Plasmodium yoelii. A homolog from Plasmodium chabaudi was localized to the parasitophorous vacuole membrane. Members have an initial hydrophobic, Phe/Tyr-rich, stretch long enough to span the membrane, a highly charged region rich in Lys, a second putative transmembrane region and a second highly charged, low complexity sequence region. Some members have up to 100 residues of additional C-terminal sequence. These genes have been shown to be found in the sub-telomeric regions of both Plasmodium falciparum and P. yoelii chromosomes.¡€0€ª€0€ €CDD¡€ €Æ´¢€0€0€ €‚¥pfam09717, CPW_WPC, Plasmodium falciparum domain of unknown function (CPW_WPC). This group of sequences is defined by a domain of about 61 residues in length with six well-conserved cysteine residues and six well-conserved aromatic sites. The domain can be found in tandem repeats, and is known so far only in Plasmodium falciparum. It is named for motifs of CPxxW and (less well conserved) WPC. Its function is unknown.¡€0€ª€0€ €CDD¡€ €Ƶ¢€0€0€ €‚Tpfam09718, Tape_meas_lam_C, Lambda phage tail tape-measure protein (Tape_meas_lam_C). This represents a relatively well-conserved region near the C terminus of the tape measure protein of a lambda and related phage. The protein, which controls phage tail length, is typically about 1000 residues in length. Both low-complexity sequence and insertion/deletion events appear common in this family. Mutational studies suggest a ruler or template role in the determination of phage tail length. Similar behaviour is attributed to proteins from distantly related or unrelated families in other phage.¡€0€ª€0€ €CDD¡€ €ƶ¢€0€0€ €‚4pfam09719, C_GCAxxG_C_C, Putative redox-active protein (C_GCAxxG_C_C). This entry represents a putative redox-active protein of about 140 residues, with four perfectly conserved Cys residues. It includes a CGAXXG motif. Most members are found within one or two loci of transporter or oxidoreductase genes. A member from Geobacter sulfurreducens, located in a molybdenum transporter operon, has a TAT (twin-arginine translocation) signal sequence for Sec-independent transport across the plasma membrane, a hallmark of bound prosthetic groups such as FeS clusters.¡€0€ª€0€ €CDD¡€ €Æ·¢€0€0€ €‚Ypfam09720, Unstab_antitox, Putative addiction module component. This entry defines several short bacterial proteins, typically about 75 amino acids long, which are always found as part of a pair (at least) of small genes. The other protein in the pair always belongs to a family of plasmid stabilisation proteins (IPR007712). It is likely that this protein and its partner comprise some form of addiction module - a pair of genes consisting of a stable toxin and an unstable antitoxin which mediate programmed cell death - although these gene-pairs are usually found on the bacterial main chromosome.¡€0€ª€0€ €CDD¡€ €Ƹ¢€0€0€ €‚Jpfam09721, Exosortase_EpsH, Transmembrane exosortase (Exosortase_EpsH). Members of this family are designated exosortase, analogous to sortase in cell wall sorting mediated by LPXTG domains in Gram-positive bacteria. The phylogenetic distribution of the proteins in this entry is nearly perfectly correlated with the distribution of the proteins having the PEP-CTERM anchor motif, IPR013424. Members of this entry are integral membrane proteins with eight predicted transmembrane helices in common. Some members of this family have long trailing sequences past the region described by this model. This model does not include the region of the first predicted transmembrane region. The best characterized member is EpsH of Methylobacillus sp. 12S, where it is part of a locus associated with biosynthesis of the exopolysaccharide methanol-an.¡€0€ª€0€ €CDD¡€ €ƹ¢€0€0€ €Þpfam09722, DUF2384, Protein of unknown function (DUF2384). Proteins in this family are found almost exclusively in the Proteobacteria, but also in Gloeobacter violaceus PCC 7421, a cyanobacterium. The function is unknown.¡€0€ª€0€ €CDD¡€ €ƺ¢€0€0€ €‚Çpfam09723, Zn-ribbon_8, Zinc ribbon domain. This entry represents a region of about 41 amino acids found in a number of small proteins in a wide range of bacteria. The region usually begins with the initiator Met and contains two CxxC motifs separated by 17 amino acids. One protein in this entry has been noted as a putative regulatory protein, designated FmdB. Most proteins in this entry have a C-terminal region containing highly degenerate sequence.¡€0€ª€0€ €CDD¡€ €Æ»¢€0€0€ €ápfam09724, DUF2036, Uncharacterized conserved protein (DUF2036). This family of proteins includes members ranging in size from approximately 300 to 460 residues. There are a number of well-conserved domains along the length.¡€0€ª€0€ €CDD¡€ €Ƽ¢€0€0€ €‚pfam09725, Fra10Ac1, Folate-sensitive fragile site protein Fra10Ac1. This entry represents the full-length proteins in which, in higher eukaryotes, the nested domain EDSLL lies. Fra10Ac1 is a highly conserved protein, of unknown function that is nuclear and highly expressed in brain.¡€0€ª€0€ €CDD¡€ €ƽ¢€0€0€ €lpfam09726, Macoilin, Transmembrane protein. This entry is a highly conserved protein present in eukaryotes.¡€0€ª€0€ €CDD¡€ €ƾ¢€0€0€ €‚Žpfam09727, CortBP2, Cortactin-binding protein-2. This entry is the first approximately 250 residues of cortactin-binding protein 2. In addition to being a positional candidate for autism this protein is expressed at highest levels in the brain in humans. The human protein has six associated ankyrin repeat domains pfam00023 towards the C-terminus which act as protein-protein interaction domains.¡€0€ª€0€ €CDD¡€ €Æ¿¢€0€0€ €‚&pfam09728, Taxilin, Myosin-like coiled-coil protein. Taxilin contains an extraordinarily long coiled-coil domain in its C-terminal half and is ubiquitously expressed. It is a novel binding partner of several syntaxin family members and is possibly involved in Ca2+-dependent exocytosis in neuroendocrine cells. Gamma-taxilin, described as leucine zipper protein Factor Inhibiting ATF4-mediated Transcription (FIAT), localizes to the nucleus in osteoblasts and dimerizes with ATF4 to form inactive dimers, thus inhibiting ATF4-mediated transcription.¡€0€ª€0€ €CDD¡€ €ÆÀ¢€0€0€ €‚)pfam09729, Gti1_Pac2, Gti1/Pac2 family. In S. pombe the gti1 protein promotes the onset of gluconate uptake upon glucose starvation. In S. pombe the Pac2 protein controls the onset of sexual development, by inhibiting the expression of ste11, in a pathway that is independent of the cAMP cascade.¡€0€ª€0€ €CDD¡€ €ÆÁ¢€0€0€ €‚pfam09730, BicD, Microtubule-associated protein Bicaudal-D. BicD proteins consist of three coiled-coiled domains and are involved in dynein-mediated minus end-directed transport from the Golgi apparatus to the endoplasmic reticulum (ER). For full functioning they bind with GSK-3beta pfam05350 to maintain the anchoring of microtubules to the centromere. It appears that amino-acid residues 437-617 of BicD and the kinase activity of GSK-3 are necessary for the formation of a complex between BicD and GSK-3beta in intact cells.¡€0€ª€0€ €CDD¡€ €ÆÂ¢€0€0€ €‚pfam09731, Mitofilin, Mitochondrial inner membrane protein. Mitofilin controls mitochondrial cristae morphology. Mitofilin is enriched in the narrow space between the inner boundary and the outer membranes, where it forms a homotypic interaction and assembles into a large multimeric protein complex. The first 78 amino acids contain a typical amino-terminal-cleavable mitochondrial presequence rich in positive-charged and hydroxylated residues and a membrane anchor domain. In addition, it has three centrally located coiled coil domains.¡€0€ª€0€ €CDD¡€ €ÆÃ¢€0€0€ €‚pfam09732, CactinC_cactus, Cactus-binding C-terminus of cactin protein. CactinC_cactus is the C-terminal 200 residues of the cactin protein which are necessary for the association of cactin with IkappaB-cactus as one of the intracellular members of the Rel complex. The Rel (NF-kappaB) pathway is conserved in invertebrates and vertebrates. In mammals, it controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo. Most members of the family also have a Cactin_mid domain pfam10312 further upstream.¡€0€ª€0€ €CDD¡€ €ÆÄ¢€0€0€ €‚ïpfam09733, VEFS-Box, VEFS-Box of polycomb protein. The VEFS-Box (VRN2-EMF2-FIS2-Su(z)12) box is the C-terminal region of these proteins, characterized by an acidic cluster and a tryptophan/methionine-rich sequence, the acidic-W/M domain. Some of these sequences are associated with a zinc-finger domain about 100 residues towards the N-terminus. This protein is one of the polycomb cluster of proteins which control HOX gene transcription as it functions in heterochromatin-mediated repression.¡€0€ª€0€ €CDD¡€ €ÆÅ¢€0€0€ €‚bpfam09734, Tau95, RNA polymerase III transcription factor (TF)IIIC subunit. TFIIIC1 is a multisubunit DNA binding factor that serves as a dynamic platform for assembly of pre-initiation complexes on class III genes. This entry represents the tau 95 subunit which holds a key position in TFIIIC, exerting both upstream and downstream influence on the TFIIIC-DNA complex by rendering the complex more stable. Once bound to tDNA-intragenic promoter elements, TFIIIC directs the assembly of TFIIIB on the DNA, which in turn recruits the RNA polymerase III (pol III) and activates multiple rounds of transcription.¡€0€ª€0€ €CDD¡€ €ÆÆ¢€0€0€ €‚ pfam09735, Nckap1, Membrane-associated apoptosis protein. Expression of this protein was found to be markedly reduced in patients with Alzheimer's disease. It is involved in the regulation of actin polymerization in the brain as part of a WAVE2 signalling complex.¡€0€ª€0€ €CDD¡€ €ÆÇ¢€0€0€ €‚xpfam09736, Bud13, Pre-mRNA-splicing factor of RES complex. This entry is characterized by proteins with alternating conserved and low-complexity regions. Bud13 together with Snu17p and a newly identified factor, Pml1p/Ylr016c, form a novel trimeric complex. called The RES complex, pre-mRNA retention and splicing complex. Subunits of this complex are not essential for viability of yeasts but they are required for efficient splicing in vitro and in vivo. Furthermore, inactivation of this complex causes pre-mRNA leakage from the nucleus. Bud13 contains a unique, phylogenetically conserved C-terminal region of unknown function.¡€0€ª€0€ €CDD¡€ €ÆÈ¢€0€0€ €‚pfam09737, Det1, De-etiolated protein 1 Det1. This is the C-terminal conserved 400 residues of Det1 proteins of approximately 550 amino acids. Det1 (de-etiolated-1) is an essential negative regulator of plant light responses, and it is a component of the Arabidopsis CDD complex containing DDB1 and COP10 ubiquitin E2 variant. Mammalian Det1 forms stable DDD-E2 complexes, consisting of DDB1, DDA1 (DET1, DDB1 Associated 1), and a member of the UBE2E group of canonical ubiquitin conjugating enzymes and modulates Cul4A function.¡€0€ª€0€ €CDD¡€ €ÆÉ¢€0€0€ €‚Upfam09738, LRRFIP, LRRFIP family. LRRFIP1 is a transcriptional repressor which preferentially binds to the GC-rich consensus sequence (5'- AGCCCCCGGCG-3') and may regulate expression of TNF, EGFR and PDGFA. LRRFIP2 may function as activator of the canonical Wnt signalling pathway, in association with DVL3, upstream of CTNNB1/beta-catenin.¡€0€ª€0€ €CDD¡€ €ÆÊ¢€0€0€ €‚œpfam09739, MCM_bind, Mini-chromosome maintenance replisome factor. This entry is of proteins of approximately 600 residues in length containing alternating regions of conservation and low complexity. The Arabidopsis protein is a replisome factor found to bind with the mini-chromosome maintenance, MCM-binding, complex and is crucial for efficient DNA replication. The family now spans the full-length proteins.¡€0€ª€0€ €CDD¡€ €ÆË¢€0€0€ €épfam09740, DUF2043, Uncharacterized conserved protein (DUF2043). This is a 100 residue conserved region of a family of proteins found from fungi to humans. This region contains three conserved Cysteines and a motif of {CP}{y/l}{HG}.¡€0€ª€0€ €CDD¡€ €ÆÌ¢€0€0€ €øpfam09741, DUF2045, Uncharacterized conserved protein (DUF2045). This entry is the conserved 250 residues of proteins of approximately 450 amino acids. It contains several highly conserved motifs including a CVxLxxxD motif.The function is unknown.¡€0€ª€0€ €CDD¡€ €ÆÍ¢€0€0€ €‚pfam09742, Dymeclin, Dyggve-Melchior-Clausen syndrome protein. Dymeclin (Dyggve-Melchior-Clausen syndrome protein) contains a large number of leucine and isoleucine residues and a total of 17 repeated dileucine motifs. It is characteristically about 700 residues long and present in plants and animals. Mutations in the gene coding for this protein in humans give rise to the disorder Dyggve-Melchior-Clausen syndrome (DMC, MIM 223800) which is an autosomal-recessive disorder characterized by the association of a spondylo-epi-metaphyseal dysplasia and mental retardation. DYM transcripts are widely expressed throughout human development and Dymeclin is not an integral membrane protein of the ER, but rather a peripheral membrane protein dynamically associated with the Golgi apparatus.¡€0€ª€0€ €CDD¡€ €ÆÎ¢€0€0€ €Äpfam09743, DUF2042, Uncharacterized conserved protein (DUF2042). This entry is the conserved N-terminal 300 residues of a group of proteins found from protozoa to Humans. The function is unknown.¡€0€ª€0€ €CDD¡€ €ÆÏ¢€0€0€ €ïpfam09744, Jnk-SapK_ap_N, JNK_SAPK-associated protein-1. This is the N-terminal 200 residues of a set of proteins conserved from yeasts to humans. Most of the proteins in this entry have an RhoGEF pfam00621 domain at their C-terminal end.¡€0€ª€0€ €CDD¡€ €ÆÐ¢€0€0€ €þpfam09745, DUF2040, Coiled-coil domain-containing protein 55 (DUF2040). This entry is a conserved domain of approximately 130 residues of proteins conserved from fungi to humans. The proteins do contain a coiled-coil domain, but the function is unknown.¡€0€ª€0€ €CDD¡€ €ÆÑ¢€0€0€ €‚~pfam09746, Membralin, tumor-associated protein. Membralin is evolutionarily highly conserved; though it seems to represent a unique protein family. The protein appears to contain several transmembrane regions. In humans it is expressed in certain cancers, particularly ovarian cancers. Membralin-like gene homologs have been identified in plants including grape, cotton and tomato.¡€0€ª€0€ €CDD¡€ €ÆÒ¢€0€0€ €×pfam09747, DUF2052, Coiled-coil domain containing protein (DUF2052). This entry is of sequences of two conserved domains separated by a region of low complexity, spanning some 200 residues. The function is unknown.¡€0€ª€0€ €CDD¡€ €ÆÓ¢€0€0€ €‚ápfam09748, Med10, Transcription factor subunit Med10 of Mediator complex. Med10 is one of the protein subunits of the Mediator complex, tethered to Rgr1 protein. The Mediator complex is required for the transcription of most RNA polymerase II (Pol II)-transcribed genes. Med10 specifically mediates basal-level HIS4 transcription via Gcn4, and, additionally, there is a putative requirement for Med10 in Bas2-mediated transcription. Med10 is part of the middle region of Mediator.¡€0€ª€0€ €CDD¡€ €ÆÔ¢€0€0€ €Øpfam09749, HVSL, Uncharacterized conserved protein. This entry is of proteins of approximately 300 residues conserved from plants to humans. It contains two conserved motifs, HxSL and FHVSL. The function is unknown.¡€0€ª€0€ €CDD¡€ €ÆÕ¢€0€0€ €‚ pfam09750, DRY_EERY, Alternative splicing regulator. This entry represents the conserved N-terminal region of SWAP (suppressor-of-white-apricot protein) proteins. This region contains two highly conserved motifs, viz: DRY and EERY, which appear to be the sites for alternative splicing of exons 2 and 3 of the SWAP mRNA. These proteins are thus thought to be involved in auto-regulation of pre-mRNA splicing. Most family members are associated with two Surp domains pfam01805 and an Arginine- serine-rich binding region towards the C-terminus.¡€0€ª€0€ €CDD¡€ €ÆÖ¢€0€0€ €‚ôpfam09751, Es2, Nuclear protein Es2. This entry is of a family of proteins of approximately 500 residues with alternating regions of low complexity and conservation where the domain similarities are strong. Apart from a predicted coiled-coil domain, no other known functional domains have been characterized. The protein appears to be expressed in the nucleus and particularly highly in the pons sub-region of the brain. The protein is clearly necessary for normal development of the nervous system.¡€0€ª€0€ €CDD¡€ €Æ×¢€0€0€ €•pfam09752, DUF2048, Abhydrolase domain containing 18. The proteins in this family are conserved from plants to vertebrates. The function is unknown.¡€0€ª€0€ €CDD¡€ €`K¢€0€0€ €‚>pfam09753, Use1, Membrane fusion protein Use1. This entry is of a family of proteins all approximately 300 residues in length. The proteins have a single C-terminal trans-membrane domain and a SNARE [soluble NSF (N-ethylmaleimide-sensitive fusion protein) attachment protein receptor] domain of approximately 60 residues. The SNARE domains are essential for membrane fusion and are conserved from yeasts to humans. Use1 is one of the three protein subunits that make up the SNARE complex and it is specifically required for Golgi-endoplasmic reticulum retrograde transport.¡€0€ª€0€ €CDD¡€ €ÆØ¢€0€0€ €‚¬pfam09754, PAC2, PAC2 family. This PAC2 (Proteasome assembly chaperone) family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 247 and 307 amino acids in length. These proteins function as a chaperone for the 26S proteasome. The 26S proteasome mediates ubiquitin-dependent proteolysis in eukaryotic cells. A number of studies including very recent ones have revealed that assembly of its 20S catalytic core particle is an ordered process that involves several conserved proteasome assembly chaperones (PACs). Two heterodimeric chaperones, PAC1-PAC2 and PAC3-PAC4, promote the assembly of rings composed of seven alpha subunits.¡€0€ª€0€ €CDD¡€ €ÆÙ¢€0€0€ €Èpfam09755, DUF2046, Uncharacterized conserved protein H4 (DUF2046). This is the conserved N-terminal 350 residues of a family of proteins of unknown function possibly containing a coiled-coil domain.¡€0€ª€0€ €CDD¡€ €ÆÚ¢€0€0€ €ªpfam09756, DDRGK, DDRGK domain. This is a family of proteins of approximately 300 residues, found in plants and vertebrates. They contain a highly conserved DDRGK motif.¡€0€ª€0€ €CDD¡€ €ÆÛ¢€0€0€ €‚cpfam09757, Arb2, Arb2 domain. A second fission yeast Argonaute complex (Argonaute siRNA chaperone, ARC) that contains two previously uncharacterized proteins, Arb1 and Arb2, both of which are required for histone H3 Lys9 (H3-K9) methylation, heterochromatin assembly and siRNA generation. This family includes a region found in Arb2 and the Hda1 protein.¡€0€ª€0€ €CDD¡€ €ÆÜ¢€0€0€ €Ópfam09758, FPL, Uncharacterized conserved protein. This entry represents an N-terminal region of approximately 150 residues of a family of proteins of unknown function. It contains a highly conserved FPL motif.¡€0€ª€0€ €CDD¡€ €ÆÝ¢€0€0€ €‚ pfam09759, Atx10homo_assoc, Spinocerebellar ataxia type 10 protein domain. This is the conserved C-terminal 100 residues of Ataxin-10. Ataxin-10 belongs to the family of armadillo repeat proteins and in solution it tends to form homotrimeric complexes, which associate via a tip-to-tip association in a horseshoe-shaped contact with the concave sides of the molecules facing each other. This domain may represent the homo-association site since that is located near the C-terminus of Ataxin-10. The protein does not contain a signal sequence for secretion or any subcellular compartment confirming its cytoplasmic localization, specifically to the olivocerebellar region.¡€0€ª€0€ €CDD¡€ €ÆÞ¢€0€0€ €‚ pfam09762, KOG2701, Coiled-coil domain-containing protein (DUF2037). This entry represents the conserved N-terminal 200 residues of a family of proteins conserved from plants to vertebrates. In Drosophila it comes from the Fidipidine gene, and is of unknown function.¡€0€ª€0€ €CDD¡€ €Æß¢€0€0€ €‚ÿpfam09763, Sec3_C, Exocyst complex component Sec3. This entry is the conserved middle and C-terminus of the Sec3 protein. Sec3 binds to the C-terminal cytoplasmic domain of GLYT1 (glycine transporter protein 1). Sec3 is the exocyst component that is closest to the plasma membrane docking site and it serves as a spatial landmark in the plasma membrane for incoming secretory vesicles. Sec3 is recruited to the sites of polarised membrane growth through its interaction with Rho1p, a small GTP-binding protein.¡€0€ª€0€ €CDD¡€ €Æà¢€0€0€ €÷pfam09764, Nt_Gln_amidase, N-terminal glutamine amidase. This protein is conserved from plants to humans. It represents a family of N terminal glutamine amidases. The enzyme removes the NH2 group from a Gln, at the N-terminal, rendering it a Glu.¡€0€ª€0€ €CDD¡€ €Æá¢€0€0€ €‚Cpfam09765, WD-3, WD-repeat region. This entry is of a region of approximately 100 residues containing three WD repeats and six cysteine residues possibly as three cystine-bridges. These regions are contained within the Fancl protein in humans which is the putative E3 ubiquitin ligase subunit of the FA complex (Fanconi anaemia). Eight subunits of the Fanconi anaemia gene products form a multisubunit nuclear complex which is required for mono-ubiquitination of a downstream FA protein, FANCD2. The WD repeats are required for interaction with other subunits of the FA complex.¡€0€ª€0€ €CDD¡€ €Æâ¢€0€0€ €‚¢€0€0€ €¤pfam09870, DUF2097, Uncharacterized protein conserved in archaea (DUF2097). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç?¢€0€0€ €¤pfam09871, DUF2098, Uncharacterized protein conserved in archaea (DUF2098). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç@¢€0€0€ €¤pfam09872, DUF2099, Uncharacterized protein conserved in archaea (DUF2099). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇA¢€0€0€ €¡pfam09873, DUF2100, Uncharacterized protein conserved in archaea (DUF2100). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇB¢€0€0€ €pfam09874, DUF2101, Predicted membrane protein (DUF2101). This domain, found in various archaeal and bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €梀0€0€ €¡pfam09875, DUF2102, Uncharacterized protein conserved in archaea (DUF2102). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇC¢€0€0€ €¡pfam09876, DUF2103, Predicted metal-binding protein (DUF2103). This domain, found in various putative metal binding prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇD¢€0€0€ €pfam09877, DUF2104, Predicted membrane protein (DUF2104). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €`¿¢€0€0€ €pfam09878, DUF2105, Predicted membrane protein (DUF2105). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇE¢€0€0€ €pfam09879, DUF2106, Predicted membrane protein (DUF2106). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇF¢€0€0€ €pfam09880, DUF2107, Predicted membrane protein (DUF2107). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇG¢€0€0€ €pfam09881, DUF2108, Predicted membrane protein (DUF2108). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇH¢€0€0€ €pfam09882, DUF2109, Predicted membrane protein (DUF2109). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇI¢€0€0€ €¡pfam09883, DUF2110, Uncharacterized protein conserved in archaea (DUF2110). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇJ¢€0€0€ €¡pfam09884, DUF2111, Uncharacterized protein conserved in archaea (DUF2111). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇK¢€0€0€ €¡pfam09885, DUF2112, Uncharacterized protein conserved in archaea (DUF2112). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇL¢€0€0€ €¡pfam09886, DUF2113, Uncharacterized protein conserved in archaea (DUF2113). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇM¢€0€0€ €¡pfam09887, DUF2114, Uncharacterized protein conserved in archaea (DUF2114). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇN¢€0€0€ €¡pfam09888, DUF2115, Uncharacterized protein conserved in archaea (DUF2115). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇO¢€0€0€ €åpfam09889, DUF2116, Uncharacterized protein containing a Zn-ribbon (DUF2116). This domain, found in various hypothetical archaeal proteins, has no known function. Structural modelling suggests this domain may bind nucleic acids.¡€0€ª€0€ €CDD¡€ €`Ë¢€0€0€ €¡pfam09890, DUF2117, Uncharacterized protein conserved in archaea (DUF2117). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇP¢€0€0€ €¡pfam09891, DUF2118, Uncharacterized protein conserved in archaea (DUF2118). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇQ¢€0€0€ €¡pfam09892, DUF2119, Uncharacterized protein conserved in archaea (DUF2119). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇR¢€0€0€ €¡pfam09893, DUF2120, Uncharacterized protein conserved in archaea (DUF2120). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇS¢€0€0€ €¡pfam09894, DUF2121, Uncharacterized protein conserved in archaea (DUF2121). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇT¢€0€0€ €‰pfam09895, DUF2122, RecB-family nuclease (DUF2122). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇU¢€0€0€ €¡pfam09897, DUF2124, Uncharacterized protein conserved in archaea (DUF2124). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇV¢€0€0€ €¥pfam09898, DUF2125, Uncharacterized protein conserved in bacteria (DUF2125). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇW¢€0€0€ €‚ßpfam09899, DUF2126, Putative amidoligase enzyme (DUF2126). Members of this family of bacterial domains are predominantly found in transglutaminase and transglutaminase-like proteins. Their exact function is, as yet, unknown, but they are likely to act as amidoligase enzymes Protein in this family are found in conserved gene neighborhoods encoding a glutamine amidotransferase-like thiol peptidase (in proteobacteria) or an Aig2 family cyclotransferase protein (in firmicutes).¡€0€ª€0€ €CDD¡€ €ÇX¢€0€0€ €Ÿpfam09900, DUF2127, Predicted membrane protein (DUF2127). This domain, found in various hypothetical prokaryotic and archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇY¢€0€0€ €çpfam09902, DUF2129, Uncharacterized protein conserved in bacteria (DUF2129). This domain, found in various hypothetical prokaryotic proteins, has no known function. Structural modelling suggests this domain may bind nucleic acids.¡€0€ª€0€ €CDD¡€ €ÇZ¢€0€0€ €¥pfam09903, DUF2130, Uncharacterized protein conserved in bacteria (DUF2130). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç[¢€0€0€ €œpfam09904, HTH_43, Winged helix-turn helix. This family, found in various hypothetical prokaryotic proteins, is a probable winged helix DNA-binding domain.¡€0€ª€0€ €CDD¡€ €Ç\¢€0€0€ €æpfam09905, VF530, DNA-binding protein VF530. VF530 contains a unique four-helix motif that shows some similarity to the C-terminal double-stranded DNA (dsDNA) binding domain of RecA, as well as other nucleic acid binding domains.¡€0€ª€0€ €CDD¡€ €Ç]¢€0€0€ €¥pfam09906, DUF2135, Uncharacterized protein conserved in bacteria (DUF2135). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç^¢€0€0€ €‚Žpfam09907, HigB_toxin, HigB_toxin, RelE-like toxic component of a toxin-antitoxin system. HigB_toxin is a family of RelE-like prokaryotic proteins that function as mRNA interferases. HigB cleaves translated mRNA only, and cleavage depended on translation of the target RNAs. HigB belongs to the RelE super-family of RNases. The toxin-antitoxin gene-pair is induced by environmental stress factors.¡€0€ª€0€ €CDD¡€ €Ç_¢€0€0€ €¥pfam09909, DUF2138, Uncharacterized protein conserved in bacteria (DUF2138). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç`¢€0€0€ €¡pfam09910, DUF2139, Uncharacterized protein conserved in archaea (DUF2139). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ça¢€0€0€ €¥pfam09911, DUF2140, Uncharacterized protein conserved in bacteria (DUF2140). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çb¢€0€0€ €¥pfam09912, DUF2141, Uncharacterized protein conserved in bacteria (DUF2141). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çc¢€0€0€ €’pfam09913, DUF2142, Predicted membrane protein (DUF2142). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çd¢€0€0€ €¥pfam09916, DUF2145, Uncharacterized protein conserved in bacteria (DUF2145). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çe¢€0€0€ €¥pfam09917, DUF2147, Uncharacterized protein conserved in bacteria (DUF2147). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çf¢€0€0€ €Ëpfam09918, DUF2148, Uncharacterized protein containing a ferredoxin domain (DUF2148). This domain, found in various hypothetical bacterial proteins containing a ferredoxin domain, has no known function.¡€0€ª€0€ €CDD¡€ €Çg¢€0€0€ €™pfam09919, DUF2149, Uncharacterized conserved protein (DUF2149). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çh¢€0€0€ €¡pfam09920, DUF2150, Uncharacterized protein conserved in archaea (DUF2150). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çi¢€0€0€ €¡pfam09921, DUF2153, Uncharacterized protein conserved in archaea (DUF2153). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çj¢€0€0€ €Fpfam09922, DUF2154, Cell wall-active antibiotics response 4TMS YvqF. ¡€0€ª€0€ €CDD¡€ €Çk¢€0€0€ €¥pfam09923, DUF2155, Uncharacterized protein conserved in bacteria (DUF2155). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çl¢€0€0€ €™pfam09924, DUF2156, Uncharacterized conserved protein (DUF2156). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çm¢€0€0€ €’pfam09925, DUF2157, Predicted membrane protein (DUF2157). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çn¢€0€0€ €„pfam09926, DUF2158, Uncharacterized small protein (DUF2158). Members of this family of prokaryotic proteins have no known function.¡€0€ª€0€ €CDD¡€ €Ço¢€0€0€ €‚pfam09928, DUF2160, Predicted small integral membrane protein (DUF2160). The members of this family of hypothetical prokaryotic proteins have no known function. It is thought that they are transmembrane proteins, but their function has not been inferred yet.¡€0€ª€0€ €CDD¡€ €Çp¢€0€0€ €‚zpfam09929, DUF2161, Putative PD-(D/E)XK phosphodiesterase (DUF2161). This family of proteins is functionally uncharacterized. This family of proteins is found in prokaryotes. Advanced homology-detection methods supported with superfamily-wide domain architecture and horizontal gene transfer analyses s have established this family to be a member of the PD-(D/E)XK superfamily.¡€0€ª€0€ €CDD¡€ €Çq¢€0€0€ €Æpfam09930, DUF2162, Predicted transporter (DUF2162). Members of this family of bacterial proteins are thought to be membrane transporters, but their exact function has not, as yet, been elucidated.¡€0€ª€0€ €CDD¡€ €Çr¢€0€0€ €™pfam09931, DUF2163, Uncharacterized conserved protein (DUF2163). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çs¢€0€0€ €™pfam09932, DUF2164, Uncharacterized conserved protein (DUF2164). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çt¢€0€0€ €¡pfam09933, DUF2165, Predicted small integral membrane protein (DUF2165). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çu¢€0€0€ €¥pfam09935, DUF2167, Protein of unknown function (DUF2167). This domain, found in various hypothetical membrane-anchored prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çv¢€0€0€ €‚3pfam09936, Methyltrn_RNA_4, SAM-dependent RNA methyltransferase. This family has a Rossmanoid fold, with a deep trefoil knot in its C-terminal region. It has structural similarity to RNA methyltransferases, and is likely to function as an S-adenosyl-L-methionine (SAM)-dependent RNA 2'-O methyltransferase.¡€0€ª€0€ €CDD¡€ €Çw¢€0€0€ €¥pfam09937, DUF2169, Uncharacterized protein conserved in bacteria (DUF2169). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çx¢€0€0€ €¥pfam09938, DUF2170, Uncharacterized protein conserved in bacteria (DUF2170). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çy¢€0€0€ €¥pfam09939, DUF2171, Uncharacterized protein conserved in bacteria (DUF2171). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çz¢€0€0€ €‚pfam09940, DUF2172, Domain of unknown function (DUF2172). This domain, found in various hypothetical prokaryotic proteins, has no known function. An aminopeptidase domain is conserved within the family, but its relevance has not been established yet. Rebuilding from Structure 3kt9 shows this is an inserted (nested domain within the amino-peptidase). The function of this small domain is not known.¡€0€ª€0€ €CDD¡€ €Ç{¢€0€0€ €™pfam09941, DUF2173, Uncharacterized conserved protein (DUF2173). This domain, found in various hypothetical prokaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç|¢€0€0€ €¡pfam09943, DUF2175, Uncharacterized protein conserved in archaea (DUF2175). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç}¢€0€0€ €pfam09945, DUF2177, Predicted membrane protein (DUF2177). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç~¢€0€0€ €pfam09946, DUF2178, Predicted membrane protein (DUF2178). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç¢€0€0€ €Ûpfam09947, DUF2180, Uncharacterized protein conserved in archaea (DUF2180). This domain, found in various hypothetical archaeal proteins, has no known function. A few of the family members contain a zinc finger domain.¡€0€ª€0€ €CDD¡€ €`ü¢€0€0€ €Úpfam09948, DUF2182, Predicted metal-binding integral membrane protein (DUF2182). This domain, found in various hypothetical bacterial membrane proteins having predicted metal-binding properties, has no known function.¡€0€ª€0€ €CDD¡€ €Ç€¢€0€0€ €—pfam09949, DUF2183, Uncharacterized conserved protein (DUF2183). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç¢€0€0€ €£pfam09950, DUF2184, Uncharacterized protein conserved in bacteria (DUF2184). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç‚¢€0€0€ €‘pfam09951, DUF2185, Protein of unknown function (DUF2185). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ǃ¢€0€0€ €‚çpfam09952, AbiEi_2, Transcriptional regulator, AbiEi antitoxin, Type IV TA system. AbiEi_2 is the cognate antitoxin of the type IV toxin-antitoxin 'innate immunity' bacterial abortive infection (Abi) system that protects bacteria from the spread of a phage infection. The Abi system is activated upon infection with phage to abort the cell thus preventing the spread of phage through viral replication. There are some 20 or more Abis, and they are predominantly plasmid-encoded lactococcal systems. TA, toxin-antitoxin, systems on plasmids function by killing cells that lose the plasmid upon division. AbiE phage resistance systems function as novel Type IV TAs and are widespread in bacteria and archaea. The cognate antitoxin is pfam13338.¡€0€ª€0€ €CDD¡€ €a¢€0€0€ €£pfam09953, DUF2187, Uncharacterized protein conserved in bacteria (DUF2187). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç„¢€0€0€ €£pfam09954, DUF2188, Uncharacterized protein conserved in bacteria (DUF2188). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç…¢€0€0€ €õpfam09955, DUF2189, Predicted integral membrane protein (DUF2189). Members of this family are found in various hypothetical prokaryotic proteins, as well as putative cytochrome c oxidases. Their exact function has not, as yet, been established.¡€0€ª€0€ €CDD¡€ €dž¢€0€0€ €Ípfam09956, DUF2190, Uncharacterized conserved protein (DUF2190). This domain, found in various hypothetical prokaryotic proteins, as well as in some putative RecA/RadA recombinases, has no known function.¡€0€ª€0€ €CDD¡€ €LJ¢€0€0€ €‚apfam09957, VapB_antitoxin, Bacterial antitoxin of type II TA system, VapB. VapB is the antitoxin of a bacterial toxin-antitoxin gene pair. The cognate toxin is VapC, pfam05016. The family contains several related antitoxins from Cyanobacteria and Actinobacterial families. Antitoxins of this class carry an N-terminal ribbon-helix-helix domain, RHH, that is highly conserved across all type II bacterial antitoxins, which dimerizes with the RHH domain of a second VapB molecule. A hinge section follows the RHH, with an additional pair of flexible alpha helices at the C-terminus. This C-terminus is the Toxin-binding region of the dimer, and so is specific to the cognate toxin, whereas the RHH domain has the specific function of lying across the RNA-binding groove of the toxin dimer and inactivating the active-site - a more general function of all antitoxins.¡€0€ª€0€ €CDD¡€ €Lj¢€0€0€ €¡pfam09958, DUF2192, Uncharacterized protein conserved in archaea (DUF2192). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €lj¢€0€0€ €¡pfam09959, DUF2193, Uncharacterized protein conserved in archaea (DUF2193). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇŠ¢€0€0€ €£pfam09960, DUF2194, Uncharacterized protein conserved in bacteria (DUF2194). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç‹¢€0€0€ €£pfam09961, DUF2195, Uncharacterized protein conserved in bacteria (DUF2195). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €a ¢€0€0€ €—pfam09962, DUF2196, Uncharacterized conserved protein (DUF2196). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇŒ¢€0€0€ €£pfam09963, DUF2197, Uncharacterized protein conserved in bacteria (DUF2197). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç¢€0€0€ €£pfam09964, DUF2198, Uncharacterized protein conserved in bacteria (DUF2198). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇŽ¢€0€0€ €£pfam09965, DUF2199, Uncharacterized protein conserved in bacteria (DUF2199). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç¢€0€0€ €£pfam09966, DUF2200, Uncharacterized protein conserved in bacteria (DUF2200). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç¢€0€0€ €·pfam09967, DUF2201, VWA-like domain (DUF2201). This domain, found in various hypothetical bacterial proteins, has no known function. However, it is clearly related to the VWA domain.¡€0€ª€0€ €CDD¡€ €a¢€0€0€ €“pfam09968, DUF2202, Uncharacterized protein domain (DUF2202). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç‘¢€0€0€ €—pfam09969, DUF2203, Uncharacterized conserved protein (DUF2203). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç’¢€0€0€ €üpfam09970, DUF2204, Nucleotidyl transferase of unknown function (DUF2204). This domain, found in various hypothetical archaeal proteins, has no known function. However, this family was identified as belonging to the nucleotidyltransferase superfamily.¡€0€ª€0€ €CDD¡€ €Ç“¢€0€0€ €pfam09971, DUF2206, Predicted membrane protein (DUF2206). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç”¢€0€0€ €pfam09972, DUF2207, Predicted membrane protein (DUF2207). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç•¢€0€0€ €pfam09973, DUF2208, Predicted membrane protein (DUF2208). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç–¢€0€0€ €¡pfam09974, DUF2209, Uncharacterized protein conserved in archaea (DUF2209). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç—¢€0€0€ €npfam09976, TPR_21, Tetratricopeptide repeat-like domain. This family resembles a single unit of a TPR repeat.¡€0€ª€0€ €CDD¡€ €ǘ¢€0€0€ €ªpfam09977, Tad_C, Putative Tad-like Flp pilus-assembly. This domain, found in various hypothetical prokaryotic proteins, is likely to be involved in Flp lius biogenesis.¡€0€ª€0€ €CDD¡€ €Ç™¢€0€0€ €ûpfam09979, DUF2213, Uncharacterized protein conserved in bacteria (DUF2213). Members of this family of bacterial proteins comprise various hypothetical and phage-related proteins. The exact function of these proteins has not, as yet, been determined.¡€0€ª€0€ €CDD¡€ €Çš¢€0€0€ €pfam09980, DUF2214, Predicted membrane protein (DUF2214). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç›¢€0€0€ €£pfam09981, DUF2218, Uncharacterized protein conserved in bacteria (DUF2218). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çœ¢€0€0€ €£pfam09982, DUF2219, Uncharacterized protein conserved in bacteria (DUF2219). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç¢€0€0€ €Ôpfam09983, DUF2220, Uncharacterized protein conserved in bacteria C-term(DUF2220). This domain, found in various hypothetical bacterial proteins, has no known function. The family represents just the C-terminus.¡€0€ª€0€ €CDD¡€ €Çž¢€0€0€ €‚]pfam09984, DUF2222, Uncharacterized signal transduction histidine kinase domain (DUF2222). Members of this family of domains are found in various BarA-like signal transduction histidine kinases, which are involved in the regulation of carbon metabolism via the csrA/csrB regulatory system. The role of this domain has not, as yet, been established.¡€0€ª€0€ €CDD¡€ €ÇŸ¢€0€0€ €‚pfam09985, Glucodextran_C, C-terminal binding-module, SLH-like, of glucodextranase. Glucodextran_C is the C-terminal domain of glucodextranase-like proteins found in various prokaryotic membrane-anchored proteins. It shows homology to the carbohydrate-binding unit of some glycosidases.¡€0€ª€0€ €CDD¡€ €Ç ¢€0€0€ €£pfam09986, DUF2225, Uncharacterized protein conserved in bacteria (DUF2225). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç¡¢€0€0€ €¡pfam09987, DUF2226, Uncharacterized protein conserved in archaea (DUF2226). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €æ½¢€0€0€ €ápfam09988, DUF2227, Uncharacterized metal-binding protein (DUF2227). Members of this family of hypothetical bacterial proteins possess metal binding properties; however, their exact function has not, as yet, been determined.¡€0€ª€0€ €CDD¡€ €Ç¢¢€0€0€ €ôpfam09989, DUF2229, CoA enzyme activase uncharacterized domain (DUF2229). Members of this family include various bacterial hypothetical proteins, as well as CoA enzyme activases. The exact function of this domain has not, as yet, been defined.¡€0€ª€0€ €CDD¡€ €Ç£¢€0€0€ €pfam09990, DUF2231, Predicted membrane protein (DUF2231). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ǥ¢€0€0€ €‚xpfam09991, DUF2232, Predicted membrane protein (DUF2232). This family of bacterial proteins are multi-pass membrane proteins with up to 10 (2 x 4/5) transmembrane regions. The exact function of this potential pore molecule is not known, but in many instances it is associated with ABC-transporter-like domains, implying that it is part of a secretion system that uses energy.¡€0€ª€0€ €CDD¡€ €Ç¥¢€0€0€ €‚¾pfam09992, NAGPA, Phosphodiester glycosidase. This is a family conserved from bacteria to humans. The structure of a member from Bacteroides has been crystallized and modelled onto the luminal region of the human member of the family, the transmembrane glycoprotein N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase. There is some conservation of potentially functional residues, implying that in the bacterial members this family acts in some way as a phosphodiester glycosidase. The human protein is also present, so the eukaryotic members are likely to be catalyzing the second step in the formation of the mannose 6-phosphate targeting signal on lysosomal enzyme oligosaccharides.¡€0€ª€0€ €CDD¡€ €Ǧ¢€0€0€ €¡pfam09994, DUF2235, Uncharacterized alpha/beta hydrolase domain (DUF2235). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ǧ¢€0€0€ €‚-pfam09995, DUF2236, Uncharacterized protein conserved in bacteria (DUF2236). This domain, found in various hypothetical bacterial proteins, has no known function. This family contains a highly conserved arginine and histidine that may be active site residues for an as yet unknown catalytic activity.¡€0€ª€0€ €CDD¡€ €Ǩ¢€0€0€ €£pfam09996, DUF2237, Uncharacterized protein conserved in bacteria (DUF2237). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç©¢€0€0€ €pfam09997, DUF2238, Predicted membrane protein (DUF2238). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ǫ¢€0€0€ €£pfam09998, DUF2239, Uncharacterized protein conserved in bacteria (DUF2239). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç«¢€0€0€ €¡pfam09999, DUF2240, Uncharacterized protein conserved in archaea (DUF2240). This domain, found in various hypothetical archaeal proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ǭ¢€0€0€ €‚%pfam10000, ACT_3, ACT domain. This domain, found in various hypothetical bacterial proteins, has no known function. However, its structure is similar to the ACT domain which suggests that it binds to amino acids and regulates other protein activity. This family was formerly known as DUF2241.¡€0€ª€0€ €CDD¡€ €Ç­¢€0€0€ €©pfam10001, DUF2242, Uncharacterized protein conserved in bacteria (DUF2242). This domain is found in various hypothetical bacterial proteins, and has no known function.¡€0€ª€0€ €CDD¡€ €a.¢€0€0€ €pfam10002, DUF2243, Predicted membrane protein (DUF2243). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç®¢€0€0€ €¥pfam10003, DUF2244, Integral membrane protein (DUF2244). This domain, found in various bacterial hypothetical and putative membrane proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ǯ¢€0€0€ €£pfam10004, DUF2247, Uncharacterized protein conserved in bacteria (DUF2247). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ǰ¢€0€0€ €–pfam10005, zinc-ribbon_6, zinc-ribbon domain. This family appears to be a true zinc-ribbon, with two sets of putative zinc-binding domains in tandem.¡€0€ª€0€ €CDD¡€ €DZ¢€0€0€ €“pfam10006, DUF2249, Uncharacterized conserved protein (DUF2249). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Dz¢€0€0€ €pfam10007, DUF2250, Uncharacterized protein conserved in archaea (DUF2250). Members of this family of hypothetical archaeal proteins have no known function.¡€0€ª€0€ €CDD¡€ €a4¢€0€0€ €Ÿpfam10008, DUF2251, Uncharacterized protein conserved in bacteria (DUF2251). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €dz¢€0€0€ €£pfam10009, DUF2252, Uncharacterized protein conserved in bacteria (DUF2252). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç´¢€0€0€ €Ûpfam10011, DUF2254, Predicted membrane protein (DUF2254). Members of this family of bacterial proteins comprises various hypothetical and putative membrane proteins. Their exact function, has not, as yet, been defined.¡€0€ª€0€ €CDD¡€ €ǵ¢€0€0€ €Ÿpfam10012, DUF2255, Uncharacterized protein conserved in bacteria (DUF2255). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Ƕ¢€0€0€ €Ÿpfam10013, DUF2256, Uncharacterized protein conserved in bacteria (DUF2256). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Ç·¢€0€0€ €¢pfam10014, 2OG-Fe_Oxy_2, 2OG-Fe dioxygenase. This family contains 2-oxoglutarate (2OG) and Fe-dependent dioxygenases. It includes L-isoleucine dioxygenase (IDO).¡€0€ª€0€ €CDD¡€ €Ǹ¢€0€0€ €àpfam10015, DUF2258, Uncharacterized protein conserved in archaea (DUF2258). Members of this family of hypothetical bacterial archaeal have no known function. Structural modelling suggests this domain may bind nucleic acids.¡€0€ª€0€ €CDD¡€ €ǹ¢€0€0€ €Œpfam10016, DUF2259, Predicted secreted protein (DUF2259). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €a<¢€0€0€ €‚Èpfam10017, Methyltransf_33, Histidine-specific methyltransferase, SAM-dependent. The mycobacterial members of this family are expressed from part of the ergothioneine biosynthetic gene cluster. EGTD is the histidine methyltransferase that transfers three methyl groups to the alpha-amino moiety of histidine, in the first stage of the production of this histidine betaine derivative that carries a thiol group attached to the C2 atom of an imidazole ring.¡€0€ª€0€ €CDD¡€ €Ǻ¢€0€0€ €‚¸pfam10018, Med4, Vitamin-D-receptor interacting Mediator subunit 4. Members of this family function as part of the Mediator (Med) complex, which links DNA-bound transcriptional regulators and the general transcription machinery, particularly the RNA polymerase II enzyme. They play a role in basal transcription by mediating activation or repression according to the specific complement of transcriptional regulators bound to the promoter.¡€0€ª€0€ €CDD¡€ €a>¢€0€0€ €£pfam10020, DUF2262, Uncharacterized protein conserved in bacteria (DUF2262). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç»¢€0€0€ €²pfam10021, DUF2263, Uncharacterized protein conserved in bacteria (DUF2263). This domain, found in various hypothetical bacterial and eukaryotic proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ǽ¢€0€0€ €Ÿpfam10022, DUF2264, Uncharacterized protein conserved in bacteria (DUF2264). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €ǽ¢€0€0€ €­pfam10023, Aminopep, Putative aminopeptidase. This family of bacterial proteins has a conserved HEXXH motif, suggesting that members are putative peptidases of zincin fold.¡€0€ª€0€ €CDD¡€ €Ǿ¢€0€0€ €—pfam10025, DUF2267, Uncharacterized conserved protein (DUF2267). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç¿¢€0€0€ €Äpfam10026, DUF2268, Predicted Zn-dependent protease (DUF2268). This domain, found in various hypothetical bacterial proteins, as well as predicted zinc dependent proteases, has no known function.¡€0€ª€0€ €CDD¡€ €ÇÀ¢€0€0€ €§pfam10027, DUF2269, Predicted integral membrane protein (DUF2269). Members of this family of bacterial hypothetical integral membrane proteins have no known function.¡€0€ª€0€ €CDD¡€ €ÇÁ¢€0€0€ €™pfam10028, DUF2270, Predicted integral membrane protein (DUF2270). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç¢€0€0€ €¹pfam10029, DUF2271, Predicted periplasmic protein (DUF2271). This domain, found in various hypothetical bacterial proteins and misannotated lysozyme proteins, it has no known function.¡€0€ª€0€ €CDD¡€ €Çâ€0€0€ €‚pfam10030, DUF2272, Uncharacterized protein conserved in bacteria (DUF2272). Members of this family of hypothetical bacterial proteins have no known function. However, given its similarity to the CHAP domain it seems likely that this is an enzyme involved in cleaving peptidoglycan.¡€0€ª€0€ €CDD¡€ €ÇÄ¢€0€0€ €‘pfam10031, DUF2273, Small integral membrane protein (DUF2273). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €ÇÅ¢€0€0€ €Ãpfam10032, Pho88, Phosphate transport (Pho88). Members of this family of proteins are involved in regulating inorganic phosphate transport, as well as telomere length regulation and maintenance.¡€0€ª€0€ €CDD¡€ €ÇÆ¢€0€0€ €‚mpfam10033, ATG13, Autophagy-related protein 13. Members of this family of phosphoproteins are involved in cytoplasm to vacuole transport (Cvt), and more specifically in Cvt vesicle formation. They are probably involved in the switching machinery regulating the conversion between the Cvt pathway and autophagy. Finally, ATG13 is also required for glycogen storage.¡€0€ª€0€ €CDD¡€ €ÇÇ¢€0€0€ €‚+pfam10034, Dpy19, Q-cell neuroblast polarisation. Dyp-19, formerly known as DUF2211, is a transmembrane domain family that is required to orient the neuroblast cells, QR and QL accurately on the anterior-posterior axis: QL and QR are born in the same anterior-posterior position, but polarise and migrate left-right asymmetrically, QL migrating towards the posterior and QR migrating towards the anterior. It is also required, with unc-40, to express mab-5 correctly in the Q cell descendants. The Dpy-19 protein derives from the C. elegans DUMPY mutant.¡€0€ª€0€ €CDD¡€ €ÇÈ¢€0€0€ €£pfam10035, DUF2179, Uncharacterized protein conserved in bacteria (DUF2179). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇÉ¢€0€0€ €‚Jpfam10036, RLL, Putative carnitine deficiency-associated protein. This family of proteins conserved from nematodes to humans is of approximately 250 amino acids. It is purported to be carnitine deficiency-associated protein but this could not be confirmed. It carries a characteristic RLL sequence-motif. The function is unknown.¡€0€ª€0€ €CDD¡€ €ÇÊ¢€0€0€ €‚gpfam10037, MRP-S27, Mitochondrial 28S ribosomal protein S27. Members of this family of small ribosomal proteins possess one of three conserved blocks of sequence found in proteins that stimulate the dissociation of guanine nucleotides from G-proteins, leaving open the possibility that MRP-S27 might be a functional partner of GTP-binding ribosomal proteins.¡€0€ª€0€ €CDD¡€ €ÇË¢€0€0€ €pfam10038, DUF2274, Protein of unknown function (DUF2274). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €ÇÌ¢€0€0€ €Àpfam10039, DUF2275, Predicted integral membrane protein (DUF2275). This domain, found in various hypothetical bacterial proteins and in the RNA polymerase sigma factor, has no known function.¡€0€ª€0€ €CDD¡€ €aQ¢€0€0€ €—pfam10040, DUF2276, Uncharacterized conserved protein (DUF2276). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇÍ¢€0€0€ €“pfam10041, DUF2277, Uncharacterized conserved protein (DUF2277). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Ç΢€0€0€ €“pfam10042, DUF2278, Uncharacterized conserved protein (DUF2278). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €ÇÏ¢€0€0€ €—pfam10043, DUF2279, Predicted periplasmic lipoprotein (DUF2279). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇТ€0€0€ €‚{pfam10044, LIN52, Retinal tissue protein. LIN52 is a family of proteins of approximately 112 amino acids in length which is conserved from nematodes to humans. The proposed tertiary structure is of almost entirely alpha helix interrupted only by loops located at proline residues. Three sites in the protein sequence reveal two types of possible post-translation modification. A serine residue, at position 41, is a candidate for protein kinase C phosphorylation. Glycine residues at position 69 and 91 are probable sites for acetylation by covalent amide linkage of myristate via N-myristoyl transferase. LIN52 is differentially expressed in the trout retina between parr and smolt developmental stages (smoltification). It is likely to be a house-keeping protein. LIN52 forms a complex (LINC) required for transcriptional activation of G2/M genes. The LINC core complex consists of at least five subunits including the chromatin-associated LIN-9 and RbAp48 proteins. LINC associates with a large number of E2F-regulated promoters in quiescent cells. Family members are required for spermatogenesis by repressing testis-specific gene expression.¡€0€ª€0€ €CDD¡€ €ÇÑ¢€0€0€ €“pfam10045, DUF2280, Uncharacterized conserved protein (DUF2280). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €ÇÒ¢€0€0€ €÷pfam10046, BLOC1_2, Biogenesis of lysosome-related organelles complex-1 subunit 2. Members of this family of proteins play a role in cellular proliferation, as well as in the biogenesis of specialized organelles of the endosomal-lysosomal system.¡€0€ª€0€ €CDD¡€ €ÇÓ¢€0€0€ €pfam10047, DUF2281, Protein of unknown function (DUF2281). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €ÇÔ¢€0€0€ €ºpfam10048, DUF2282, Predicted integral membrane protein (DUF2282). Members of this family of hypothetical bacterial proteins and putative signal peptide proteins have no known function.¡€0€ª€0€ €CDD¡€ €ÇÕ¢€0€0€ €pfam10049, DUF2283, Protein of unknown function (DUF2283). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €ÇÖ¢€0€0€ €Ÿpfam10050, DUF2284, Predicted metal-binding protein (DUF2284). Members of this family of metal-binding hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Ç×¢€0€0€ €pfam10051, DUF2286, Uncharacterized protein conserved in archaea (DUF2286). Members of this family of hypothetical archaeal proteins have no known function.¡€0€ª€0€ €CDD¡€ €ÇØ¢€0€0€ €pfam10052, DUF2288, Protein of unknown function (DUF2288). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €ÇÙ¢€0€0€ €“pfam10053, DUF2290, Uncharacterized conserved protein (DUF2290). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €ÇÚ¢€0€0€ €“pfam10054, DUF2291, Predicted periplasmic lipoprotein (DUF2291). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €ÇÛ¢€0€0€ €pfam10055, DUF2292, Uncharacterized small protein (DUF2292). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €ÇÜ¢€0€0€ €—pfam10056, DUF2293, Uncharacterized conserved protein (DUF2293). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €ÇÝ¢€0€0€ €“pfam10057, DUF2294, Uncharacterized conserved protein (DUF2294). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €ÇÞ¢€0€0€ €Ðpfam10058, zinc_ribbon_10, Predicted integral membrane zinc-ribbon metal-binding protein. This domain, found in various hypothetical bacterial and eukaryotic metal-binding proteins is a probably zinc-ribbon.¡€0€ª€0€ €CDD¡€ €Çߢ€0€0€ €–pfam10060, DUF2298, Uncharacterized membrane protein (DUF2298). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çࢀ0€0€ €“pfam10061, DUF2299, Uncharacterized conserved protein (DUF2299). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Çᢀ0€0€ €¬pfam10062, DUF2300, Predicted secreted protein (DUF2300). This domain, found in various bacterial hypothetical and putative signal peptide proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç⢀0€0€ €Ÿpfam10063, DUF2301, Uncharacterized integral membrane protein (DUF2301). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç㢀0€0€ €“pfam10065, DUF2303, Uncharacterized conserved protein (DUF2303). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Ç䢀0€0€ €’pfam10066, DUF2304, Uncharacterized conserved protein (DUF2304). Members of this family of hypothetical archaeal proteins have no known function.¡€0€ª€0€ €CDD¡€ €Ç墀0€0€ €Œpfam10067, DUF2306, Predicted membrane protein (DUF2306). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Ç梀0€0€ €‚Opfam10069, DICT, Sensory domain in DIguanylate Cyclases and Two-component system. DICT is a sensory domain found associated with GGDEF, EAL, HD-GYP, STAS, and two component systems (histidine-kinase type). It assumes an alpha+beta fold with a 4-stranded beta-sheet and might have a role in light response (Natural history of sensor domains in bacterial signaling systems by Aravind L, LM Iyer, Anantharaman V, from 'Sensory Mechanisms in Bacteria: Molecular Aspects of Signal Recognition.' Caister Academic Press. 2010) - see (http://de.scribd.com/doc/28576661/Bacterial-Signaling-Chapter).¡€0€ª€0€ €CDD¡€ €Ç碀0€0€ €Ÿpfam10070, DUF2309, Uncharacterized protein conserved in bacteria (DUF2309). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Ç袀0€0€ €úpfam10071, DUF2310, Zn-ribbon-containing, possibly nucleic-acid-binding protein (DUF2310). Members of this family of proteobacterial zinc ribbon proteins are thought to bind to nucleic acids, however their exact function has not as yet been defined.¡€0€ª€0€ €CDD¡€ €Ç颀0€0€ €ápfam10073, DUF2312, Uncharacterized protein conserved in bacteria (DUF2312). Members of this family of hypothetical bacterial proteins have no known function. Structural modelling suggests this domain may bind nucleic acids.¡€0€ª€0€ €CDD¡€ €Çꢀ0€0€ €—pfam10074, DUF2285, Uncharacterized conserved protein (DUF2285). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç뢀0€0€ €‚ˆpfam10075, CSN8_PSD8_EIF3K, CSN8/PSMD8/EIF3K family. This domain is conserved from plants to humans. It is a signature protein motif found in components of CSN (COP9 signalosome) where it functions as a structural scaffold for subunit-subunit interactions within the complex and is a key regulator of photomorphogenic development. It is found in Eukaryotic translation initiation factor 3 subunit K, a component of the eukaryotic translation initiation factor 3 (eIF-3) complex required for the initiation of protein synthesis. It is also found in 26S proteasome non-ATPase regulatory subunit 8 (PSMD8), a regulatory subunit of the 26S proteasome.¡€0€ª€0€ €CDD¡€ €Ç좀0€0€ €¸pfam10076, DUF2313, Uncharacterized protein conserved in bacteria (DUF2313). Members of this family of proteins comprise various hypothetical and putative bacteriophage tail proteins.¡€0€ª€0€ €CDD¡€ €Çí¢€0€0€ €‚pfam10077, DUF2314, Uncharacterized protein conserved in bacteria (DUF2314). This domain is found in various bacterial hypothetical proteins, as well as putative ankyrin repeat proteins. The exact function of the domains comprising this family has not, as yet, been determined.¡€0€ª€0€ €CDD¡€ €Ç0€0€ €Ÿpfam10078, DUF2316, Uncharacterized protein conserved in bacteria (DUF2316). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Ç0€0€ €Ãpfam10079, BshC, Bacillithiol biosynthesis BshC. Members of this protein family include BshC, which is an enzyme required for bacillithiol biosynthesis and described as a cysteine-adding enzyme.¡€0€ª€0€ €CDD¡€ €Çð¢€0€0€ €Œpfam10080, DUF2318, Predicted membrane protein (DUF2318). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Çñ¢€0€0€ €‚pfam10081, Abhydrolase_9, Alpha/beta-hydrolase family. This is a family of alpha/beta hydrolases which may function as lipases. This domain is the catalytic domain and includes the catalytic triad and the GXSXG sequence motif which is a characteristic of these enzymes.¡€0€ª€0€ €CDD¡€ €Çò¢€0€0€ €epfam10082, BBP2_2, Putative beta-barrel porin 2. This domain is a putative beta-barrel porin type 2.¡€0€ª€0€ €CDD¡€ €Çó¢€0€0€ €Ÿpfam10083, DUF2321, Uncharacterized protein conserved in bacteria (DUF2321). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €ay¢€0€0€ €Ÿpfam10084, DUF2322, Uncharacterized protein conserved in bacteria (DUF2322). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Çô¢€0€0€ €‚pfam10086, DUF2324, Putative membrane peptidase family (DUF2324). This domain, found in various hypothetical bacterial proteins, has no known function. This family appears to be related to the prenyl protease 2 family pfam02517, suggesting this family may be peptidases.¡€0€ª€0€ €CDD¡€ €Çõ¢€0€0€ €Ÿpfam10087, DUF2325, Uncharacterized protein conserved in bacteria (DUF2325). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Çö¢€0€0€ €£pfam10088, DUF2326, Uncharacterized protein conserved in bacteria (DUF2326). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Ç÷¢€0€0€ €‚kpfam10090, HPTransfase, Histidine phosphotransferase. HPTransfase is a family of essential histidine phosphotransferases. It controls the activity of the master bacterial cell-cycle regulator CtrA through phosphorylation. It behaves as a homodimer by adopting the domain architecture of the intracellular part of class I histidine kinases. Each subunit consists of two distinct domains: an N-terminal helical hairpin domain and a C-terminal [alpha]/[beta] domain. The two N-terminal domains are adjacent within the dimer, forming a four-helix bundle. The C-terminal domain adopts an atypical Bergerat ATP-binding fold.¡€0€ª€0€ €CDD¡€ €Çø¢€0€0€ €‚³pfam10091, Glycoamylase, Putative glucoamylase. The structure of UniProt:Q5LIB7 has an alpha/alpha toroid fold and is similar structurally to a number of glucoamylases. Most of these structural homologs are glucoamylases, involved in breaking down complex sugars (e.g. starch). The biologically relevant state is likely to be monomeric. The putative active site is located at the centre of the toroid with a well defined large cavity.¡€0€ª€0€ €CDD¡€ €Çù¢€0€0€ €Ÿpfam10092, DUF2330, Uncharacterized protein conserved in bacteria (DUF2330). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Çú¢€0€0€ €Ÿpfam10093, DUF2331, Uncharacterized protein conserved in bacteria (DUF2331). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Çû¢€0€0€ €Ÿpfam10094, DUF2332, Uncharacterized protein conserved in bacteria (DUF2332). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Çü¢€0€0€ €Ÿpfam10095, DUF2333, Uncharacterized protein conserved in bacteria (DUF2333). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Çý¢€0€0€ €£pfam10096, DUF2334, Uncharacterized protein conserved in bacteria (DUF2334). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €Çþ¢€0€0€ €Œpfam10097, DUF2335, Predicted membrane protein (DUF2335). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €Çÿ¢€0€0€ €Ÿpfam10098, DUF2336, Uncharacterized protein conserved in bacteria (DUF2336). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €êpfam10099, RskA, Anti-sigma-K factor rskA. This domain, formerly known as DUF2337, is the anti-sigma-K factor, RskA. In Mycobacterium tuberculosis the protein positively regulates expression of the antigenic proteins MPB70 and MPB83.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €Ÿpfam10100, DUF2338, Uncharacterized protein conserved in bacteria (DUF2338). Members of this family of hypothetical bacterial proteins have no known function.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €pfam10101, DUF2339, Predicted membrane protein (DUF2339). This domain, found in various hypothetical bacterial proteins, has no known function.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €‚ pfam10102, DUF2341, Domain of unknown function (DUF2341). Members of this family are found in various bacterial proteins, including MotA/TolQ/ExbB proton channels and other transport proteins. The exact function of this set of domains has not, as yet, been determined.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €ûpfam10103, Zinicin_2, Zinicin-like metallopeptidase. This family of proteins has a conserved HEXXH motif, suggesting they are putative peptidases of zincin fold. The structure of this family has similarity to Peptidase_M1 (pfam01433, Structure 3CMN).¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €‚!pfam10104, Brr6_like_C_C, Di-sulfide bridge nucleocytoplasmic transport domain. Brr6_like_C_C is the highly conserved C-terminal region of a group of proteins found in fungi. It carries four highly conserved cysteine residues. It is suggested that members of the family interact with each other via di-sulfide bridges to form a complex which is involved in nucleocytoplasmic transport. Brr6 in yeast is an essential integral membrane protein of the NE-ER, wit two predicted transmembrane domains, and is a dosage suppressor of Apq12, pfam12716.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €Ãpfam10105, DUF2344, Uncharacterized protein conserved in bacteria (DUF2344). This domain, found in various hypothetical bacterial proteins and Radical Sam domain proteins, has no known function.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €¿pfam10106, DUF2345, Uncharacterized protein conserved in bacteria (DUF2345). Members of this family are found in various bacterial hypothetical proteins, as well as Rhs element Vgr proteins.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €ßpfam10107, Endonuc_Holl, Endonuclease related to archaeal Holliday junction resolvase. This domain is found in various predicted bacterial endonucleases which are distantly related to archaeal Holliday junction resolvases.¡€0€ª€0€ €CDD¡€ €È ¢€0€0€ €¼pfam10108, DNA_pol_B_exo2, Predicted 3'-5' exonuclease related to the exonuclease domain of PolB. This domain is found in various prokaryotic 3'-5' exonucleases and hypothetical proteins.¡€0€ª€0€ €CDD¡€ €a¢€0€0€ €®pfam10109, Phage_TAC_7, Phage tail assembly chaperone proteins, E, or 41 or 14. This is family of various Myoviridae bacteriophage tail assembly chaperone, or TAC, proteins.¡€0€ª€0€ €CDD¡€ €È ¢€0€0€ €Ôpfam10110, GPDPase_memb, Membrane domain of glycerophosphoryl diester phosphodiesterase. Members of this family comprise the membrane domain of the prokaryotic enzyme glycerophosphoryl diester phosphodiesterase.¡€0€ª€0€ €CDD¡€ €È ¢€0€0€ €Êpfam10111, Glyco_tranf_2_2, Glycosyltransferase like family 2. Members of this family of prokaryotic proteins include putative glucosyltransferase, which are involved in bacterial capsule biosynthesis.¡€0€ª€0€ €CDD¡€ €È ¢€0€0€ €Ãpfam10112, Halogen_Hydrol, 5-bromo-4-chloroindolyl phosphate hydrolysis protein. Members of this family of prokaryotic proteins mediate the hydrolysis of 5-bromo-4-chloroindolyl phosphate bonds.¡€0€ª€0€ €CDD¡€ €È ¢€0€0€ €‡pfam10113, Fibrillarin_2, Fibrillarin-like archaeal protein. Members of this family of proteins include archaeal fibrillarin homologs.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €‚ðpfam10114, PocR, Sensory domain found in PocR. PocR, a ligand binding domain, has a novel variant of the PAS-like Fold. Evidence suggests that it binds small hydrocarbon derivatives such as 1,3-propanediol. In (Natural history of sensor domains in bacterial signaling systems by Aravind L, LM Iyer, Anantharaman V, from 'Sensory Mechanisms in Bacteria: Molecular Aspects of Signal Recognition.' Caister Academic Press. 2010) - see (http://de.scribd.com/doc/28576661/Bacterial-Signaling-Chapter).¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €‚pfam10115, HlyU, Transcriptional activator HlyU. This domain, found in various hypothetical prokaryotic proteins, has no known function. One of the sequences in this family corresponds to the transcriptional activator HlyU, indicating a possible similar role in other members.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €´pfam10116, Host_attach, Protein required for attachment to host cells. Members of this family of bacterial proteins are required for the attachment of the bacterium to host cells.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €Øpfam10117, McrBC, McrBC 5-methylcytosine restriction system component. Members of this family of bacterial proteins modify the specificity of mcrB restriction by expanding the range of modified sequences restricted.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €£pfam10118, Metal_hydrol, Predicted metal-dependent hydrolase. Members of this family of proteins comprise various bacterial transition metal-dependent hydrolases.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €äpfam10119, MethyTransf_Reg, Predicted methyltransferase regulatory domain. Members of this family of domains are found in various prokaryotic methyltransferases, where they regulate the activity of the methyltransferase domain.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €‚!pfam10120, ThiP_synth, Thiamine-phosphate synthase. This family is thiamine-phosphate synthase, and it belongs to the SCOP phosphomethylpyrimidine kinase C-terminal domain-like family. Vitamin B1 (thiamine pyrophosphate) is involved in several microbial metabolic functions. Thiamine biosynthesis is accomplished by joining two intermediate molecules that are synthesized separately, HMP-PP and HET-P. In the archaeon Natrialba magadii, ThiE and ThiN, are known to join HMP-PP ( hydroxymethylpyrimidine pyrophosphate) and HET-P (hydroxyethylthiazole phosphate) to generate thiamine phosphate. Whereas ThiE in Natrialba magadii is a mono-functional protein, ThiN exists as a C-terminal domain in a ThiDN fusion protein - examples of all three forms, from various prokaryotes, are found in this family.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €†pfam10122, Mu-like_Com, Mu-like prophage protein Com. Members of this family of proteins comprise the translational regulator of mom.¡€0€ª€0€ €CDD¡€ €a¢€0€0€ €‹pfam10123, Mu-like_Pro, Mu-like prophage I protein. Members of this family of proteins comprise various viral Mu-like prophage I proteins.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €Êpfam10124, Mu-like_gpT, Mu-like prophage major head subunit gpT. Members of this family of proteins comprise various caudoviral prophage proteins, including the Mu-like prophage major head subunit gpT.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €çpfam10125, NADHdeh_related, NADH dehydrogenase I, subunit N related protein. This family comprises a set of NADH dehydrogenase I, subunit N related proteins found in archaea. Their exact function, has not, as yet, been determined.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €ôpfam10126, Nit_Regul_Hom, Uncharacterized protein, homolog of nitrogen regulatory protein PII. This domain, found in various hypothetical archaeal proteins, has no known function. It is distantly similar to the nitrogen regulatory protein PII.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €Þpfam10127, Nuc-transf, Predicted nucleotidyltransferase. Members of this family of bacterial proteins catalyze the transfer of nucleotide residues from nucleoside diphosphates or triphosphates into dimer or polymer forms.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €êpfam10128, OpcA_G6PD_assem, Glucose-6-phosphate dehydrogenase subunit. Members of this family are found in various prokaryotic OpcA and glucose-6-phosphate dehydrogenase proteins. The exact function of the domain is, as yet, unknown.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €pfam10129, OpgC_C, OpgC protein. This domain, found in various hypothetical and OpgC prokaryotic proteins. It is likely to act as an acyltransferase enzyme.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €pfam10130, PIN_2, PIN domain. Members of this family of bacterial domains are predicted to be RNases (from similarities to 5'-exonucleases).¡€0€ª€0€ €CDD¡€ €a¥¢€0€0€ €‚$pfam10131, PTPS_related, 6-pyruvoyl-tetrahydropterin synthase related domain; membrane protein. This domain is found in various bacterial hypothetical membrane proteins, as well as in tetratricopeptide TPR_2 repeat protein. The exact function of the domain has not, as yet, been established.¡€0€ª€0€ €CDD¡€ €a¦¢€0€0€ €Ìpfam10133, RNA_bind_2, Predicted RNA-binding protein. Members of this family of bacterial proteins are thought to have RNA-binding properties, however, their exact function has not, as yet, been defined.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €Çpfam10134, RPA, Replication initiator protein A. Members of this family of bacterial proteins are single-stranded DNA binding proteins that are involved in DNA replication, repair and recombination.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €ƒpfam10135, Rod-binding, Rod binding protein. Members of this family are involved in the assembly of the prokaryotic flagellar rod.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €³pfam10136, SpecificRecomb, Site-specific recombinase. Members of this family of bacterial proteins are found in various putative site-specific recombinase transmembrane proteins.¡€0€ª€0€ €CDD¡€ €È ¢€0€0€ €åpfam10137, TIR-like, Predicted nucleotide-binding protein containing TIR-like domain. Members of this family of bacterial nucleotide-binding proteins contain a TIR-like domain. Their exact function has not, as yet, been defined.¡€0€ª€0€ €CDD¡€ €È!¢€0€0€ €‘pfam10138, vWA-TerF-like, vWA found in TerF C terminus. vWA domain fused to TerD domain typified by the TerF protein. Some times found as solos.¡€0€ª€0€ €CDD¡€ €È"¢€0€0€ €Ñpfam10139, Virul_Fac, Putative bacterial virulence factor. Members of this family of prokaryotic proteins include various putative virulence factor effector proteins. Their exact function is, as yet, unknown.¡€0€ª€0€ €CDD¡€ €È#¢€0€0€ €‚ªpfam10140, YukC, WXG100 protein secretion system (Wss), protein YukC. Members of this family of proteins include predicted membrane proteins homologous to YukC in B. subtilis. The YukC protein family would participate to the formation of a translocon required for the secretion of WXG100 proteins (pfam06013) in monoderm bacteria, the WXG100 protein secretion system (Wss). This family includes EssB in Staphylococcus aureus.¡€0€ª€0€ €CDD¡€ €È$¢€0€0€ €‚pfam10141, ssDNA-exonuc_C, Single-strand DNA-specific exonuclease, C terminal domain. Members of this set of prokaryotic domains are found in a set of single-strand DNA-specific exonucleases, including RecJ. Their exact function has not, as yet, been determined.¡€0€ª€0€ €CDD¡€ €È%¢€0€0€ €·pfam10142, PhoPQ_related, PhoPQ-activated pathogenicity-related protein. Members of this family of bacterial proteins are involved in the virulence of some pathogenic proteobacteria.¡€0€ª€0€ €CDD¡€ €È&¢€0€0€ €‚•pfam10143, PhosphMutase, 2,3-bisphosphoglycerate-independent phosphoglycerate mutase. Members of this family are found in various bacterial 2,3-bisphosphoglycerate-independent phosphoglycerate mutase enzymes, which catalyze the interconversion of 2-phosphoglycerate and 3-phosphoglycerate in the reaction: [2-phospho-D-glycerate + 2,3-diphosphoglycerate = 3-phospho-D-glycerate + 2,3-diphosphoglycerate].¡€0€ª€0€ €CDD¡€ €È'¢€0€0€ €Ápfam10144, SMP_2, Bacterial virulence factor haemolysin. Members of this family of bacterial proteins are membrane proteins that effect the expression of haemolysin under anaerobic conditions.¡€0€ª€0€ €CDD¡€ €È(¢€0€0€ €‹pfam10145, PhageMin_Tail, Phage-related minor tail protein. Members of this family are found in putative phage tail tape measure proteins.¡€0€ª€0€ €CDD¡€ €È)¢€0€0€ €‚gpfam10146, zf-C4H2, Zinc finger-containing protein. This is a family of proteins which appears to have a highly conserved zinc finger domain at the C terminal end, described as -C-X2-CH-X3-H-X5-C-X2-C-. The structure is predicted to contain a coiled coil. Members are annotated as being tumor-associated antigen HCA127 in humans but this could not confirmed.¡€0€ª€0€ €CDD¡€ €È*¢€0€0€ €‚Úpfam10147, CR6_interact, Growth arrest and DNA-damage-inducible proteins-interacting protein 1. Members of this family of proteins act as negative regulators of G1 to S cell cycle phase progression by inhibiting cyclin-dependent kinases. Inhibitory effects are additive with GADD45 proteins but occur also in the absence of GADD45 proteins. Furthermore, they act as a repressor of the orphan nuclear receptor NR4A1 by inhibiting AB domain-mediated transcriptional activity.¡€0€ª€0€ €CDD¡€ €È+¢€0€0€ €¡pfam10148, SCHIP-1, Schwannomin-interacting protein 1. Members of this family are coiled coil protein involved in linking membrane proteins to the cytoskeleton.¡€0€ª€0€ €CDD¡€ €a¶¢€0€0€ €£pfam10149, TM231, Transmembrane protein 231. This is a family of transmembrane proteins, given the number 231, of unknown function. It is conserved in eukaryotes.¡€0€ª€0€ €CDD¡€ €È,¢€0€0€ €‰pfam10150, RNase_E_G, Ribonuclease E/G family. Ribonuclease E and Ribonuclease G are related enzymes that cleave a wide variety of RNAs.¡€0€ª€0€ €CDD¡€ €È-¢€0€0€ €‚9pfam10151, TMEM214, TMEM214, C-terminal, caspase 4 activator. This is the N-terminal domain of transmembrane family 214, from eukaryotes. The family is localized on the endoplasmic reticulum where it recruits procaspase 4 to the ER and subsequently allows this to be cleaved to caspase 4 so leading to apoptosis.¡€0€ª€0€ €CDD¡€ €È.¢€0€0€ €‚>pfam10152, DUF2360, Predicted coiled-coil domain-containing protein (DUF2360). This is the conserved 140 amino acid region of a family of proteins conserved from nematodes to humans. One C. elegans member is annotated as a Daf-16-dependent longevity protein 1 but this could not be confirmed. The function is unknown.¡€0€ª€0€ €CDD¡€ €È/¢€0€0€ €Tpfam10153, Efg1, rRNA-processing protein Efg1. Efg1 is involved in rRNA processing.¡€0€ª€0€ €CDD¡€ €È0¢€0€0€ €pfam10154, DUF2362, Uncharacterized conserved protein (DUF2362). This is a family of proteins conserved from nematodes to humans. The function is not known.¡€0€ª€0€ €CDD¡€ €È1¢€0€0€ €¹pfam10155, DUF2363, Uncharacterized conserved protein (DUF2363). This is a region of 120 amino acids of a family of proteins conserved from plants to humans. The function is not known.¡€0€ª€0€ €CDD¡€ €È2¢€0€0€ €‚ˆpfam10156, Med17, Subunit 17 of Mediator complex. This Mediator complex subunit was formerly known as Srb4 in yeasts or Trap80 in Drosophila and human. The Med17 subunit is located within the head domain and is essential for cell viability to the extent that a mutant strain of cerevisiae lacking it shows all RNA polymerase II-dependent transcription ceasing at non-permissive temperatures.¡€0€ª€0€ €CDD¡€ €È3¢€0€0€ €¡pfam10157, DUF2365, Uncharacterized conserved protein (DUF2365). This is a family of conserved proteins found from nematodes to humans. The function is unknown.¡€0€ª€0€ €CDD¡€ €È4¢€0€0€ €îpfam10158, LOH1CR12, tumor suppressor protein. This is a region of 130 amino acids that is the most conserved region of hypothetical proteins involved in loss of heterozygosity and thus tumor suppression. The exact function is not known.¡€0€ª€0€ €CDD¡€ €a¿¢€0€0€ €‚™pfam10159, MMtag, Kinase phosphorylation protein. This is a glycine-rich domain that is the most highly conserved region of a family of proteins that in vertebrates are associated with tumors in multiple myelomas. The region may contain phosphorylation sites for several protein kinases, as well as N-myristoylation sites and nuclear localization signals, so it might act as a signal molecule in the nucleus.¡€0€ª€0€ €CDD¡€ €È5¢€0€0€ €ßpfam10160, Tmemb_40, Predicted membrane protein. This is a region of 280 amino acids from a group of proteins conserved from plants to humans. It is predicted to be a membrane protein but its function is otherwise unknown.¡€0€ª€0€ €CDD¡€ €È6¢€0€0€ €‚pfam10161, DDDD, Putative mitochondrial precursor protein. This is a family of small conserved proteins found from nematodes to humans. The C-terminal region is rich in asparagine. Members are putatively assigned to be mitochondrial precursor proteins but this could not be confirmed.¡€0€ª€0€ €CDD¡€ €È7¢€0€0€ €Åpfam10162, G8, G8 domain. This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix.¡€0€ª€0€ €CDD¡€ €È8¢€0€0€ €¿pfam10163, EnY2, Transcription factor e(y)2. EnY2 is a small transcription factor which is combined in a complex with the TAFII40 protein. The protein is conserved from paramecium to humans.¡€0€ª€0€ €CDD¡€ €È9¢€0€0€ €‚jpfam10164, DUF2367, Uncharacterized conserved protein (DUF2367). This is a highly conserved family of proteins which contains three pairs of cysteine residues within a length of 42 amino acids and is rich in proline residues towards the N-terminus. The function is unknown. Several members are putatively assigned as brain protein i3 but this was not validated.¡€0€ª€0€ €CDD¡€ €È:¢€0€0€ €Ùpfam10165, Ric8, Guanine nucleotide exchange factor synembryn. Ric8 is involved in the EGL-30 neurotransmitter signalling pathway. It is a guanine nucleotide exchange factor that regulates neurotransmitter secretion.¡€0€ª€0€ €CDD¡€ €È;¢€0€0€ €pfam10166, DUF2368, Uncharacterized conserved protein (DUF2368). This family is conserved from nematodes to humans. The function is not known.¡€0€ª€0€ €CDD¡€ €È<¢€0€0€ €Ûpfam10167, NEP, Uncharacterized conserved protein. This is the N-terminal 80 residues of a family of proteins conserved from plants to humans. It contains a characteristic NEP sequence motif. The function is not known.¡€0€ª€0€ €CDD¡€ €È=¢€0€0€ €‚Vpfam10168, Nup88, Nuclear pore component. Nup88 can be divided into two structural domains; the N-terminal two-thirds of the protein has no obvious structural motifs but is the region for binding to Nup98, one of the components of the nuclear pore. the C-terminal end is a predicted coiled-coil domain. Nup88 is overexpressed in tumor cells.¡€0€ª€0€ €CDD¡€ €È>¢€0€0€ €‚Epfam10169, Laps, Learning-associated protein. This is a family of 121-amino acid secretory proteins. Laps functions in the regulation of neuronal cell adhesion and/or movement and synapse attachment. Laps binds to the ApC/EBP (Aplysia CCAAT/enhancer binding protein) promoter and activates the transcription of ApC/EBP mRNA.¡€0€ª€0€ €CDD¡€ €È?¢€0€0€ €‚cpfam10170, C6_DPF, Cysteine-rich domain. This is the N-terminal approximately 100 amino acids of a family of proteins found from nematodes to humans. It contains between six and eight highly conserved cysteine residues and a characteristic DPF sequence motif. One member is putatively named as receptor for egg jelly protein but this could not confirmed.¡€0€ª€0€ €CDD¡€ €È@¢€0€0€ €pfam10171, DUF2366, Uncharacterized conserved protein (DUF2366). This is a family of proteins conserved from nematodes to humans. The function is not known.¡€0€ª€0€ €CDD¡€ €ÈA¢€0€0€ €èpfam10172, DDA1, Det1 complexing ubiquitin ligase. DDA1 (De-etiolated 1, Damaged DNA binding protein 1 associated 1) protein binds strongly with DDB1 and Det1 forming a DDD complex which is part of the ubiquitin conjugation system.¡€0€ª€0€ €CDD¡€ €ÈB¢€0€0€ €Ûpfam10173, Mit_KHE1, Mitochondrial K+-H+ exchange-related. The members of this family function as mitochondrial potassium-hydrogen exchange transporters. The family is part of a large mitochondrial KHE protein complex.¡€0€ª€0€ €CDD¡€ €ÈC¢€0€0€ €‚‚pfam10174, Cast, RIM-binding protein of the cytomatrix active zone. This is a family of proteins that form part of the CAZ (cytomatrix at the active zone) complex which is involved in determining the site of synaptic vesicle fusion. The C-terminus is a PDZ-binding motif that binds directly to RIM (a small G protein Rab-3A effector). The family also contains four coiled-coil domains.¡€0€ª€0€ €CDD¡€ €ÈD¢€0€0€ €Õpfam10175, MPP6, M-phase phosphoprotein 6. This is a family of M-phase phosphoprotein 6s which is necessary for generation of the 3' end of the 5.8S rRNA precursor. It preferentially binds to poly(C) and poly(U).¡€0€ª€0€ €CDD¡€ €ÈE¢€0€0€ €çpfam10176, DUF2370, Protein of unknown function (DUF2370). This family is conserved from fungi to humans. The human member is annotated as a Golgi-associated protein-Nedd4 WW domain-binding protein but this could not be confirmed.¡€0€ª€0€ €CDD¡€ €ÈF¢€0€0€ €pfam10177, DUF2371, Uncharacterized conserved protein (DUF2371). This is a family of proteins conserved from nematodes to humans. The function is not known.¡€0€ª€0€ €CDD¡€ €ÈG¢€0€0€ €‚ pfam10178, PAC3, Proteasome assembly chaperone 3. PAC3 is a family of eukaryotic proteasome assembly chaperone 3 proteins conserved from fungi to plants to humans. PAC3 plays a crucial part in the assembly of the 20S core proteasome unit, in conjunction with PAC4.¡€0€ª€0€ €CDD¡€ €ÈH¢€0€0€ €­pfam10179, DUF2369, Uncharacterized conserved protein (DUF2369). This is a proline-rich region of a group of proteins found from plants to fungi. The function is not known.¡€0€ª€0€ €CDD¡€ €ÈI¢€0€0€ €¸pfam10180, DUF2373, Uncharacterized conserved protein (DUF2373). This is the C-terminal conserved region of a family of proteins found from fungi to humans. The function is not known.¡€0€ª€0€ €CDD¡€ €ÈJ¢€0€0€ €‚hpfam10181, PIG-H, GPI-GlcNAc transferase complex, PIG-H component. PIG-H is a family of conserved proteins that complexes with three other proteins to form the GPI-GnT (glycosylphosphatidylinositol anchor biosynthesis transferase) complex. It appears to be a peripheral membrane protein facing the cytoplasm involved in the first step in GPI anchor formation.¡€0€ª€0€ €CDD¡€ €ÈK¢€0€0€ €‚mpfam10182, Flo11, Flo11 domain. This presumed domain is found at the N-terminus of the S. cerevisiae Flo11 protein. Flo11 is required for diploid pseudohyphal formation and haploid invasive growth. It belongs to a family of proteins involved in invasive growth, cell-cell adhesion, and mating, many of which can substitute for each other under abnormal conditions.¡€0€ª€0€ €CDD¡€ €ÈL¢€0€0€ €Îpfam10183, ESSS, ESSS subunit of NADH:ubiquinone oxidoreductase (complex I). This subunit is part of the mitochondrial NADH:ubiquinone oxidoreductase (complex I). It carries mitochondrial import sequences.¡€0€ª€0€ €CDD¡€ €ÈM¢€0€0€ €¡pfam10184, DUF2358, Uncharacterized conserved protein (DUF2358). DUF2358 is a family of conserved proteins found from plants to humans. The function is unknown.¡€0€ª€0€ €CDD¡€ €ÈN¢€0€0€ €‚ôpfam10185, Mesd, Chaperone for wingless signalling and trafficking of LDL receptor. Mesd is a family of highly conserved proteins found from nematodes to humans. The final C-terminal residues, KEDL, are the endoplasmic reticulum retention sequence as it is an ER protein specifically required for the intracellular trafficking of members of the low-density lipoprotein family of receptors (LDLRs). The N- and C-terminal sequences are predicted to adopt a random coil conformation, with the exception of an isolated predicted helix within the N-terminal region, The central folded domain flanked by natively unstructured regions is the necessary structure for facilitating maturation of LRP6 (Low-Density Lipoprotein Receptor-Related Protein 6 Maturation).¡€0€ª€0€ €CDD¡€ €ÈO¢€0€0€ €‚ˆpfam10186, Atg14, Vacuolar sorting 38 and autophagy-related subunit 14. The Atg14 or Apg14 proteins are hydrophilic proteins with a predicted molecular mass of 40.5 kDa, and have a coiled-coil motif at the N terminus region. Yeast cells with mutant Atg14 are defective not only in autophagy but also in sorting of carboxypeptidase Y (CPY), a vacuolar-soluble hydrolase, to the vacuole. Subcellular fractionation indicate that Apg14p and Apg6p are peripherally associated with a membrane structure(s). Apg14p was co-immunoprecipitated with Apg6p, suggesting that they form a stable protein complex. These results imply that Apg6/Vps30p has two distinct functions: in the autophagic process and in the vacuolar protein sorting pathway. Apg14p may be a component specifically required for the function of Apg6/Vps30p through the autophagic pathway. There are 17 auto-phagosomal component proteins which are categorized into six functional units, one of which is the AS-PI3K complex (Vps30/Atg6 and Atg14). The AS-PI3K complex and the Atg2-Atg18 complex are essential for nucleation, and the specific function of the AS-PI3K apparently is to produce phosphatidylinositol 3-phosphate (PtdIns(3)P) at the pre-autophagosomal structure (PAS). The localization of this complex at the PAS is controlled by Atg14. Autophagy mediates the cellular response to nutrient deprivation, protein aggregation, and pathogen invasion in humans, and malfunction of autophagy has been implicated in multiple human diseases including cancer. This effect seems to be mediated through direct interaction of the human Atg14 with Beclin 1 in the human phosphatidylinositol 3-kinase class III complex.¡€0€ª€0€ €CDD¡€ €ÈP¢€0€0€ €‚Hpfam10187, Nefa_Nip30_N, N-terminal domain of NEFA-interacting nuclear protein NIP30. This is a the N-terminal 100 amino acids of a family of proteins conserved from plants to humans. The full-length protein has putatively been called NEFA-interacting nuclear protein NIP30, however no reference could be found to confirm this.¡€0€ª€0€ €CDD¡€ €ÈQ¢€0€0€ €‚pfam10188, Oscp1, Organic solute transport protein 1. Oscp1 is a family of proteins conserved from plants to humans. It is called organic solute transport protein or oxido-red- nitro domain-containing protein 1, however no reference could be find to confirm the function of the protein.¡€0€ª€0€ €CDD¡€ €ÈR¢€0€0€ €‚npfam10189, Ints3, Integrator complex subunit 3. The Integrator complex is involved in small nuclear RNA (snRNA) U1 and U2 transcription, and in their 3'-box- dependent processing. This complex associates with the C- terminal domain of RNA polymerase II largest subunit and is recruited to the U1 and U2 snRNAs genes. This entry represents subunit 3 of this complex.¡€0€ª€0€ €CDD¡€ €ÈS¢€0€0€ €÷pfam10190, Tmemb_170, Putative transmembrane protein 170. Tmem170 is a family of putative transmembrane proteins conserved from fungi to nematodes to humans. The protein is only of approximately 130 amino acids in length. The function is unknown.¡€0€ª€0€ €CDD¡€ €ÈT¢€0€0€ €ñpfam10191, COG7, Golgi complex component 7 (COG7). COG7 is a component of the conserved oligomeric Golgi complex which is required for normal Golgi morphology and localization. Mutation in COG7 causes a congenital disorder of glycosylation.¡€0€ª€0€ €CDD¡€ €aࢀ0€0€ €‚°pfam10192, GpcrRhopsn4, Rhodopsin-like GPCR transmembrane domain. This region of 270 amino acids is the seven transmembrane alpha-helical domains included within five GPCRRHODOPSN4 motifs of a G-protein-coupled-receptor (GPCR) protein, conserved from nematodes to humans. GPCRs are integral membrane receptors whose intracellular actions are mediated by signalling pathways involving G proteins and downstream secondary messengers.¡€0€ª€0€ €CDD¡€ €ÈU¢€0€0€ €‚˜pfam10193, Telomere_reg-2, Telomere length regulation protein. This family is the central conserved 110 amino acid region of a group of proteins called telomere-length regulation or clock abnormal protein-2 which are conserved from plants to humans. The full-length protein regulates telomere length and contributes to silencing of sub-telomeric regions. In vitro the protein binds to telomeric DNA repeats.¡€0€ª€0€ €CDD¡€ €ÈV¢€0€0€ €‚ppfam10195, Phospho_p8, DNA-binding nuclear phosphoprotein p8. P8 is a short 80-82 amino acid protein that is conserved from nematodes to humans. It carries at least one protein kinase C domain suggesting a possible role in signal transduction and it is thought to be a phosphoprotein, but the sites of phosphorylation and the kinases involved remain to be determined.¡€0€ª€0€ €CDD¡€ €ÈW¢€0€0€ €‚pfam10197, Cir_N, N-terminal domain of CBF1 interacting co-repressor CIR. This is a 45 residue conserved region at the N-terminal end of a family of proteins referred to as CIRs (CBF1-interacting co-repressors). CBF1 (centromere-binding factor 1) acts as a transcription factor that causes repression by binding specifically to GTGGGAA motifs in responsive promoters, and it requires CIR as a co-repressor. CIR binds to histone deacetylase and to SAP30 and serves as a linker between CBF1 and the histone deacetylase complex.¡€0€ª€0€ €CDD¡€ €ÈX¢€0€0€ €‚½pfam10198, Ada3, Histone acetyltransferases subunit 3. Ada3 is a family of proteins conserved from yeasts to humans. It is an essential component of the Ada transcriptional coactivator (alteration/deficiency in activation) complex. Ada3 plays a key role in linking histone acetyltransferase-containing complexes to p53 (tumor suppressor protein) thereby regulating p53 acetylation, stability and transcriptional activation following DNA damage.¡€0€ª€0€ €CDD¡€ €ÈY¢€0€0€ €‚Üpfam10199, Adaptin_binding, Alpha and gamma adaptin binding protein p34. p34 is a protein involved in membrane trafficking. It is known to interact with both alpha and gamma adaptin. It has been speculated that p34 may play a chaperone role such as preventing the soluble adaptors from co-assembling with soluble clathrin, or helping to remove the adaptors from the coated vesicle. Another possible function is in aiding the recruitment of soluble adaptors onto the membrane.¡€0€ª€0€ €CDD¡€ €ÈZ¢€0€0€ €‚2pfam10200, Ndufs5, NADH:ubiquinone oxidoreductase, NDUFS5-15kDa. This is a family of short, approximately 105 amino acid residue, proteins which form part of NADH:ubiquinone oxidoreductase complex I. Complex I is the first multisubunit inner membrane protein complex of the mitochondrial electron transport chain and it transfers two electrons from NADH to ubiquinone. The protein carries four highly conserved cysteine residues but these do not appear to be in a configuration which would favour metal binding so the exact function of the protein is uncertain.¡€0€ª€0€ €CDD¡€ €a碀0€0€ €‚Wpfam10203, Pet191_N, Cytochrome c oxidase assembly protein PET191. Pet191_N is the conserved N-terminal of a family of conserved proteins found from nematodes to humans. It carries six highly conserved cysteine residues. Pet191 is required for the assembly of active cytochrome c oxidase but does not form part of the final assembled complex.¡€0€ª€0€ €CDD¡€ €È[¢€0€0€ €‚Žpfam10204, DuoxA, Dual oxidase maturation factor. DuoxA (Dual oxidase maturation factor) is the essential protein necessary for the final release of DUOX2 (an NADPH:O2 oxidoreductase flavoprotein) from the endoplasmic reticulum. Dual oxidases (DUOX1 and DUOX2) constitute the catalytic core of the hydrogen peroxide generator, which generates H2O2 at the apical membrane of thyroid follicular cells, essential for iodination of thyroglobulin by thyroid peroxidases. DuoxA carries five membrane-integral regions including a reverse signal-anchor with external N-terminus (type III) and two N-glycosylation sites. It is conserved from nematodes to humans.¡€0€ª€0€ €CDD¡€ €È\¢€0€0€ €ùpfam10205, KLRAQ, Predicted coiled-coil domain-containing protein. This is the N-terminal 100 amino acid domain of a family of proteins conserved from nematodes to humans. It carries a characteristic KLRAQ sequence-motif. The function is not known.¡€0€ª€0€ €CDD¡€ €È]¢€0€0€ €‚«pfam10206, WRW, Mitochondrial F1F0-ATP synthase, subunit f. This is a family of small proteins of approximately 110 amino acids, which are highly conserved from nematodes to humans. Some members of the family have been annotated in Swiss-Prot as being the f subunit of mitochondrial F1F0-ATP synthase but this could not be confirmed. The sequence has a well-conserved WRW motif. The exact function of the protein is not known.¡€0€ª€0€ €CDD¡€ €È^¢€0€0€ €‚¬pfam10208, Armet, Degradation arginine-rich protein for mis-folding. This is a family of small proteins of approximately 170 residues which contain four di-sulfide bridges that are highly conserved from nematodes to humans. Armet is a soluble protein resident in the endoplasmic reticulum and induced by ER stress. It appears to be involved with dealing with mis-folded proteins in the ER, thus in quality control of ER stress.¡€0€ª€0€ €CDD¡€ €È_¢€0€0€ €špfam10209, DUF2340, Uncharacterized conserved protein (DUF2340). This is a family of small proteins of approximately 150 amino acids of unknown function.¡€0€ª€0€ €CDD¡€ €È`¢€0€0€ €ÿpfam10210, MRP-S32, Mitochondrial 28S ribosomal protein S32. This entry is of a family of short, approximately 100 amino acid residues, proteins which are mitochondrial 28S ribosomal proteins named as MRP-S32. Their exact function could not be confirmed.¡€0€ª€0€ €CDD¡€ €Èa¢€0€0€ €‚ pfam10211, Ax_dynein_light, Axonemal dynein light chain. Axonemal dynein light chain proteins play a dynamic role in flagellar and cilia motility. Eukaryotic cilia and flagella are complex organelles consisting of a core structure, the axoneme, which is composed of nine microtubule doublets forming a cylinder that surrounds a pair of central singlet microtubules. This ultra-structural arrangement seems to be one of the most stable micro-tubular assemblies known and is responsible for the flagellar and ciliary movement of a large number of organisms ranging from protozoan to mammals. This light chain interacts directly with the N-terminal half of the heavy chains.¡€0€ª€0€ €CDD¡€ €Èb¢€0€0€ €‚pfam10212, TTKRSYEDQ, Predicted coiled-coil domain-containing protein. This is the C-terminal 500 amino acids of a family of proteins with a predicted coiled-coil domain conserved from nematodes to humans. It carries a characteristic TTKRSYEDQ sequence-motif. The function is not known.¡€0€ª€0€ €CDD¡€ €Èc¢€0€0€ €‚èpfam10213, MRP-S28, Mitochondrial ribosomal subunit protein. This is a conserved region of approx. 125 residues of one of the proteins that makes up the small subunit of the mitochondrial ribosome. In Saccharomyces cerevisiae the protein is MRP-S24 whereas in humans it is MRP-S28. The human mitochondrial ribosome has 29 distinct proteins in the small subunit and these have homologs in, for example, Drosophila melanogaster, Caenorhabditis elegans, and in the genomes of several fungi.¡€0€ª€0€ €CDD¡€ €añ¢€0€0€ €‚0pfam10214, Rrn6, RNA polymerase I-specific transcription-initiation factor. RNA polymerase I-specific transcription-initiation factor Rrn6 and Rrn7 represent components of a multisubunit transcription factor essential for the initiation of rDNA transcription by Pol I. These proteins are found in fungi.¡€0€ª€0€ €CDD¡€ €Èd¢€0€0€ €‚ÿpfam10215, Ost4, Oligosaccaryltransferase. Ost4 is a very short, approximately 30 residues, enzyme found from fungi to vertebrates. It is a member of the ER oligosaccaryltansferase complex, EC 2.4.1.119, that catalyzes the asparagine-linked glycosylation of proteins. It appears to be an integral membrane protein that mediates the en bloc transfer of a preassembled high-mannose oligosaccharide onto asparagine residues of nascent polypeptides as they enter the lumen of the rough endoplasmic reticulum (RER).¡€0€ª€0€ €CDD¡€ €Èe¢€0€0€ €‚hpfam10216, ChpXY, CO2 hydration protein (ChpXY). This small family of proteins includes paralogues ChpX and ChpY in Synechococcus sp. PCC7942 and other cyanobacteria, associated with distinct NAD(P)H dehydrogenase complexes. These proteins collectively enable light-dependent CO2 hydration and CO2 uptake; loss of both blocks growth at low CO2 concentrations.¡€0€ª€0€ €CDD¡€ €Èf¢€0€0€ €îpfam10217, DUF2039, Uncharacterized conserved protein (DUF2039). This entry is a region of approximately 100 residues containing three pairs of cysteine residues. The region is conserved from plants to humans but its function is unknown.¡€0€ª€0€ €CDD¡€ €Èg¢€0€0€ €Þpfam10218, DUF2054, Uncharacterized conserved protein (DUF2054). This entry contains 14 conserved cysteines, three of which are CC-dimers. The region is of approximately 200 residues in length but its function is unknown.¡€0€ª€0€ €CDD¡€ €Èh¢€0€0€ €ïpfam10220, Smg8_Smg9, Smg8_Smg9. Smg8 and Smg9 are two subunits of the Smg-1 complex. They suppress Smg-1 kinase activity in the isolated Smg-1 complex, and are involved in nonsense-mediated mRNA decay (NMD) in both mammals and nematodes.¡€0€ª€0€ €CDD¡€ €Èi¢€0€0€ €‚Epfam10221, DUF2151, Cell cycle and development regulator. This is a set of proteins conserved from worms to humans. The proteins are a PAN GU kinase substrate, Mat89Bb, essential for S-M cycles of early Drosophila embryogenesis, Xenopus embryonic cell cycles and morphogenesis, and cell division in cultured mammalian cells.¡€0€ª€0€ €CDD¡€ €Èj¢€0€0€ €—pfam10222, DUF2152, Uncharacterized conserved protein (DUF2152). This is a family of proteins conserved from worms to humans. Its function is unknown.¡€0€ª€0€ €CDD¡€ €aù¢€0€0€ €§pfam10223, DUF2181, Uncharacterized conserved protein (DUF2181). This is region of approximately 250 residues conserved from worms to humans. Its function is unknown.¡€0€ª€0€ €CDD¡€ €Èk¢€0€0€ €Æpfam10224, DUF2205, Predicted coiled-coil protein (DUF2205). This entry represent a highly conserved 100 residue region which is likely to be a coiled-coil structure. The exact function is unknown.¡€0€ª€0€ €CDD¡€ €Èl¢€0€0€ €ípfam10225, NEMP, NEMP family. This entry includes a group of nuclear envelope integral membrane proteins from animals and plants, including NEMP1 from Xenopus laevis. NEMP1 is a RanGTP-binding protein and is involved in eye development.¡€0€ª€0€ €CDD¡€ €Èm¢€0€0€ €‚/pfam10226, CCDC85, CCDC85 family. This entry includes human CCDC85A/B/C and C. elegans Picc-1 protein. Picc-1 serves as a linker protein which helps to recruit the Rho GTPase-activating protein, pac-1, to adherens junctions. Human CCDC85B suppresses the beta-catenin activity in a p53-dependent manner.¡€0€ª€0€ €CDD¡€ €Èn¢€0€0€ €¢pfam10228, DUF2228, Uncharacterized conserved protein (DUF2228). This is a family of conserved proteins of approximately 700 residues found from worms to humans.¡€0€ª€0€ €CDD¡€ €Èo¢€0€0€ €ípfam10229, MMADHC, Methylmalonic aciduria and homocystinuria type D protein. This entry represents methylmalonic aciduria and homocystinuria type D protein and homologs. These proteins are involved in cobalamin (vitamin B12) metabolism.¡€0€ª€0€ €CDD¡€ €Èp¢€0€0€ €¼pfam10230, LIDHydrolase, Lipid-droplet associated hydrolase. This family of proteins is conserved from plants to humans. The function is as a lipid-droplet hydrolase in the yeast members.¡€0€ª€0€ €CDD¡€ €Èq¢€0€0€ €¥pfam10231, DUF2315, Uncharacterized conserved protein (DUF2315). This is a family of small conserved proteins found from worms to humans. The function is not known.¡€0€ª€0€ €CDD¡€ €Èr¢€0€0€ €‚Ýpfam10232, Med8, Mediator of RNA polymerase II transcription complex subunit 8. Arc32, or Med8, is one of the subunits of the Mediator complex of RNA polymerase II. The region conserved contains two alpha helices putatively necessary for binding to other subunits within the core of the Mediator complex. The N-terminus of Med8 binds to the essential core Head part of Mediator and the C-terminus hinges to Med18 on the non-essential part of the Head that also includes Med20.¡€0€ª€0€ €CDD¡€ €Ès¢€0€0€ €‚‹pfam10233, Cg6151-P, Uncharacterized conserved protein CG6151-P. This is a family of small, less than 200 residue long, proteins which are named as CG6151-P proteins that are conserved from fungi to humans. The function is unknown. The fungal members have a characteristic ICP sequence motif. Some members are annotated as putative clathrin-coated vesicle protein but this could not be defined.¡€0€ª€0€ €CDD¡€ €Èt¢€0€0€ €‚pfam10234, Cluap1, Clusterin-associated protein-1. This protein is conserved from worms to humans. The protein of 413 amino acids contains a central coiled-coil domain, possibly the region that binds to clusterin. Cluap1 expression is highest in the nucleus and gradually increases during late S to G2/M phases of the cell cycle and returns to the basal level in the G0/G1 phases. In addition, it is upregulated in colon cancer tissues compared to corresponding non-cancerous mucosa. It thus plays a crucial role in the life of the cell.¡€0€ª€0€ €CDD¡€ €Èu¢€0€0€ €‚Àpfam10235, Cript, Microtubule-associated protein CRIPT. The CRIPT protein is a cytoskeletal protein involved in microtubule production. The C-terminal domain is essential for binding to the PDZ3 domain of the SAP90 protein, one of a super-family of PDZ-containing proteins that play an important role in coupling the membrane ion channels with their signalling partners. SAP90 is concentrated in the post synaptic density of glutamatergic neurons.¡€0€ª€0€ €CDD¡€ €Èv¢€0€0€ €‚Ípfam10236, DAP3, Mitochondrial ribosomal death-associated protein 3. This is a family of conserved proteins which were originally described as death-associated-protein-3 (DAP-3). The proteins carry a P-loop DNA-binding motif, and induce apoptosis. DAP3 has been shown to be a pro-apoptotic factor in the mitochondrial matrix and to be crucial for mitochondrial biogenesis and so has also been designated as MRP-S29 (mitochondrial ribosomal protein subunit 29).¡€0€ª€0€ €CDD¡€ €Èw¢€0€0€ €‚opfam10237, N6-adenineMlase, Probable N6-adenine methyltransferase. This is a protein of approximately 200 residues which is conserved from plants to humans. It contains a highly conserved QFW motif close to the N-terminus and a DPPF motif in the centre. The DPPF motif is characteristic of N-6 adenine-specific DNA methylases, and this family is found in eukaryotes.¡€0€ª€0€ €CDD¡€ €Èx¢€0€0€ €‚Npfam10238, Eapp_C, E2F-associated phosphoprotein. This entry represents the conserved C-terminal portion of an E2F binding protein. E2F transcription factors play an essential role in cell proliferation and apoptosis and their activity is frequently deregulated in human cancers. E2F activity is regulated by a variety of mechanisms, frequently mediated by proteins binding to individual members or a subgroup of the family. EAPP interacts with a subset of E2F factors and influences E2F-dependent promoter activity. EAPP is present throughout the cell cycle but disappears during mitosis.¡€0€ª€0€ €CDD¡€ €Èy¢€0€0€ €Ïpfam10239, DUF2465, Protein of unknown function (DUF2465). FAM98A and B proteins are found from worms to humans but their function is unknown. This entry is of a family of proteins that is rich in glycines.¡€0€ª€0€ €CDD¡€ €Èz¢€0€0€ €‚Upfam10240, DUF2464, Multivesicular body subunit 12. MVB12A (also known as CFBP) and MVB12B are subunits of the ESCRT-I complex, which mediates the sorting of ubiquitinated cargo protein from the plasma membrane to the endosomal vesicle. MVB12A plays a key role in the ligand-mediated internalization and down-regulation of the EGF receptor.¡€0€ª€0€ €CDD¡€ €È{¢€0€0€ €ßpfam10241, KxDL, Uncharacterized conserved protein. This is a family of short proteins which are conserved over a region of 80 residues. There is a characteristic KxDL motif towards the C-terminus. The function is unknown.¡€0€ª€0€ €CDD¡€ €È|¢€0€0€ €‚pfam10242, L_HMGIC_fpl, Lipoma HMGIC fusion partner-like protein. This is a group of proteins expressed from a series of genes referred to as Lipoma HMGIC fusion partner-like. The proteins carry four highly conserved transmembrane domains in this entry. In certain instances, eg in LHFPL5, mutations cause deafness in humans and hypospadias, and LHFPL1 is transcribed in six liver tumor cell lines.¡€0€ª€0€ €CDD¡€ €b ¢€0€0€ €‚Cpfam10243, MIP-T3, Microtubule-binding protein MIP-T3. This protein, which interacts with both microtubules and TRAF3 (tumor necrosis factor receptor-associated factor 3), is conserved from worms to humans. The N-terminal region is the microtubule binding domain and is well-conserved; the C-terminal 100 residues, also well-conserved, constitute the coiled-coil region which binds to TRAF3. The central region of the protein is rich in lysine and glutamic acid and carries KKE motifs which may also be necessary for tubulin-binding, but this region is the least well-conserved.¡€0€ª€0€ €CDD¡€ €È}¢€0€0€ €æpfam10244, MRP-L51, Mitochondrial ribosomal subunit. MRP-L51 is a family of small proteins from the intact 55 S mitochondrial ribosome. It has otherwise been referred to as bMRP-64. The exact function of this family is not known.¡€0€ª€0€ €CDD¡€ €È~¢€0€0€ €‚úpfam10245, MRP-S22, Mitochondrial 28S ribosomal protein S22. This is the conserved N-terminus and central portion of the mitochondrial small subunit 28S ribosomal protein S22. Mammalian mitochondria carry out the synthesis of 13 polypeptides that are essential for oxidative phosphorylation and, hence, for the synthesis of the majority of the ATP used by eukaryotic organisms. The number of proteins produced by prokaryotes is smaller, reflected in the lower number of ribosomal proteins present in them.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €‚Ppfam10246, MRP-S35, Mitochondrial ribosomal protein MRP-S35. This is a family of short mitochondrial ribosomal proteins, less than 200 amino acids long. that are highly conserved from worms to humans. The structure has previously been referred to as MRP-S18 but the current numbering fits the preferred nomenclature from these authors.¡€0€ª€0€ €CDD¡€ €È€¢€0€0€ €‚pfam10247, Romo1, Reactive mitochondrial oxygen species modulator 1. This is a family of small, approximately 100 amino acid, proteins found from yeasts to humans. The majority of endogenous reactive oxygen species (ROS) in cells are produced by the mitochondrial respiratory chain. An increase or imbalance in ROS alters the intracellular redox homeostasis, triggers DNA damage, and may contribute to cancer development and progression. Members of this family are mitochondrial reactive oxygen species modulator 1 (Romo1) proteins that are responsible for increasing the level of ROS in cells. Increased Romo1 expression can have a number of other effects including: inducing premature senescence of cultured human fibroblasts and increased resistance to 5-fluorouracil.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €‚Apfam10248, Mlf1IP, Myelodysplasia-myeloid leukemia factor 1-interacting protein. This entry is the conserved central region of a group of proteins that are putative transcriptional repressors. The structure contains a putative 14-3-3 binding motif involved in the subcellular localization of various regulatory molecules, and it may be that interaction with the transcription factor DREF could be regulated through this motif. DREF regulates proliferation-related genes in Drosophila. Mlf1IP is expressed in both the nuclei and the cytoplasm and thus may have multi-functions.¡€0€ª€0€ €CDD¡€ €È‚¢€0€0€ €‚pfam10249, NDUFB10, NADH-ubiquinone oxidoreductase subunit 10. NDUFB10 is a family of conserved proteins of up to 180 residues. It is one of the 41 protein subunits within the hydrophobic fraction of the NADH:ubiquinone oxidoreductase (complex I), a multiprotein complex located in the inner mitochondrial membrane whose main function is the transport of electrons from NADH to ubiquinone, which is accompanied by translocation of protons from the mitochondrial matrix to the intermembrane space. NDUFB10 is encoded in the nucleus.¡€0€ª€0€ €CDD¡€ €ȃ¢€0€0€ €‚*pfam10250, O-FucT, GDP-fucose protein O-fucosyltransferase. This is a family of conserved proteins representing the enzyme responsible for adding O-fucose to EGF (epidermal growth factor-like) repeats. Six highly conserved cysteines are present in O-FucT-1 as well as a DXD-like motif (ERD), conserved in mammals, Drosophila, and C. elegans. Both features are characteristic of several glycosyltransferase families. The enzyme is a membrane-bound protein released by proteolysis and, as for most glycosyltransferases, is strongly activated by manganese.¡€0€ª€0€ €CDD¡€ €È„¢€0€0€ €‚Upfam10251, PEN-2, Presenilin enhancer-2 subunit of gamma secretase. This entry is a short 101 peptide protein which is the smallest subunit of the gamma-secretase aspartyl protease complex that catalyzes the intramembrane cleavage of a subset of type I transmembrane proteins. The other active constituents of the complex are presenilin (PS) nicastrin and anterior pharynx defective-1 (APH-1) protein. PEN-2 adopts a hairpin orientation in the membrane with its N- and C-terminal domains facing the luminal/extracellular space, and the C-terminal domain maintains PS stability within the complex.¡€0€ª€0€ €CDD¡€ €È…¢€0€0€ €‚_pfam10252, PP28, Casein kinase substrate phosphoprotein PP28. This domain is a region of 70 residues conserved in proteins from plants to humans and contains a serine/arginine rich motif. In rats the full protein is a casein kinase substrate, and this region contains phosphorylation sites for both cAMP-dependent protein kinase and casein kinase II.¡€0€ª€0€ €CDD¡€ €Ȇ¢€0€0€ €‚Npfam10253, PRCC, Mitotic checkpoint regulator, MAD2B-interacting. This family constitutes the major, conserved, portion of PRCC proteins. In humans this family interacts with MAD2B, the mitotic checkpoint protein. In Schizosaccharomyces pombe this protein is part of the Cwf-complex that is known to be involved in pre-mRNA splicing.¡€0€ª€0€ €CDD¡€ €ȇ¢€0€0€ €‚Ìpfam10254, Pacs-1, PACS-1 cytosolic sorting protein. PACS-1 is a cytosolic sorting protein that directs the localization of membrane proteins in the trans-Golgi network (TGN)/endosomal system. PACS-1 connects the clathrin adaptor AP-1 to acidic cluster sorting motifs contained in the cytoplasmic domain of cargo proteins such as furin, the cation-independent mannose-6-phosphate receptor and in viral proteins such as human immunodeficiency virus type 1 Nef.¡€0€ª€0€ €CDD¡€ €Ȉ¢€0€0€ €épfam10255, Paf67, RNA polymerase I-associated factor PAF67. RNA polymerase I is a multisubunit enzyme and its transcription competence is dependent on the presence of PAF67. This family of proteins is conserved from worms to humans.¡€0€ª€0€ €CDD¡€ €ȉ¢€0€0€ €ªpfam10256, Erf4, Golgin subfamily A member 7/ERF4 family. This family of proteins includes Golgin subfamily A member 7 proteins as well as Ras modification protein ERF4.¡€0€ª€0€ €CDD¡€ €ÈŠ¢€0€0€ €‚pfam10257, RAI16-like, Retinoic acid induced 16-like protein. This is the conserved N-terminal 450 residues of a family of proteins described as retinoic acid-induced protein 16-like proteins. The exact function is not known. The proteins are found from worms to humans.¡€0€ª€0€ €CDD¡€ €È‹¢€0€0€ €‚7pfam10258, RNA_GG_bind, PHAX RNA-binding domain. RNA_GG_bind is the highly conserved U3 snoRNA-binding domain of PHAX (phosphorylated adaptor for RNA export) whose function is to transport U3 snoRNA from the nucleus after transcription. It is characterized by having two pairs of adjacent glycines, as GGx12GG.¡€0€ª€0€ €CDD¡€ €ÈŒ¢€0€0€ €‚rpfam10259, Rogdi_lz, Rogdi leucine zipper containing protein. This is a family of conserved proteins which have been suggested as containing leucine-zipper domains. A leucine zipper domain is a region of 30 amino acids with leucines repeating every seven or eight residues; these proteins do have many such leucines. The protein in Drosophila comes from the gene ROGDI.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €ápfam10260, SAYSvFN, Uncharacterized conserved domain (SAYSvFN). This domain of approximately 75 residues contains a highly conserved SATSv/iFN motif. The function is unknown but the domain is conserved from plants to humans.¡€0€ª€0€ €CDD¡€ €ÈŽ¢€0€0€ €‚Œpfam10261, Scs3p, Inositol phospholipid synthesis and fat-storage-inducing TM. This is a family of transmembrane proteins which are variously annotated as possibly being inositol phospholipid synthesis protein and fat-storage-inducing. The members are conserved from yeasts to humans and are localized to the endoplasmic reticulum where they are involved in triglyceride lipid droplet formation.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €‚pfam10262, Rdx, Rdx family. This entry is an approximately 100 residue region of selenoprotein-T, conserved from plants to humans. The protein binds to UDP-glucose:glycoprotein glucosyltransferase (UGTR), the endoplasmic reticulum (ER)-resident protein, which is known to be involved in the quality control of protein folding. Selenium (Se) plays an essential role in cell survival and most of the effects of Se are probably mediated by selenoproteins, including selenoprotein T. However, despite its binding to UGTR and that its mRNA is up-regulated in extended asphyxia, the function of the protein and hence of this region of it is unknown. Selenoprotein W contains selenium as selenocysteine in the primary protein structure and levels of this selenoprotein are affected by selenium.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €èpfam10263, SprT-like, SprT-like family. This family represents a domain found in eukaryotes and prokaryotes. The domain contains a characteristic motif of the zinc metallopeptidases. This family includes the bacterial SprT protein.¡€0€ª€0€ €CDD¡€ €È‘¢€0€0€ €‚1pfam10264, Stork_head, Winged helix Storkhead-box1 domain. This is the conserved N-terminal winged helix domain of Storkhead-box1 protein which is likely to be a DNA binding domain. In humans the full-length protein controls polyploidization of extravillus trophoblast and is implicated in pre-eclampsia.¡€0€ª€0€ €CDD¡€ €È’¢€0€0€ €‚Spfam10265, Miga, Mitoguardin. Mitoguardin (Miga) was first identified in flies as a mitochondrial outer-membrane protein that promotes mitochondrial fusion. Later, the mammalian Miga homologs, Miga1 and Miga2, were identified. They are found to promote mitochondrial fusion by regulating mitochondrial phospholipid metabolism via MitoPLD.¡€0€ª€0€ €CDD¡€ €È“¢€0€0€ €‚¹pfam10266, Strumpellin, Hereditary spastic paraplegia protein strumpellin. This is a family of proteins conserved from plants to humans, in which two closely situated point mutations in the human protein lead to the condition of hereditary spastic paraplegia. Strumpellin contains one known domain called a spectrin repeat that consists of three alpha-helices of a characteristic length wrapped in a left-handed coiled coil. The spectrin proteins have multiple copies of this repeat, which can then form multimers in the cell. Spectrin associates with the cell membrane via spectrin repeats in the ankyrin protein. The spectrin repeat is a structural platform for cytoskeletal protein assemblies.¡€0€ª€0€ €CDD¡€ €È”¢€0€0€ €Âpfam10267, Tmemb_cc2, Predicted transmembrane and coiled-coil 2 protein. This family of transmembrane coiled-coil containing proteins is conserved from worms to humans. Its function is unknown.¡€0€ª€0€ €CDD¡€ €È•¢€0€0€ €âpfam10268, Tmemb_161AB, Predicted transmembrane protein 161AB. Transmemb_161AB is a family of conserved proteins found from worms to humans. Members are putative transmembrane proteins but otherwise the function is not known.¡€0€ª€0€ €CDD¡€ €È–¢€0€0€ €‚4pfam10269, Tmemb_185A, Transmembrane Fragile-X-F protein. This is a family of conserved transmembrane proteins that appear in humans to be expressed from a region upstream of the FragileXF site and to be intimately linked with the Fragile-X syndrome. Absence of TMEM185A does not necessarily lead to developmental delay, but might in combination with other, yet unknown, factors. Otherwise, the lack of the TMEM185A protein is either disposable (redundant) or its function can be complemented by the highly similar chromosome 2 retro-pseudogene product, TMEM185B.¡€0€ª€0€ €CDD¡€ €È—¢€0€0€ €‚ˆpfam10270, MMgT, Membrane magnesium transporter. This entry represents a novel family of membrane magnesium transporters (MMgT). The proteins, MMgT1 and MMgT2, are localized to the Golgi complex and post-Golgi vesicles, including the early endosomes, suggesting that they may provide regulated pathways for Mg(2+) transport in the Golgi and post-Golgi organelles of epithelium-derived cells.¡€0€ª€0€ €CDD¡€ €Ș¢€0€0€ €»pfam10271, Tmp39, Putative transmembrane protein. This is a family of conserved proteins found from worms to humans. They are putative transmembrane proteins but the function is unknown.¡€0€ª€0€ €CDD¡€ €È™¢€0€0€ €Úpfam10272, Tmpp129, Putative transmembrane protein precursor. This is a family of proteins conserved from worms to humans. The proteins are purported to be transmembrane protein-precursors but the function is unknown.¡€0€ª€0€ €CDD¡€ €Èš¢€0€0€ €ñpfam10273, WGG, Pre-rRNA-processing protein TSR2. This entry represents the central conserved section of a family of proteins described as pre-rRNA-processing protein TSR2. The region has a distinctive WGG motif but the function is unknown.¡€0€ª€0€ €CDD¡€ €È›¢€0€0€ €‚±pfam10274, ParcG, Parkin co-regulated protein. This family of proteins is transcribed anti-sense along the DNA to the Parkin gene product and the two appear to be transcribed under the same promoter. The protein has predicted alpha-helical and beta-sheet domains which suggest its function is in the ubiquitin/proteasome system. Mutations in parkin are the genetic cause of early-onset and autosomal recessive juvenile parkinsonism.¡€0€ª€0€ €CDD¡€ €瀢€0€0€ €‚ pfam10275, Peptidase_C65, Peptidase C65 Otubain. This family of proteins conserved from plants to humans is a highly specific ubiquitin iso-peptidase that removes ubiquitin from proteins. The modification of cellular proteins by ubiquitin (Ub) is an important event that underlies protein stability and function in eukaryote being a dynamic and reversible process. Otubain carries several key conserved domains: (i) the OTU (ovarian tumor domain) in which there is an active cysteine protease triad (ii) a nuclear localization signal, (iii) a Ub interaction motif (UIM)-like motif phi-xx-A-xxxs-xx-Ac (where phi indicates an aromatic amino acid, x indicates any amino acid and Ac indicates an acidic amino acid), (iv) a Ub-associated (UBA)-like domain and (v) the LxxLL motif.¡€0€ª€0€ €CDD¡€ €Èœ¢€0€0€ €~pfam10276, zf-CHCC, Zinc-finger domain. This is a short zinc-finger domain conserved from fungi to humans. It is Cx8Hx14Cx2C.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €‚pfam10277, Frag1, Frag1/DRAM/Sfk1 family. This family includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, viz CILW, CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function. Dram is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an Endoplasmic Reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumor-suppressor gene that is frequently mutated in cancer. Expression of Dram is stress-induced. This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localization of Stt4p to the actin cytoskeleton.¡€0€ª€0€ €CDD¡€ €Èž¢€0€0€ €‚ pfam10278, Med19, Mediator of RNA pol II transcription subunit 19. Med19 represents a family of conserved proteins which are members of the multi-protein co-activator Mediator complex. Mediator is required for activation of RNA polymerase II transcription by DNA binding transactivators.¡€0€ª€0€ €CDD¡€ €b/¢€0€0€ €‚:pfam10279, Latarcin, Latarcin precursor. This family represents the precursor proteins for a number of short antimicrobial peptides called Latarcins. Latarcins were discovered in the venom of the spider Lachesana tarabaevi. Latarcins are likely to adopt amphipathic alpha-helical structure in the plasma membrane.¡€0€ª€0€ €CDD¡€ €ïÿ¢€0€0€ €‚#pfam10280, Med11, Mediator complex protein. Mediator is a large, modular protein complex that is conserved from yeast to human and conveys regulatory signals from DNA-binding transcription factors to RNA polymerase II. Not only are the polypeptides conserved but the structural organisation is also largely conserved. One or two subunits are either fungal or vertebral specific but Med11 is one of the subunits that is conserved from fungi to humans. Med11 appears to be necessary for the full and successful assembly of the core head sub-region.¡€0€ª€0€ €CDD¡€ €ÈŸ¢€0€0€ €¬pfam10281, Ish1, Putative stress-responsive nuclear envelope protein. This family of proteins found in fungi is a putative stress-responsive nuclear envelope protein Ish1.¡€0€ª€0€ €CDD¡€ €È ¢€0€0€ €‚/pfam10282, Lactonase, Lactonase, 7-bladed beta-propeller. This entry contains bacterial 6-phosphogluconolactonases (6PGL)YbhE-type (EC:3.1.1.31) which hydrolyse 6-phosphogluconolactone to 6-phosphogluconate. The entry also contains the fungal muconate lactonising enzyme carboxy-cis,cis-muconate cyclase (EC:5.5.1.5) and muconate cycloisomerase (EC:5.5.1.1), which convert cis,cis-muconates to muconolactones and vice versa as part of the microbial beta-ketoadipate pathway. Structures of proteins in this family have revealed a 7-bladed beta-propeller fold.¡€0€ª€0€ €CDD¡€ €È¡¢€0€0€ €‚çpfam10283, zf-CCHH, Zinc-finger (CX5CX6HX5H) motif. This domain is a zinc-finger motif that in humans is part of the APLF, aprataxin- and PNK-like forkead association domain-containing protein. The ZnF is highly conserved both in primary sequence and in the spacing between the putative zinc coordinating residues and is configured CX5CX6HX5H. Many of the proteins containing the APLF-like ZnF are involved in DNA strand break repair and/or contain domains implicated in DNA metabolism.¡€0€ª€0€ €CDD¡€ €È¢¢€0€0€ €‚³pfam10284, Luciferase_3H, Luciferase helical bundle domain. This domain is found associated with the the catalytic domain of dinoflagellate luciferase. Luciferase is involved in catalyzing the light emitting reaction in bioluminescence. The structure of this domain has been solved. This domain has a three helix bundle structure that holds four important histidines that are thought to play a role in the pH regulation of the enzyme.¡€0€ª€0€ €CDD¡€ €Ð¢€0€0€ €‚kpfam10285, Luciferase_cat, Luciferase catalytic domain. This domain is the catalytic domain of dinoflagellate luciferase. Luciferase is involved in catalyzing the light emitting reaction in bioluminescence. The structure of this domain has been solved. The core part of the domain is a 10 stranded beta barrel that is structurally similar to lipocalins and FABP.¡€0€ª€0€ €CDD¡€ €Ð¢€0€0€ €‚pfam10287, DUF2401, Putative TOS1-like glycosyl hydrolase (DUF2401). This family of proteins is conserved in fungi. One member is annotated putatively as OPEL, a house-keeping protein, but this could not be confirmed. It contains 5 highly conserved cysteines two of which form a characteristic CGC sequence motif. It has recently been shown that this family is related to known glycosyl hydrolases.¡€0€ª€0€ €CDD¡€ €È£¢€0€0€ €‚’pfam10288, CTU2, Cytoplasmic tRNA 2-thiolation protein 2. CTU2 is a family of proteins necessary for the formation of the wobble nucleoside 5-methoxycarbonylmethyl-2-thiouridine in Saccharomyces cerevisiae. The family is conserved from plants to humans ]1]. It plays a central role in the 2-thiolation of 5-methoxycarbonylmethyl-2-thiouridine, or the wobble nucleoside. This wobble modification in tRNAs, 5-methoxycarbonylmethyl-2-thiouridine (mcm(5)s(2)U), is required for the proper decoding of NNR codons in eukaryotes. The 2-thio group gives rigidity by largely fixing the C3'-endo ribose puckering, ensuring stable and accurate codon-anticodon pairing.¡€0€ª€0€ €CDD¡€ €Ȥ¢€0€0€ €¿pfam10290, DUF2403, Glycine-rich protein domain (DUF2403). This domain is found in the N-terminal region of members of DUF2401 pfam10287. The function of this glycine-rich region is unknown.¡€0€ª€0€ €CDD¡€ €È¥¢€0€0€ €‚pfam10291, muHD, Muniscin C-terminal mu homology domain. The muniscins are a family of endocytic adaptors that is conserved from yeast to humans.This C-terminal domain is structurally similar to mu homology domains, and is the region of the muniscin proteins involved in the interactions with the endocytic adaptor-scaffold proteins Ede1-eps15. This interaction influences muniscin localization. The muniscins provide a combined adaptor-membrane-tubulation activity that is important for regulating endocytosis.¡€0€ª€0€ €CDD¡€ €Ȧ¢€0€0€ €‚pfam10292, 7TM_GPCR_Srab, Serpentine type 7TM GPCR receptor class ab chemoreceptor. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srab is part of the Sra superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The expression pattern of the srab genes is biologically intriguing. Of the six promoters successfully expressed in transgenic organisms, one was exclusively expressed in the tail phasmid neurons, two were exclusively expressed in a head amphid neuron, and two were expressed both in the head and tail neurons as well as a limited number of other cells.¡€0€ª€0€ €CDD¡€ €ȧ¢€0€0€ €špfam10293, DUF2405, Domain of unknown function (DUF2405). This is a conserved region of a family of proteins conserved in fungi. The function is unknown.¡€0€ª€0€ €CDD¡€ €Ȩ¢€0€0€ €‚ pfam10294, Methyltransf_16, Lysine methyltransferase. Methyltrans_16 is a lysine methyltransferase. characterized members of this family are protein methyltransferases targetting Lys residues in specific proteins, including calmodulin, VCP, Kin17 and Hsp70 proteins.¡€0€ª€0€ €CDD¡€ €È©¢€0€0€ €‰pfam10295, DUF2406, Uncharacterized protein (DUF2406). This is a family of small proteins conserved in fungi. The function is not known.¡€0€ª€0€ €CDD¡€ €Ȫ¢€0€0€ €‚ pfam10296, MMM1, Maintenance of mitochondrial morphology protein 1. MMM1 is conserved from plants to humans. MMM1 is an integral ER protein. It is N-glycosylated, and forms a complex with Mdm10, Mdm12and Mdm34 to tether the mitochondria to the endoplasmic reticulum.¡€0€ª€0€ €CDD¡€ €È«¢€0€0€ €‚cpfam10297, Hap4_Hap_bind, Minimal binding motif of Hap4 for binding to Hap2/3/5. In Saccharomyces cerevisiae, the haem-activated protein complex Hap2/3/4/5 plays a major role in the transcription of genes involved in respiration. Hap4_Hap_bind is the essential domain of Hap4 which allows it to associate with Hap2, Hap3 and Hap5 to form the Hap complex.¡€0€ª€0€ €CDD¡€ €b<¢€0€0€ €‚pfam10298, WhiA_N, WhiA N-terminal LAGLIDADG-like domain. This domain is found at the N terminal of sporulation factor WhiA. This domain is related to the LAGLIDADG Homing endonuclease domain while the C terminal domain of WhiA is predicted to be a DNA binding helix-turn-helix domain.¡€0€ª€0€ €CDD¡€ €Ȭ¢€0€0€ €Èpfam10300, DUF3808, Protein of unknown function (DUF3808). This is a family of proteins conserved from fungi to humans. Members of this family also carry a TPR_2 domain pfam07719 at their C-terminus.¡€0€ª€0€ €CDD¡€ €b>¢€0€0€ €«pfam10302, DUF2407, DUF2407 ubiquitin-like domain. This is a family of proteins found in fungi. The function is not known. This domain is related to the ubiquitin domain.¡€0€ª€0€ €CDD¡€ €È­¢€0€0€ €…pfam10303, DUF2408, Protein of unknown function (DUF2408). This is a family of proteins conserved in fungi. The function is unknown.¡€0€ª€0€ €CDD¡€ €È®¢€0€0€ €‚7pfam10304, RTP1_C2, Required for nuclear transport of RNA pol II C-terminus 2. This domain is found towards the C-terminus of required for the nuclear transport of RNA pol II protein (RTP1). RTP1 is required for the nuclear localization of RNA polymerase II. This family is found in association with pfam10363.¡€0€ª€0€ €CDD¡€ €ȯ¢€0€0€ €åpfam10305, Fmp27_SW, RNA pol II promoter Fmp27 protein domain. Fmp27_SW is a conserved domain of a family of proteins involved in RNA polymerase II transcription initiation. It contains characteristic SW and GKG sequence motifs.¡€0€ª€0€ €CDD¡€ €Ȱ¢€0€0€ €ïpfam10306, FLILHELTA, Hypothetical protein FLILHELTA. This is a family of conserved proteins found in fungi. It contains a characteristic FL(I)LHE(L)TA sequence motif, where the bracketed residues are I, L or V. The function is not known.¡€0€ª€0€ €CDD¡€ €ȱ¢€0€0€ €»pfam10307, DUF2410, Hypothetical protein (DUF2410). This is a family of proteins conserved in fungi. The function is not known.There are two characteristic sequence motifs, GGWW and TGR.¡€0€ª€0€ €CDD¡€ €Ȳ¢€0€0€ €úpfam10309, NCBP3, Nuclear cap-binding protein subunit 3. NCBP3 and NCBP1 form an alternative cap-binding complex in higher eukaryotes. NCBP3 binds mRNA, associates with components of the mRNA processing machinery and contributes to polyA RNA export.¡€0€ª€0€ €CDD¡€ €ȳ¢€0€0€ €Ëpfam10310, Mtc1, Maintenance of telomere capping protein 1. In Saccharomyces cerevisiae, maintenance of telomere capping protein 1 (Mtc1) may interact with ribosomes and is involved in telomere capping.¡€0€ª€0€ €CDD¡€ €È´¢€0€0€ €‚ pfam10311, Ilm1, Increased loss of mitochondrial DNA protein 1. This is a family of proteins of approximately 200 residues that are conserved in fungi. Ilm1 is part of the peroxisome, a complex that is the sole site of beta-oxidation in Saccharomyces cerevisiae and known to be required for optimal growth in the presence of fatty acid. Ilm1 may participate in the control of the C16/C18 ratio since it interacts strongly with Mga2p, a transcription factor that controls expression of Ole1, the sole fatty acyl desaturase in S. cerevisiae responsible for conversion of the saturated fatty acids stearate (C18) and palmitate (C16) to oleate and palmitoleate, respectively.¡€0€ª€0€ €CDD¡€ €ȵ¢€0€0€ €‚åpfam10312, Cactin_mid, Conserved mid region of cactin. This is the conserved middle region of a family of proteins referred to as cactins. The region contains two of three predicted coiled-coil domains. Most members of this family have a CactinC_cactus pfam09732 domain at the C-terminal end. Upstream of Mid_cactin in Drosophila members are a serine-rich region, some non-typical RD motifs and three predicted bipartite nuclear localization signals, none of which are well-conserved. Cactin associates with IkappaB-cactus as one of the intracellular members of the Rel (NF-kappaB) pathway which is conserved in invertebrates and vertebrates. In mammals, this pathway controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo.¡€0€ª€0€ €CDD¡€ €ȶ¢€0€0€ €×pfam10313, DUF2415, Uncharacterized protein domain (DUF2415). This is a short, 30 residue domain, from a family of proteins conserved in fungi. The function is unknown. There is a characteristic DLL sequence motif.¡€0€ª€0€ €CDD¡€ €È·¢€0€0€ €”pfam10315, Aim19, Altered inheritance of mitochondria protein 19. This is a family of conserved proteins found in fungi. The function is not known.¡€0€ª€0€ €CDD¡€ €ȸ¢€0€0€ €‚·pfam10316, 7TM_GPCR_Srbc, Serpentine type 7TM GPCR chemoreceptor Srbc. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srbc is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.¡€0€ª€0€ €CDD¡€ €ȹ¢€0€0€ €‚¯pfam10317, 7TM_GPCR_Srd, Serpentine type 7TM GPCR chemoreceptor Srd. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srd is part of the larger Str superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.¡€0€ª€0€ €CDD¡€ €Ⱥ¢€0€0€ €‚¨pfam10318, 7TM_GPCR_Srh, Serpentine type 7TM GPCR chemoreceptor Srh. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srh is part of the Str superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.¡€0€ª€0€ €CDD¡€ €È»¢€0€0€ €‚,pfam10319, 7TM_GPCR_Srj, Serpentine type 7TM GPCR chemoreceptor Srj. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srj is part of the Str superfamily of chemoreceptors. The srj family is designated as the out-group based on its location in preliminary phylogenetic analyses of the entire superfamily. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.¡€0€ª€0€ €CDD¡€ €ȼ¢€0€0€ €‚·pfam10320, 7TM_GPCR_Srsx, Serpentine type 7TM GPCR chemoreceptor Srsx. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.¡€0€ª€0€ €CDD¡€ €矢€0€0€ €‚¬pfam10321, 7TM_GPCR_Srt, Serpentine type 7TM GPCR chemoreceptor Srt. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srt is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.¡€0€ª€0€ €CDD¡€ €Ƚ¢€0€0€ €‚¬pfam10322, 7TM_GPCR_Sru, Serpentine type 7TM GPCR chemoreceptor Sru. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Sru is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.¡€0€ª€0€ €CDD¡€ €Ⱦ¢€0€0€ €‚¬pfam10323, 7TM_GPCR_Srv, Serpentine type 7TM GPCR chemoreceptor Srv. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srv is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.¡€0€ª€0€ €CDD¡€ €È¿¢€0€0€ €‚"pfam10324, 7TM_GPCR_Srw, Serpentine type 7TM GPCR chemoreceptor Srw. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz.¡€0€ª€0€ €CDD¡€ €ÈÀ¢€0€0€ €‚pfam10325, 7TM_GPCR_Srz, Serpentine type 7TM GPCR chemoreceptor Srz. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srz is a solo families amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srz appear to be under strong adaptive evolutionary pressure.¡€0€ª€0€ €CDD¡€ €ÈÁ¢€0€0€ €‚Npfam10326, 7TM_GPCR_Str, Serpentine type 7TM GPCR chemoreceptor Str. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Str is a member of the Str superfamily of chemoreceptors. Almost a quarter (22.5%) of str and srj family genes and pseudogenes in C. elegans appear to have been newly formed by gene duplications since the species split. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.¡€0€ª€0€ €CDD¡€ €È¢€0€0€ €‚¨pfam10327, 7TM_GPCR_Sri, Serpentine type 7TM GPCR chemoreceptor Sri. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Sri is part of the Str superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.¡€0€ª€0€ €CDD¡€ €Èâ€0€0€ €‚¨pfam10328, 7TM_GPCR_Srx, Serpentine type 7TM GPCR chemoreceptor Srx. Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.¡€0€ª€0€ €CDD¡€ €ÈÄ¢€0€0€ €øpfam10329, DUF2417, Region of unknown function (DUF2417). This is a region of a family of proteins conserved in fungi some of whose members also have the Abhydrolase_1, pfam00561, domain in their sequence. The function of this region is not known.¡€0€ª€0€ €CDD¡€ €ÈÅ¢€0€0€ €‚pfam10330, Stb3, Putative Sin3 binding protein. This is a family of the conserved N-terminal end of a group of proteins conserved in fungi. It is likely to be a Sin3 binding protein. Sin3p does not bind DNA directly even though the yeast SIN3 gene functions as a transcriptional repressor. Sin3p is part of a large multiprotein complex. Stb3 appears to bind directly to ribosomal RNA Processing Elements (RRPE) although there are no obvious domains which would accord with this, implying that Stb3 may be a novel RNA-binding protein.¡€0€ª€0€ €CDD¡€ €ÈÆ¢€0€0€ €Ýpfam10332, DUF2418, Protein of unknown function (DUF2418). This is a conserved 100 residue central region of a family of proteins found in fungi. It carries a characteristic EYD sequence motif. The function is not known.¡€0€ª€0€ €CDD¡€ €ÈÇ¢€0€0€ €‚¤pfam10333, Pga1, GPI-Mannosyltransferase II co-activator. Pga1 is found only in yeasts and not in mammals. It localizes in the ER as a glycosylated integral membrane protein. It binds to the GPI-mannosyltransferase II subunit of the GPI and it is responsible for the second mannose addition to GPI precursors. The GPI-anchoring complex is a glycolipid that functions as a membrane anchor for many cell-surface proteins.¡€0€ª€0€ €CDD¡€ €bZ¢€0€0€ €‡pfam10334, ArAE_2, Aromatic acid exporter family member 2. This is a family of proteins conserved in fungi. The function is not known.¡€0€ª€0€ €CDD¡€ €ÈÈ¢€0€0€ €êpfam10335, DUF294_C, Putative nucleotidyltransferase substrate binding domain. This domain is found associated with presumed nucleotidyltransferase domains and seems to be distantly related to other helical substrate binding domains.¡€0€ª€0€ €CDD¡€ €ÈÉ¢€0€0€ €‡pfam10336, DUF2420, Protein of unknown function (DUF2420). This is a family of proteins conserved in fungi. The function is not known.¡€0€ª€0€ €CDD¡€ €ÈÊ¢€0€0€ €‚^pfam10337, ArAE_2_N, Putative ER transporter, 6TM, N-terminal. This is a family of proteins conserved in fungi. The function is not known. This family is the C-terminal half of some member proteins which contain the DUF2421 pfam10334 domain at their N-terminus. These proteins are putative endoplasmic reticulum tranpsorters, with a total of 12 TMs.¡€0€ª€0€ €CDD¡€ €ÈË¢€0€0€ €‡pfam10338, DUF2423, Protein of unknown function (DUF2423). This is a family of proteins conserved in fungi. The function is not known.¡€0€ª€0€ €CDD¡€ €ÈÌ¢€0€0€ €‚«pfam10339, Vel1p, Yeast-specific zinc responsive. This is a small family of proteins from Saccharomyces and related species. The function is not known but member proteins are highly induced in zinc-depleted conditions and have increased expression in NAP1-deletion mutants. The S. cerevisiae genes are named VEL by association with Velum formation in the wine making process http://www.ajevonline.org/content/48/1/55.abstract.¡€0€ª€0€ €CDD¡€ €b`¢€0€0€ €‚Spfam10340, Say1_Mug180, Steryl acetyl hydrolase. This entry includes budding yeast steryl acetyl hydrolase 1 (Say1) and fission yeast Mug180. Say1 is a a membrane-anchored deacetylase required for the deacetylation of acetylated sterols. It is involved in the resistance to eugenol and pregnenolone toxicity. Mug180 has a role in meiosis.¡€0€ª€0€ €CDD¡€ €ÈÍ¢€0€0€ €‚Ápfam10341, TPP1, Shelterin complex subunit, TPP1/ACD. TPP1 is a component of the telomerase holoenzyme, involved in telomere replication. It has been demonstrated that TPP1 dimerizes and binds to DNA and RNA. Furthermore, TPP1 stimulates the dissociation of RNA/DNA hetero-duplexes. Yeast telomerase protein TPP1 (Est3 in yeast) is a novel type of GTPase. The key residues in yeast EST3 are an Asp at residue 86 and the Arg at residue 110. The Asp is totally conserved in the family, whereas the Arg is not so well conserved. The N-terminal of TPP1 is likely to be the binding surface for TINF2, whereas the C-terminus probably binds to POT1, thereby tethering POT1 to the shelterin complex. The complex bound to telomeric DNA increases the activity and processivity of the human telomerase core enzyme, thus helping to maintain the length of the telomeres. This domain is conserved from fungi to mammals, hence family Telomere_Pot1 has been merged into the family. The human shelterin complex includes six proteins: telomere repeat binding factor 1 (TRF1), TRF2, repressor/activator protein 1 (RAP1), TRF1-interacting nuclear protein 2 (TIN2), TIN2-interacting protein 1 (TPP1) and protection of telomeres 1 (POT1).¡€0€ª€0€ €CDD¡€ €È΢€0€0€ €‚ pfam10342, GPI-anchored, Ser-Thr-rich glycosyl-phosphatidyl-inositol-anchored membrane family. Some members of this family appear to be serine- threonine-rich membrane-anchored proteins, anchored by glycosyl-phosphatidylinositol. In A. fumigatus these proteins play a role in fungal cell wall organisation. In Lentinula edodes this family is involved in fruiting body formation, and may have a more general role in signalling in other organisms as it interacts with MAPK. The family is also found in archaea and bacteria.¡€0€ª€0€ €CDD¡€ €ÈÏ¢€0€0€ €‚Spfam10343, Q_salvage, Potential Queuosine, Q, salvage protein family. Q_salvage proteins occur in most Eukarya as well as in a few bacteria possible via horizontal gene-transfer. Queuosine (Q) is a chemical modification found at the wobble position of tRNAs that have GUN anticodons. Most bacteria synthesize queuosine de novo, whereas eukaryotes rely solely on salvaging this essential component from the environment or the gut flora. The exact enzymatic function of the domain has yet to be determined, but structural similarity with DNA glycosidases suggests a ribonucleoside hydrolase role.¡€0€ª€0€ €CDD¡€ €ÈТ€0€0€ €‚¿pfam10344, Fmp27, Mitochondrial protein from FMP27. This family contains mitochondrial FMP27 proteins which in yeasts together with SEN1 are long genes that exist in a looped conformation, effectively bringing together their promoter and terminator regions. Pol-II is located at both ends of FMP27 when this gene is transcribed from a GAL1 promoter under induced and non-induced conditions. The exact function of the Fmp27 protein is not certain.¡€0€ª€0€ €CDD¡€ €ÈÑ¢€0€0€ €‚pfam10345, Cohesin_load, Cohesin loading factor. Cohesin_load is a common cohesin loading factor protein that is conserved in fungi. It is associated with the cohesin complex and is required in G1 for cohesin binding to chromosomes but dispensable in G2 when cohesion has been established. It is referred to as both Ssl3, in pombe, and Scc4, in S.cerevisiae. It complexes with Mis4.¡€0€ª€0€ €CDD¡€ €ÈÒ¢€0€0€ €‚ pfam10346, Con-6, Conidiation protein 6. Con-6 is the conserved N-terminal region of a family of small proteins found in fungi. It is expressed at approximately 6 hours after the induction of development and is induced just prior to major constriction-chain growth.¡€0€ª€0€ €CDD¡€ €ÈÓ¢€0€0€ €‚;pfam10347, Fmp27_GFWDK, RNA pol II promoter Fmp27 protein domain. Fmp27_GFWDK is a conserved domain of a family of proteins involved in RNA polymerase II transcription initiation. It contains characteristic GFWDK sequence motifs. Some members are associated with domain Fmp27_SW (pfam10305) towards the N terminus.¡€0€ª€0€ €CDD¡€ €ÈÔ¢€0€0€ €ìpfam10348, DUF2427, Domain of unknown function (DUF2427). This is the N-terminal region of a family of proteins conserved in fungi. Several members are annotated as being Ftp1 but this could not be confirmed. The function is not known.¡€0€ª€0€ €CDD¡€ €ÈÕ¢€0€0€ €‚Ypfam10350, DUF2428, Putative death-receptor fusion protein (DUF2428). This is a family of proteins conserved from plants to humans. The function is not known. Several members have been annotated as being HEAT repeat-containing proteins while others are designated as death-receptor interacting proteins, but neither of these could be confirmed.¡€0€ª€0€ €CDD¡€ €ÈÖ¢€0€0€ €‚˜pfam10351, Apt1, Golgi-body localization protein domain. This is the C-terminus of a family of proteins conserved from plants to humans. The plant members are localized to the Golgi proteins and appear to regulate membrane trafficking, as they are required for rapid vesicle accumulation at the tip of the pollen tube. The C-terminus probably contains the Golgi localization signal and it is well-conserved.¡€0€ª€0€ €CDD¡€ €È×¢€0€0€ €špfam10353, DUF2430, Protein of unknown function (DUF2430). This is a family of short, 111 residue, proteins found in S. pombe. The function is not known.¡€0€ª€0€ €CDD¡€ €ÐZ¢€0€0€ €¨pfam10354, DUF2431, Domain of unknown function (DUF2431). This is the N-terminal domain of a family of proteins found from plants to humans. The function is not known.¡€0€ª€0€ €CDD¡€ €ÈØ¢€0€0€ €‚hpfam10355, Ytp1, Protein of unknown function (Ytp1). This is a family of proteins found in fungi. The region appears to contain regions similar to mitochondrial electron transport proteins. The C-terminal domain is hydrophobic and negatively charged. There are consensus sites for both N-linked glycosylation and cAMP-dependent protein kinase phosphorylation.¡€0€ª€0€ €CDD¡€ €ÈÙ¢€0€0€ €{pfam10356, DUF2034, Protein of unknown function (DUF2034). This protein is expressed in fungi but its function is unknown.¡€0€ª€0€ €CDD¡€ €bo¢€0€0€ €‚¯pfam10357, Kin17_mid, Domain of Kin17 curved DNA-binding protein. Kin17_mid is the conserved central 169 residue region of a family of Kin17 proteins. Towards the N-terminal end there is a zinc-finger domain, and in human and mouse members there is a RecA-like domain further downstream. The Kin17 protein in humans forms intra-nuclear foci during cell proliferation and is re-distributed in the nucleoplasm during the cell cycle.¡€0€ª€0€ €CDD¡€ €ÈÚ¢€0€0€ €‚pfam10358, NT-C2, N-terminal C2 in EEIG1 and EHBP1 proteins. This version of the C2 domain was initally identified in the vertebrate estrogen early-induced gene 1 (EEIG1), and its Drosophila ortholog required for uptake of dsRNA via the endocytotic machinery to induce RNAi silencing. It is also in C.elegans ortholog Sym-3 (SYnthetic lethal with Mec-3) and the mammalian protein EHBP1 (EH domain Binding Protein-1) that regulates endocytotic recycling and two plant proteins, RPG that regulates Rhizobium-directed polar growth and PMI1 (Plastid Movement Impaired 1) that is essential for intracellular movement of chloroplasts in response to blue light.¡€0€ª€0€ €CDD¡€ €ÈÛ¢€0€0€ €‚6pfam10359, Fmp27_WPPW, RNA pol II promoter Fmp27 protein domain. Fmp27_WPPW is a conserved domain of a family of proteins involved in RNA polymerase II transcription initiation. It contains characteristic HQR and WPPW sequence motifs. and is towards the C-terminal in members which contain Fmp27_SW pfam10305.¡€0€ª€0€ €CDD¡€ €ÈÜ¢€0€0€ €¥pfam10360, DUF2433, Protein of unknown function (DUF2433). This is a conserved 120 residue region of a family of proteins found in fungi. The function is not known.¡€0€ª€0€ €CDD¡€ €ÈÝ¢€0€0€ €‡pfam10361, DUF2434, Protein of unknown function (DUF2434). This is a family of proteins conserved in fungi. The function is not known.¡€0€ª€0€ €CDD¡€ €ÈÞ¢€0€0€ €‚7pfam10363, RTP1_C1, Required for nuclear transport of RNA pol II C-terminus 1. This domain is found towards the C-terminus of required for the nuclear transport of RNA pol II protein (RTP1). RTP1 is required for the nuclear localization of RNA polymerase II. This family is found in association with pfam10304.¡€0€ª€0€ €CDD¡€ €Èߢ€0€0€ €‚pfam10364, NKWYS, Putative capsular polysaccharide synthesis protein. Found only in Vibrio species, pombe and one other fungi, this is a the N-terminal 150 residues of a family of proteins of unknown function. There is a characteristic NKWYS sequence motif.¡€0€ª€0€ €CDD¡€ €bv¢€0€0€ €„pfam10365, DUF2436, Domain of unknown function (DUF2436). This domain is found on peptidase C25 proteins and has no known function.¡€0€ª€0€ €CDD¡€ €bw¢€0€0€ €‚­pfam10366, Vps39_1, Vacuolar sorting protein 39 domain 1. This domain is found on the vacuolar sorting protein Vps39 which is a component of the C-Vps complex. Vps39 is thought to be required for the fusion of endosomes and other types of transport intermediates with the vacuole. In Saccharomyces cerevisiae, Vps39 has been shown to stimulate nucleotide exchange. The precise function of this domain has not been characterized.¡€0€ª€0€ €CDD¡€ €Èࢀ0€0€ €‚Ìpfam10367, Vps39_2, Vacuolar sorting protein 39 domain 2. This domain is found on the vacuolar sorting protein Vps39 which is a component of the C-Vps complex. Vps39 is thought to be required for the fusion of endosomes and other types of transport intermediates with the vacuole. In Saccharomyces cerevisiae, Vps39 has been shown to stimulate nucleotide exchange. This domain is involved in localization and in mediating the interactions of Vps39 with Vps11.¡€0€ª€0€ €CDD¡€ €Èᢀ0€0€ €‚?pfam10368, YkyA, Putative cell-wall binding lipoprotein. YkyA is a family of proteins containing a lipoprotein signal and a hydrolase domain. It is similar to cell wall binding proteins and might also be recognisable by a host immune defense system. It is thus likely to belong to pathways important for pathogenicity.¡€0€ª€0€ €CDD¡€ €È⢀0€0€ €‚Œpfam10369, ALS_ss_C, Small subunit of acetolactate synthase. ALS_ss_C is the C-terminal half of a family of proteins which are the small subunits of acetolactate synthase. Acetolactate synthase is a tetrameric enzyme, containing probably two large and two small subunits, which catalyzes the first step in branched-chain amino acid biosynthesis. This reaction is sensitive to certain herbicides.¡€0€ª€0€ €CDD¡€ €È㢀0€0€ €‚"pfam10370, DUF2437, Domain of unknown function (DUF2437). This is the N-terminal 50 amino acids of a group of bacterial proteins annotated as fumarylacetoacetate hydrolase-containing enzymes. In most cases members are associated with FAA_hydrolase pfam01557 further towards the C-terminus.¡€0€ª€0€ €CDD¡€ €È䢀0€0€ €‚fpfam10371, EKR, Domain of unknown function. EKR is a short, 33 residue, domain found in bacterial and some lower eukaryotic species which lies between a POR (pyruvate ferredoxin/flavodoxin oxidoreductase) pfam01558 and the 4Fe-4S binding domain Fer4 pfam00037. It contains a characteristic EKR sequence motif. The exact function of this domain is not known.¡€0€ª€0€ €CDD¡€ €È墀0€0€ €ñpfam10372, YojJ, Bacterial membrane-spanning protein N-terminus. YojJ is the N-terminus of a family of bacterial proteins some of which are associated with DUF147 pfam02457 towards the C-terminus. It is a putative membrane-spanning protein.¡€0€ª€0€ €CDD¡€ €È梀0€0€ €»pfam10373, EST1_DNA_bind, Est1 DNA/RNA binding domain. Est1 is a protein which recruits or activates telomerase at the site of polymerization. This is the DNA/RNA binding domain of EST1.¡€0€ª€0€ €CDD¡€ €È碀0€0€ €ºpfam10374, EST1, Telomerase activating protein Est1. Est1 is a protein which recruits or activates telomerase at the site of polymerization. Structurally it resembles a TPR-like repeat.¡€0€ª€0€ €CDD¡€ €È袀0€0€ €‚äpfam10375, GRAB, GRIP-related Arf-binding domain. The GRAB (GRIP-related Arf-binding) domain is towards the C-terminus of Rud3 type proteins. This domain is related to the GRIP domain, but the conserved tyrosine residue found at position 4 in all GRIP domains is replaced by a leucine residue. The Arf small GTPase is localized to the cis-Golgi where it recruits proteins via their GRAB domain, as part of the transport of cargo from the endoplasmic reticulum to the plasma membrane.¡€0€ª€0€ €CDD¡€ €È颀0€0€ €‚ïpfam10376, Mei5, Double-strand recombination repair protein. Mei5 is one of a pair of meiosis-specific proteins which facilitate the loading of Dmc1 on to Rad51 on DNA at double-strand breaks during recombination. Recombination is carried out by a large protein complex based around the two RecA homologs, Rad51 and Dmc1. This complex may play both a catalytic and a structural role in the interaction between homologous chromosomes during meiosis. Mei5 is seen to contain a coiled-coli region.¡€0€ª€0€ €CDD¡€ €Èꢀ0€0€ €‚Ppfam10377, ATG11, Autophagy-related protein 11. The function of this family is conflicting. In the fission yeast, Schizosaccharomyces pombe, this protein has been shown to interact with the telomere cap complex. However, in budding yeast, Saccharomyces cerevisiae, this protein is called ATG11 and is shown to be involved in autophagy.¡€0€ª€0€ €CDD¡€ €È뢀0€0€ €‚pfam10378, RRM, Putative RRM domain. This is a putative RRM, RNA-binding, domain found only in fungi. It occurs in proteins annotated as Nrd1 yeast proteins, which are known to carry RRM domains. It is not homologous with any of the other RRM domains, eg RRM_1 pfam00076.¡€0€ª€0€ €CDD¡€ €È좀0€0€ €ƒpfam10379, nec1, Virulence protein nec1. This is a family of virulence proteins that are found in pathogenic Streptomyces species.¡€0€ª€0€ €CDD¡€ €b…¢€0€0€ €¬pfam10380, CRF1, Transcription factor CRF1. CRF1 is a transcription factor that co-represses ribosomal genes with FHL1 via the TOR signalling pathway and protein kinase A.¡€0€ª€0€ €CDD¡€ €Èí¢€0€0€ €‚spfam10381, Autophagy_C, Autophagocytosis associated protein C-terminal. Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the vacuole. The small C-terminal domain is likely to be a distinct binding region for the stability of the autophagosome complex. It carries a highly characteristic conserved FLKF sequence motif.¡€0€ª€0€ €CDD¡€ €È0€0€ €àpfam10382, DUF2439, Protein of unknown function (DUF2439). Proteins in this family have been implicated in telomere maintenance in Saccharomyces cerevisiae and in meiotic chromosome segregation in Schizosaccharomyces pombe.¡€0€ª€0€ €CDD¡€ €È0€0€ €‚pfam10383, Clr2, Transcription-silencing protein Clr2. Clr2 is a chromatin silencing protein, one of a quartet of proteins forming the core of SHREC, a multienzyme effector complex that mediates hetero-chromatic transcriptional gene silencing in fission yeast. Clr2 does not have any obvious well-conserved domains but, along with the other core proteins, binds to the histone deacetylase Clr3, and on its own might also have a role in chromatin organisation at the cnt domain, the site of kinetochore assembly.¡€0€ª€0€ €CDD¡€ €Èð¢€0€0€ €‚pfam10384, Scm3, Centromere protein Scm3. Scm3 is a centromere protein that has been shown in Saccharomyces cerevisiae to be required for G2/M progression and Cse4 localization. The C terminal region of Scm3 proteins is variable in size and sometimes consists of DNA binding motifs.¡€0€ª€0€ €CDD¡€ €Èñ¢€0€0€ €‚àpfam10385, RNA_pol_Rpb2_45, RNA polymerase beta subunit external 1 domain. RNA polymerases catalyze the DNA-dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared with three in eukaryotes (not including mitochondrial or chloroplast polymerases). This domain in prokaryotes spans the gap between domains 4 and 5 of the yeast protein. It is also known as the external 1 region of the polymerase and is bound in association with the external 2 region.¡€0€ª€0€ €CDD¡€ €Èò¢€0€0€ €Çpfam10386, DUF2441, Protein of unknown function (DUF2441). This is a family of highly conserved, predicted, proteins from Bacillus species. The structure forms a homo-dimer. The function is unknown.¡€0€ª€0€ €CDD¡€ €bŒ¢€0€0€ €‚ pfam10387, DUF2442, Protein of unknown function (DUF2442). This family of bacterial and fungal proteins has several members annotated as being putative molybdopterin-guanine dinucleotide biosynthesis protein A; however this could not be verified. Hence the function is not known. This family also includes the DUF3532 that was found to be related and was merged into this family. Members of this family also fall into the NE0471 N-terminal domain-like superfamily, a family of proteins with a unique fold in SCOP:143880.¡€0€ª€0€ €CDD¡€ €Èó¢€0€0€ €‚¬pfam10388, YkuI_C, EAL-domain associated signalling protein domain. In Bacillus species this highly conserved region of the YkuI protein lies immediately downstream of the EAL (diguanylate cyclase/phosphodiesterase domain 2) pfam00563 domain so that together they form a monomer which dimerizes for its enzymatic action. The region contains three alpha helices and five beta strands and is the C-terminal half of the structure.¡€0€ª€0€ €CDD¡€ €Èô¢€0€0€ €‚pfam10389, CoatB, Bacteriophage coat protein B. CoatB is a single filamentous bacteriophage alpha helix of approximately 44 residues. It is likely to assemble into a complex of 35 monomers in a Catherine-wheel like formation. It is the major coat protein of the virion.¡€0€ª€0€ €CDD¡€ €Èõ¢€0€0€ €‚ãpfam10390, ELL, RNA polymerase II elongation factor ELL. ELL is a family of RNA polymerase II elongation factors. It is bound stably to elongation-associated factors 1 and 2, EAFs, and together these act as a strong regulator of transcription activity. by direct interaction with Pol II. ELL binds to pol II on its own but the affinity is greatly increased by the cooperation of EAF. Some members carry an Occludin domain pfam07303 just downstream. There is no S. cerevisiae member.¡€0€ª€0€ €CDD¡€ €Èö¢€0€0€ €‚¦pfam10391, DNA_pol_lambd_f, Fingers domain of DNA polymerase lambda. DNA polymerases catalyze the addition of dNMPs onto the 3-prime ends of DNA chains. There is a general polymerase fold consisting of three subdomains that have been likened to the fingers, palm, and thumb of a right hand. DNA_pol_lambd_f is the central three-helical region of DNA polymerase lambda referred to as the F and G helices of the fingers domain. Contacts with DNA involve this conserved helix-hairpin-helix motif in the fingers region which interacts with the primer strand. This motif is common to several DNA binding proteins and confers a sequence-independent interaction with the DNA backbone.¡€0€ª€0€ €CDD¡€ €È÷¢€0€0€ €‚Xpfam10392, COG5, Golgi transport complex subunit 5. The COG complex, the peripheral membrane oligomeric protein complex involved in intra-Golgi protein trafficking, consists of eight subunits arranged in two lobes bridged by Cog1. Cog5 is in the smaller, B lobe, bound in with Cog6-8, and is itself bound to Cog1 as well as, strongly, to Cog7.¡€0€ª€0€ €CDD¡€ €ð6¢€0€0€ €‚èpfam10393, Matrilin_ccoil, Trimeric coiled-coil oligomerization domain of matrilin. This short domain is a coiled coil structure and has a single cysteine residue at the start which is likely to form a di-sulfide bridge with a corresponding cysteine in an upstream EGF (pfam00008) domain thereby spanning a VWA (pfam00092) domain. All three domains can be associated together as in the cartilage matrix protein matrilin, where this domain is likely to be responsible for oligomerization.¡€0€ª€0€ €CDD¡€ €Èø¢€0€0€ €‚“pfam10394, Hat1_N, Histone acetyl transferase HAT1 N-terminus. This domain is the N-terminal half of the structure of histone acetyl transferase HAT1. It is often found in association with the C-terminal part of the GNAT Acetyltransf_1 (pfam00583) domain. It seems to be motifs C and D of the structure. Histone acetyltransferases (HATs) catalyze the transfer of an acetyl group from acetyl-CoA to the lysine E-amino groups on the N-terminal tails of histones. HATs are involved in transcription since histones tend to be hyper-acetylated in actively transcribed regions of chromatin, whereas in transcriptionally silent regions histones are hypo-acetylated.¡€0€ª€0€ €CDD¡€ €Èù¢€0€0€ €‚pfam10395, Utp8, Utp8 family. Utp8 is an essential component of the nuclear tRNA export machinery in Saccharomyces cerevisiae. It is a tRNA binding protein that acts at a step between tRNA maturation /aminoacylation, and translocation of the tRNA across the nuclear pore complex.¡€0€ª€0€ €CDD¡€ €Èú¢€0€0€ €‚˜pfam10396, TrmE_N, GTP-binding protein TrmE N-terminus. This family represents the shorter, B, chain of the homo-dimeric structure which is a guanine nucleotide-binding protein that binds and hydrolyses GTP. TrmE is homologous to the tetrahydrofolate-binding domain of N,N-dimethylglycine oxidase and indeed binds formyl-tetrahydrofolate. TrmE actively participates in the formylation reaction of uridine and regulates the ensuing hydrogenation reaction of a Schiff's base intermediate. This B chain is the N-terminal portion of the protein consisting of five beta-strands and three alpha helices and is necessary for mediating dimer formation within the protein.¡€0€ª€0€ €CDD¡€ €Èû¢€0€0€ €‚ƒpfam10397, ADSL_C, Adenylosuccinate lyase C-terminus. This is the C-terminal seven alpha helices of the structure whose full length represents the enzyme adenylosuccinate lyase. This sequence lies C-terminal to the conserved motif necessary for beta-elimination reactions, Adenylosuccinate lyase catalyzes two steps in the synthesis of purine nucleotides: the conversion of succinylaminoimidazole-carboxamide ribotide into aminoimidazole-carboxamide ribotide, the eighth step of the de novo pathway, and the formation of adenosine monophosphate (AMP) from adenylosuccinate, the second step in the conversion of inosine monophosphate into AMP.¡€0€ª€0€ €CDD¡€ €Èü¢€0€0€ €épfam10398, DUF2443, Protein of unknown function (DUF2443). This is a small family of highly conserved proteins from bacteria, in particular Helicobacter species, The structure is a bundle of alpha helices. The function is not known.¡€0€ª€0€ €CDD¡€ €Èý¢€0€0€ €‚™pfam10399, UCR_Fe-S_N, Ubiquitinol-cytochrome C reductase Fe-S subunit TAT signal. This is the N-terminal region of the E or R chain, Ubiquitinol-cytochrome C reductase Fe-S subunit, of the hetero-hexameric cytochrome bc1 complex. This region is a TAT-signal region. The cytochrome bc1 complex is an oligomeric membrane protein complex that is a component of respiratory and photosynthetic electron transfer chains. The enzyme couples the transfer of electrons from ubiquinol to cytochrome c with the the generation of a protein gradient across the membrane. The motif is also associated with Rieske (pfam00355), UCR_TM (pfam02921) and Ubiq-Cytc-red_N (pfam09165).¡€0€ª€0€ €CDD¡€ €Èþ¢€0€0€ €ópfam10400, Vir_act_alpha_C, Virulence activator alpha C-term. This structure is homo-dimeric, and the domain here is the C-terminal half of the structure, often associated with PadR upstream, (pfam03551), which is a transcriptional regulator.¡€0€ª€0€ €CDD¡€ €Èÿ¢€0€0€ €‚pfam10401, IRF-3, Interferon-regulatory factor 3. This is the interferon-regulatory factor 3 chain of the hetero-dimeric structure which also contains the shorter chain CREB-binding protein. These two subunits make up the DRAF1 (double-stranded RNA-activated factor 1). Viral dsRNA produced during viral transcription or replication leads to the activation of DRAF1. The DNA-binding specificity of DRAF1 correlates with transcriptional induction of ISG (interferon-alpha,beta-stimulated gene). IRF-3 preexists in the cytoplasm of uninfected cells and translocates to the nucleus following viral infection. Translocation of IRF-3 is accompanied by an increase in serine and threonine phosphorylation, and association with the CREB coactivator occurs only after infection.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €xpfam10403, BHD_1, Rad4 beta-hairpin domain 1. This short domain is found in the Rad4 protein. This domain binds to DNA.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €xpfam10404, BHD_2, Rad4 beta-hairpin domain 2. This short domain is found in the Rad4 protein. This domain binds to DNA.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €xpfam10405, BHD_3, Rad4 beta-hairpin domain 3. This short domain is found in the Rad4 protein. This domain binds to DNA.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚üpfam10406, TAF8_C, Transcription factor TFIID complex subunit 8 C-term. This is the C-terminal, Delta, part of the TAF8 protein. The N-terminal is generally the histone fold domain, Bromo_TP (pfam07524). TAF8 is one of the key subunits of the transcription factor for pol II, TFIID. TAF8 is one of the several general cofactors which are typically involved in gene activation to bring about the communication between gene-specific transcription factors and components of the general transcription machinery.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚ôpfam10407, Cytokin_check_N, Cdc14 phosphatase binding protein N-terminus. Cytokinesis in yeasts involves a family of proteins whose essential function is to bind Cdc14-family phosphatase and prevent this from being sequestered and inhibited in the nucleolus. This is the highly conserved N-terminus of a family of proteins which act as cytokinesis checkpoint controls by allowing cells to cope with cytokinesis defects. These proteins are required for rDNA silencing and mini-chromosome maintenance.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚îpfam10408, Ufd2P_core, Ubiquitin elongating factor core. This is the most conserved part of the core region of Ufd2P ubiquitin elongating factor or E4, running from helix alpha-11 to alpha-38. It consists of 31 helices of variable length connected by loops of variable size forming a compact unit; the helical packing pattern of the compact unit consists of five structural repeats that resemble tandem Armadillo (ARM) repeats. This domain is involved in ubiquitination as it binds Cdc48p and escorts ubiquitinated proteins from Cdc48p to the proteasome for degradation. The core is structurally similar to the nuclear transporter protein importin-alpha. The core is associated with the U-box at the C-terminus, pfam04564, which has ligase activity.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚pfam10409, PTEN_C2, C2 domain of PTEN tumor-suppressor protein. This is the C2 domain-like domain, in greek key form, of the PTEN protein, phosphatidyl-inositol triphosphate phosphatase, and it is the C-terminus. This domain may well include a CBR3 loop which means it plays a central role in membrane binding. This domain associates across an extensive interface with the N-terminal phosphatase domain DSPc (pfam00782) suggesting that the C2 domain productively positions the catalytic part of the protein onto the membrane.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚pfam10410, DnaB_bind, DnaB-helicase binding domain of primase. This domain is the C-terminal region three-helical domain of primase. Primases synthesize short RNA strands on single-stranded DNA templates, thereby generating the hybrid duplexes required for the initiation of synthesis by DNA polymerases. Primases are recruited to single-stranded DNA by helicases, and this domain is the region of the primase which binds DnaB-helicase. It is associated with the Toprim domain (pfam01751) which is the central catalytic core.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚pfam10411, DsbC_N, Disulfide bond isomerase protein N-terminus. This is the N-terminal domain of the disulfide bond isomerase DsbC. The whole molecule is V-shaped, where each arm is a DsbC monomer of two domains linked by a hinge; and the N-termini of each monomer join to form the dimer interface at the base of the V, so are vital for dimerisation. DsbC is required for disulfide bond formation and functions as a disulfide bond isomerase during oxidative protein-folding in bacterial periplasm. It also has chaperone activity.¡€0€ª€0€ €CDD¡€ €É ¢€0€0€ €‚üpfam10412, TrwB_AAD_bind, Type IV secretion-system coupling protein DNA-binding domain. The plasmid conjugative coupling protein TrwB forms hexamers from six structurally very similar protomers. This hexamer contains a central channel running from the cytosolic pole (made up by the AADs) to the membrane pole ending at the transmembrane pore shaped by 12 transmembrane helices, rendering an overall mushroom-like structure. The TrwB_AAD (all-alpha domain) domain appears to be the DNA-binding domain of the structure. TrwB, a basic integral inner-membrane nucleoside-triphosphate-binding protein, is the structural prototype for the type IV secretion system coupling proteins, a family of proteins essential for macromolecular transport between cells and export.¡€0€ª€0€ €CDD¡€ €É ¢€0€0€ €‚«pfam10413, Rhodopsin_N, Amino terminal of the G-protein receptor rhodopsin. Rhodopsin is the archetypal G-protein-coupled receptor. Such receptors participate in virtually all physiological processes, as signalling molecules. They utilize heterotrimeric guanosine triphosphate (GTP)-binding proteins to transduce extracellular signals to intracellular events. Rhodopsin is important because of the pivotal role it plays in visual signal transduction. Rhodopsin is a dimeric transmembrane protein and its intradiskal surface consists of this amino terminal domain and three loops connecting six of the seven transmembrane helices. The N-terminus is a compact domain of alpha-helical regions with breaks and bends at proline residues outside the membrane. The transmembrane part of rhodopsin is represented by 7tm_1 (pfam00001). The N-terminal domain is extracellular is and is necessary for successful dimerisation and molecular stability.¡€0€ª€0€ €CDD¡€ €É ¢€0€0€ €‚Ápfam10414, CysG_dimerizer, Sirohaem synthase dimerisation region. Bacterial sulfur metabolism depends on the iron-containing porphinoid sirohaem. CysG, S-adenosyl-L-methionine (SAM)-dependent bis-methyltransferase, dehydrogenase and ferrochelatase, synthesizes sirohaem from uroporphyrinogen III via reactions which encompass two branchpoint intermediates in tetrapyrrole biosynthesis, diverting flux first from protoporphyrin IX biosynthesis and then from cobalamin (vitamin B12) biosynthesis. CysG is a dimer of two structurally similar protomers held together asymmetrically through a number of salt-bridges across complementary residues in the CysG_dimerizer region to produce a series of active sites, accounting for CysG's multifunctionality, catalyzing four diverse reactions: two SAM-dependent methylations, NAD+-dependent tetrapyrrole dehydrogenation and metal chelation. The CysG_dimerizer region holding the two protomers together is of 74 residues.¡€0€ª€0€ €CDD¡€ €É ¢€0€0€ €‚pfam10415, FumaraseC_C, Fumarase C C-terminus. Fumarase C catalyzes the stereo-specific interconversion of fumarate to L-malate as part of the Kreb's cycle. The full-length protein forms a tetramer with visible globular shape. FumaraseC_C is the C-terminal 65 residues referred to as domain 3. The core of the molecule consists of a bundle of 20 alpha-helices from the five-helix bundle of domain 2. The projections from the core of the tetramer are generated from domains 1 and 3 of each subunit. FumaraseC_C does not appear to be part of either the active site or the activation site but is helical in structure forming a little bundle.¡€0€ª€0€ €CDD¡€ €É ¢€0€0€ €‚ýpfam10416, IBD, Transcription-initiator DNA-binding domain IBD. In Trichomonas vaginalis, thought to be the earliest extant eukaryote, the sole initiator element for control of the start of transcription is Inr, and this is recognized by the initiator binding protein IBP39. IBP39 contains an N-terminal Inr binding domain, IBD, connected via a flexible, proteolytically sensitive, linker (residues 127-145) to a C-terminal domain. The IBD structure reveals a winged-helix-wing conformation with each element binding to DNA, the central helix-turn-helix contributing the majority of the specificity-determining contacts with the Inr core motif TCAPy(T/A). The binding of IBP39 to the Inr directly recruits RNA polymerase II and in this way initiates transcription.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚qpfam10417, 1-cysPrx_C, C-terminal domain of 1-Cys peroxiredoxin. This is the C-terminal domain of 1-Cys peroxiredoxin (1-cysPrx), a member of the peroxiredoxin superfamily which protect cells against membrane oxidation through glutathione (GSH)-dependent reduction of phospholipid hydroperoxides to corresponding alcohols. The C-terminal domain is crucial for providing the extra cysteine necessary for dimerisation of the whole molecule. Loss of the enzyme's peroxidase activity is associated with oxidation of the catalytic cysteine, upstream of this domain; and glutathionylation, presumably through its disruption of protein structure, facilitates access for GSH, resulting in spontaneous reduction of the mixed disulfide to the sulfhydryl and consequent activation of the enzyme. The domain is associated with family AhpC-TSA, pfam00578, which carries the catalytic cysteine.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚0pfam10418, DHODB_Fe-S_bind, Iron-sulfur cluster binding domain of dihydroorotate dehydrogenase B. Lactococcus lactis is one of the few organisms with two dihydroorotate dehydrogenases, DHODs, A and B. The B enzyme is a prototype for DHODs in Gram-positive bacteria that use NAD+ as the second substrate. DHODB is a hetero-tetramer composed of a central homodimer of PyrDB subunits resembling the DHODA structure and two PyrK subunits along with three different cofactors: FMN, FAD, and a [2Fe-2S] cluster. The [2Fe-2S] iron-sulfur cluster binds to this C-terminal domain of the PyrK subunit, which is at the interface between the flavin and NAD binding domains and contains three beta-strands. The four cysteine residues at the N-terminal part of this domain are the ones that bind, in pairs, to the iron-sulfur cluster. The conformation of the whole molecule means that the iron-sulfur cluster is localized in a well-ordered part of this domain close to the FAD binding site. The FAD and and NAD binding domains are FAD_binding_6, pfam00970 and NAD_binding_1, pfam00175.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚ pfam10419, TFIIIC_sub6, TFIIIC subunit. This is a family of proteins subunits of TFIIIC. TFIIIC in yeast and humans is required for transcription of tRNA and 5 S RNA genes by RNA polymerase III. Yeast members of this family are fused to phosphoglycerate mutase domain.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚åpfam10420, IL12p40_C, Cytokine interleukin-12p40 C-terminus. IL12p40_C is the largely beta stranded C-terminal, D3, domain of interleukin-12p40 or interleukin-12B. This interleukin is produced on stimulation by macrophage-engulfed micro-organisms and other stimuli, when it dimerizes with interleukin-12p35 to form a heterodimer which then binds to receptors on natural killer cells to activate them to destroy the micro-organisms. This domain contains two disulfide bridges, one of which serves to bind p40 to p35 and the other to hold the beta strands within the domain together. The cupped shape of the p35 binding interface matches the elbow-like bend between D2 and D3 in p40. The domain is often associated with family fn3, pfam00041.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚Ñpfam10421, OAS1_C, 2'-5'-oligoadenylate synthetase 1, domain 2, C-terminus. This is the largely alpha-helical, C-terminal half of 2'-5'-oligoadenylate synthetase 1, being described as domain 2 of the enzyme and homologous to a tandem ubiquitin repeat. It carries the region of enzymic activity between 320 and 344 at the extreme C-terminal end. Oligoadenylate synthetases are antiviral enzymes that counteract vial attack by degrading viral RNA. The enzyme uses ATP in 2'-specific nucleotidyl transfer reactions to synthesize 2'.5'-oligoadenylates, which activate latent ribonuclease, resulting in degradation of viral RNA and inhibition of virus replication. This domain is often associated with NTP_transf_2 pfam01909.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚pfam10422, LRS4, Monopolin complex subunit LRS4. Monopolin is a protein complex, originally identified in Saccharomyces cerevisiae, that is required for the segregation of homologous centromeres to opposite poles of a dividing cell during meiosis I. The orthologous complex in Schizosaccharomyces pombe is not required for meiosis I chromosome segregation, but is proposed to play a similar physiological role in clamping microtubule binding sites. In S.cerevisiae this subunit is called LRS4, and in S. pombe it is known as Mde4.¡€0€ª€0€ €CDD¡€ €çꢀ0€0€ €‚†pfam10423, AMNp_N, Bacterial AMP nucleoside phosphorylase N-terminus. This is the N-terminal domain of bacterial AMP nucleoside phosphorylase (AMNp). The N- and C-termini form distinct domains which intertwine with each other to form a stable monomer which associates with five other monomers to yield the active hexamer. The N-terminus consists of a long helix and a four-stranded sheet with a novel topology. The C-terminus binds the nucleoside whereas the N-terminus acts as the enzymatic regulatory domain. AMNp (EC:3.2.2.4) catalyzes the hydrolysis of AMP to form adenine and ribose 5-phosphate. thereby regulating intracellular AMP levels.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚çpfam10425, SdrG_C_C, C-terminus of bacterial fibrinogen-binding adhesin. This is the C-terminal half of a bacterial fibrinogen-binding adhesin SdrG. SdrG is a Gram-positive cell-wall-anchored adhesin that allows attachment of the bacterium to host tissues via specific binding to the beta-chain of human fibrinogen (Fg). SdrG binds to its ligand with a dynamic "dock, lock, and latch" mechanism which represents a general mode of ligand-binding for structurally related cell wall-anchored proteins in most Gram-positive bacteria. The C-terminal part of SdrG(276-596) is integral to the folding of the immunoglobulin-like whole to create the docking grooves necessary for Fg binding. The domain is associated with families of Cna_B, pfam05738.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚‘pfam10426, zf-RAG1, Recombination-activating protein 1 zinc-finger domain. This is a C2-H2 zinc-finger domain closely resembling the classical TFIIIA-type zinc-finger, CX3FX5LX2-3H, despite having a valine and a tyrosine at the core instead of a phenylalanine and a leucine, hence CX3VX1LX2YX2H. The structure, nevertheless, contains the characteristic two-stranded beta-sheet and alpha-helix of a classical zinc-finger. The domain binds one zinc and, in complex with the zinc-RING-finger domain, helps to stabilize the whole of the dimerisation region of recombination activating protein 1 (RAG1). The function of the whole is to bind double-stranded DNA.¡€0€ª€0€ €CDD¡€ €b¯¢€0€0€ €¤pfam10427, Ago_hook, Argonaute hook. This region has been called the argonaute hook. It has been shown to bind to the Piwi domain pfam02171 of Argnonaute proteins.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €Œpfam10428, SOG2, RAM signalling pathway protein. SOG2 proteins in Saccharomyces cerevisiae are involved in cell separation and cytokinesis.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚ÿpfam10429, Mtr2, Nuclear pore RNA shuttling protein Mtr2. Mtr2 is a monomeric, dual-action, RNA-shuttle protein found in yeasts. Transport across the nuclear-cytoplasmic membrane is via the macro-molecular membrane-spanning nuclear pore complex, NPC. The pore is lined by a subset of NPC members called nucleoporins that present FG (Phe-Gly) receptors, characteristically GLFG and FXFG motifs, for shuttling RNAs and proteins. RNA cargo is bound to soluble transport proteins (nuclear export factors) such as Mex67 in yeasts, and TAP in metazoa, which pass along the pore by binding to successive FG receptors. Mtr2 when bound to Mex67 maximises this FG-binding. Mtr2 also acts independently of Mex67 in transporting the large ribosomal RNA subunit through the pore.¡€0€ª€0€ €CDD¡€ €b²¢€0€0€ €/pfam10430, Ig_Tie2_1, Tie-2 Ig-like domain 1. ¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚“pfam10431, ClpB_D2-small, C-terminal, D2-small domain, of ClpB protein. This is the C-terminal domain of ClpB protein, referred to as the D2-small domain, and is a mixed alpha-beta structure. Compared with the D1-small domain (included in AAA, pfam00004) it lacks the long coiled-coil insertion, and instead of helix C4 contains a beta-strand (e3) that is part of a three stranded beta-pleated sheet. In Thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface. The D2-small domain is essential for oligomerization, forming a tight interface with the D2-large domain of a neighboring subunit and thereby providing enough binding energy to stabilize the functional assembly. The domain is associated with two Clp_N, pfam02861, at the N-terminus as well as AAA, pfam00004 and AAA_2, pfam07724.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚‰pfam10432, bact-PGI_C, Bacterial phospho-glucose isomerase C-terminal SIS domain. This is the C-terminal SIS domain of a bacterial phospho-glucose isomerase EC:5.3.1.9 protein which is similar to eukaryote homologs to the extent that the sequence includes the cluster of threonines and serines that forms the sugar phosphate-binding site in conventional PGI. This domain contributes a good proportion of the active catalytic site residues. This PGI uses the same catalytic mechanisms for both glucose ring-opening and isomerisation for the interconversion of glucose 6-phosphate to fructose 6-phosphate. It is associated with family SIS, pfam01380.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚"pfam10433, MMS1_N, Mono-functional DNA-alkylating methyl methanesulfonate N-term. MMS1 is a protein that protects against replication-dependent DNA damage in Saccharomyces cerevisiae. MMS1 belongs to the DDB1 family of cullin 4 adaptors and the two proteins are homologous. MMS1 bridges the interaction of MMS22 and Crt10 with Cul8/Rtt101. Cul8/Rtt101 is a cullin protein involved in the regulation of DNA replication subsequent to DNA damage. The N-terminal region of MMS1 and the C-terminal of MMS22 are required for the the MMS1-MMS22 interaction. The human HIV-1 virion-associated protein Vpr assembles with DDB1 through interaction with DCAF1 (chromatin assembly factor) to form an E3 ubiquitin ligase that targets cellular substrates for proteasome-mediated degradation and subsequent G2 arrest.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚6pfam10434, MAM1, Monopolin complex protein MAM1. Monopolin is a protein complex, originally identified in Saccharomyces cerevisiae, that is required for the segregation of homologous centromeres to opposite poles of a dividing cell during meiosis I. MAM1 is required in S. cerevisiae for monopolar attachment.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚kpfam10435, BetaGal_dom2, Beta-galactosidase, domain 2. This is the second domain of the five-domain beta-galactosidase enzyme that altogether catalyzes the hydrolysis of beta(1-3) and beta(1-4) galactosyl bonds in oligosaccharides as well as the inverse reaction of enzymatic condensation and trans-glycosylation. This domain is made up of 16 antiparallel beta-strands and an alpha-helix at its C terminus. The fold of this domain appears to be unique. In addition, the last seven strands of the domain form a subdomain with an immunoglobulin-like (I-type Ig) fold in which the first strand is divided between the two beta-sheets. In penicillin spp this strand is interrupted by a 12-residue insertion which forms an additional edge-strand to the second beta-sheet of the sub-domain. The remainder of the second domain forms a series of beta-hairpins at its N terminus, four strands of which are contiguous with part of the Ig-like sub-domain, forming in total a seven-stranded antiparallel beta-sheet. This domain is associated with family Glyco_hydro_35, pfam01301, which is N-terminal to it, but itself has no metazoan members.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚¬pfam10436, BCDHK_Adom3, Mitochondrial branched-chain alpha-ketoacid dehydrogenase kinase. Catabolism and synthesis of leucine, isoleucine and valine are finely balanced, allowing the body to make the most of dietary input but removing excesses to prevent toxic build-up of their corresponding keto-acids. This is the butyryl-CoA dehydrogenase, subunit A domain 3, a largely alpha-helical bundle of the enzyme BCDHK. This enzyme is the regulator of the dehydrogenase complex that breaks branched-chain amino-acids down, by phosphorylating and thereby inactivating it when synthesis is required. The domain is associated with family HATPase_c pfam02518 which is towards the C-terminal.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚ pfam10437, Lip_prot_lig_C, Bacterial lipoate protein ligase C-terminus. This is the C-terminal domain of a bacterial lipoate protein ligase. There is no conservation between this C-terminus and that of vertebrate lipoate protein ligase C-termini, but both are associated with the domain BPL_LipA_LipB pfam03099, further upstream. This domain is required for adenylation of lipoic acid by lipoate protein ligases. The domain is not required for transfer of lipoic acid from the adenylate to the lipoyl domain. Upon adenylation, this domain rotates 180 degrees away from the active site cleft. Therefore, the domain does not interact with the lipoyl domain during transfer.¡€0€ª€0€ €CDD¡€ €É¢€0€0€ €‚pfam10438, Cyc-maltodext_C, Cyclo-malto-dextrinase C-terminal domain. This domain is at the very C-terminus of cyclo-malto-dextrinase proteins and consists of 8 beta strands, is largely globular and appears to help stabilize the acitve sites created by upstream domains, Cyc-maltodext_N pfam09087, and Alpha-amylase pfam00128. Cyclo-malto-dextrinases hydrolyse cyclodextrans to maltose and glucose and catalyze trans-glycosylation of oligosaccharides to the C3-, C4- or C6-hydroxyl groups of various acceptor sugar molecules.¡€0€ª€0€ €CDD¡€ €É ¢€0€0€ €‚Òpfam10439, Bacteriocin_IIc, Bacteriocin class II with double-glycine leader peptide. This is a family of bacteriocidal bacteriocins secreted by Streptococcal species in order to kill off closely-related competitor Gram-positives. The sequence includes the peptide precursor, this being cleaved off proteolytically at the double-glycine. The family does not carry the YGNGVXC motif characteristic of pediocin-like Bacteriocins, Bacteriocin_II pfam01721. The producer bacteria are protected from the effects of their own bacteriocins by production of a specific immunity protein which is co-transcribed with the genes encoding the bacteriocins, eg family EntA_Immun pfam08951. The bacteriocins are structurally more specific than their immunity-protein counterparts. Typically, production of the bacteriocin gene is from within an operon carrying up to 6 genes including a typical two-component regulatory system (R and H), a small peptide pheromone (C), and a dedicated ABC transporter (A and -B) as well as an immunity protein. The ABC transporter is thought to recognize the N termini of both the pheromone and the bacteriocins and to transport these peptides across the cytoplasmic membrane, concurrent with cleavage at the conserved double-glycine motif. Cleaved extracellular C can then bind to the sensor kinase, H, resulting in activation of R and up-regulation of the entire gene cluster via binding to consensus sequences within each promoter. It seems likely that this whole regulon is carried on a transmissible plasmid which is passed between closely related Firmicute species since many clinical isolates from different Firmicutes can produce at least two bacteriocins. and the same bacteriocins can be produced by different species.¡€0€ª€0€ €CDD¡€ €É!¢€0€0€ €‚·pfam10440, WIYLD, Ubiquitin-binding WIYLD domain. This presumed domain has been predicted to contain three alpha helices. The domain was named the WIYLD domain based on the pattern of most conserved residues. It binds ubiquitin. In the Arabidopsis thaliana histone-lysine N-methyltransferase SUVR4, binding of ubiquitin to this domain stimulates enzymatic activity and converts its activity from a strict dimethylase to a di/trimethylase.¡€0€ª€0€ €CDD¡€ €É"¢€0€0€ €~pfam10441, Urb2, Urb2/Npa2 family. This family includes the Urb2 protein from yeast that are involved in ribosome biogenesis.¡€0€ª€0€ €CDD¡€ €É#¢€0€0€ €‚apfam10442, FIST_C, FIST C domain. The FIST C domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggest that FIST domains bind small ligands, such as amino acids.¡€0€ª€0€ €CDD¡€ €É$¢€0€0€ €‚pfam10443, RNA12, RNA12 protein. This family includes RNA12 from S. cerevisiae. That protein contains an RRM domain. This region is C-terminal to that and includes a P-loop motif suggesting this region binds to NTP. The RNA12 proteins is involved in pre-rRNA maturation.¡€0€ª€0€ €CDD¡€ €É%¢€0€0€ €‚¦pfam10444, Nbl1_Borealin_N, Nbl1 / Borealin N terminal. Nbl1 is a subunit of the conserved CPC, the chromosomal passenger complex, which regulates mitotic chromosome segregation. In Fungi and Animalia, this complex consists of the kinase Aurora B/AIR-2/Ipl1p, INCENP/ICP-1/Sli15p, and Survivin/BIR-1/Bir1p. In Animalia, a fourth subunit (Borealin/Dasra/CSC-1) is required for targeting CPC to centromeres and central spindles. Nbl1 has been shown in budding yeast to be essential for viability, and for CPC localization, stability, integrity, and function. The N terminus of Borealin is homologous to Nbl1. This family contains both Nbl1, and the N terminal region of Borealin.¡€0€ª€0€ €CDD¡€ €É&¢€0€0€ €ipfam10445, DUF2456, Protein of unknown function (DUF2456). This is a family of uncharacterized proteins.¡€0€ª€0€ €CDD¡€ €É'¢€0€0€ €ipfam10446, DUF2457, Protein of unknown function (DUF2457). This is a family of uncharacterized proteins.¡€0€ª€0€ €CDD¡€ €É(¢€0€0€ €‚pfam10447, EXOSC1, Exosome component EXOSC1/CSL4. This family of proteins are components of the exosome 3'->5' exoribonuclease complex. The exosome mediates degradation of unstable mRNAs that contain AU-rich elements (AREs) within their 3' untranslated regions.¡€0€ª€0€ €CDD¡€ €É)¢€0€0€ €‚špfam10448, POC3_POC4, 20S proteasome chaperone assembly proteins 3 and 4. This family contains chaperones of the 20S proteasome which function in early 20S proteasome assembly. The structures of two of the proteins in this family (POC3 and POC4) have been solved, and they closely resemble those of the mammalian proteasome assembling chaperone PAC3, although there is little sequence similarity between them.¡€0€ª€0€ €CDD¡€ €bÅ¢€0€0€ €‡pfam10450, POC1, POC1 chaperone. In yeast, POC1 is a chaperone of the 20S proteasome which functions in early 20S proteasome assembly.¡€0€ª€0€ €CDD¡€ €Mñ¢€0€0€ €‚ypfam10451, Stn1, Telomere regulation protein Stn1. The budding yeast protein Stn1 is a DNA-binding protein which has specificity for telomeric DNA. Structural profiling has predicted an OB-fold. This domain is the N-terminal part of the molecule, which adopts the OB fold. Protection of telomeres by multiple proteins with OB-fold domains is conserved in eukaryotic evolution.¡€0€ª€0€ €CDD¡€ €É*¢€0€0€ €¤pfam10452, TCO89, TORC1 subunit TCO89. TC089 is a component of the TORC1 complex. TORC1 is responsible for a wide range of rapamycin-sensitive cellular activities.¡€0€ª€0€ €CDD¡€ €É+¢€0€0€ €‚²pfam10453, NUFIP1, Nuclear fragile X mental retardation-interacting protein 1 (NUFIP1). Proteins in this family have been implicated in the assembly of the large subunit of the ribosome and in telomere maintenance. Some proteins in this family contain a CCCH zinc finger. This family contains a protein called human fragile X mental retardation-interacting protein 1, which is known to bind RNA and is phosphorylated upon DNA damage.¡€0€ª€0€ €CDD¡€ €É,¢€0€0€ €ipfam10454, DUF2458, Protein of unknown function (DUF2458). This a is family of uncharacterized proteins.¡€0€ª€0€ €CDD¡€ €É-¢€0€0€ €pfam10455, BAR_2, Bin/amphiphysin/Rvs domain for vesicular trafficking. This Pfam entry includes proteins that are not matched by pfam03114.¡€0€ª€0€ €CDD¡€ €è¢€0€0€ €‚Îpfam10456, BAR_3_WASP_bdg, WASP-binding domain of Sorting nexin protein. The C-terminal region of the Sorting nexin group of proteins appears to carry a BAR-like (Bin/amphiphysin/Rvs) domain. This domain is very diverse and the similarities with other BAR domains are few. In the Sorting nexins it is associated with family PX, pfam00787.13, and in combination with PX appears to be necessary to bind WASP along with p85 to form a multimeric signalling complex.¡€0€ª€0€ €CDD¡€ €É.¢€0€0€ €‚gpfam10457, MENTAL, Cholesterol-capturing domain. Human meta-static lymph node (MLN) 64 is a late endosomal membrane protein, and carries this MENTAL (MLN64N-terminal) domain at its N-terminus. The domain is composed of four trans-membrane helices with three short intervening loops. The function of the domain is to capture cholesterol and pass it to the associated START domain pfam01852 for transfer to a cytosolic acceptor protein or membrane. In mammals, the MENTAL domain is involved in the localization of MLN64 and MENTHO in late endosomes, and also in homo-and of hetero-interactions of these two proteins.¡€0€ª€0€ €CDD¡€ €É/¢€0€0€ €†pfam10458, Val_tRNA-synt_C, Valyl tRNA synthetase tRNA binding arm. This domain is found at the C-terminus of Valyl tRNA synthetases.¡€0€ª€0€ €CDD¡€ €É0¢€0€0€ €õpfam10459, Peptidase_S46, Peptidase S46. Dipeptidyl-peptidase 7 (DPP-7) is the best characterized member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.¡€0€ª€0€ €CDD¡€ €É1¢€0€0€ €·pfam10460, Peptidase_M30, Peptidase M30. This family contains the metallopeptidase hyicolysin. Hyicolysin has a zinc ion which is liganded by two histidine and one glutamate residue.¡€0€ª€0€ €CDD¡€ €É2¢€0€0€ €‚Bpfam10461, Peptidase_S68, Peptidase S68. This family of serine peptidases contains PIDD proteins. PIDD forms a complex with RAIDD and procaspase-2 that is known as the 'PIDDosome'. The PIDDosome forms when DNA damage occurs and either activates NF-kappaB, leading to cell survival, or caspase-2, which leads to apoptosis.¡€0€ª€0€ €CDD¡€ €É3¢€0€0€ €Ïpfam10462, Peptidase_M66, Peptidase M66. This family of metallopeptidases contains StcE, a virulence factor found in Shiga toxigenic Escherichia coli organisms. StcE peptidase cleaves C1 esterase inhibitor.¡€0€ª€0€ €CDD¡€ €bТ€0€0€ €‚«pfam10463, Peptidase_U49, Peptidase U49. This family contains Lit peptidase from Escherichia coli. Lit protease functions in bacterial cell death in response to infection by bacteriophage T4. Following binding of Gol peptide to domains II and III of elongation factor Tu, the Lit peptidase cleaves domain I of the elongation factor. This prevents binding of guanine nucleotides, shuts down translation and leads to cell death.¡€0€ª€0€ €CDD¡€ €bÑ¢€0€0€ €‚8pfam10464, Peptidase_U40, Peptidase U40. This family contains P5 murein endopeptidase from bacteriophage phi-6. P5 murein endopeptidase has lytic activity against several gram-negative bacteria. It is thought that the enzyme cleaves the cell wall peptide bridge formed by meso-2,6-diaminopimelic acid and D-Ala.¡€0€ª€0€ €CDD¡€ €bÒ¢€0€0€ €Ãpfam10465, Inhibitor_I24, PinA peptidase inhibitor. PinA inhibits the endopeptidase La. It binds to the La homotetramer but does not interfere with the ATP binding site or the active site of La.¡€0€ª€0€ €CDD¡€ €bÓ¢€0€0€ €‚wpfam10466, Inhibitor_I34, Saccharopepsin inhibitor I34. The saccharopepsin inhibitor is highly specific for the aspartic peptidase saccharopepsin. It is largely unstructured in the absence of saccharopepsin, but in the presence, the inhibitor undergoes a conformation change forming an almost perfect alpha-helix from Asn2 to Met32 in the active site cleft of the peptidase.¡€0€ª€0€ €CDD¡€ €bÔ¢€0€0€ €‚pfam10467, Inhibitor_I48, Peptidase inhibitor clitocypin. Clitocypin binds and inhibits cysteine proteinases. It has no similarity to any other known cysteine proteinase inhibitors but bears some similarity to a lectin-like family of proteins from mushrooms.¡€0€ª€0€ €CDD¡€ €É4¢€0€0€ €opfam10468, Inhibitor_I68, Carboxypeptidase inhibitor I68. This is a family of tick carboxypetidase inhibitors.¡€0€ª€0€ €CDD¡€ €É5¢€0€0€ €‚°pfam10469, AKAP7_NLS, AKAP7 2'5' RNA ligase-like domain. AKAP7_NLS is the N-terminal domain of the cyclic AMP-dependent protein kinase A, PKA, anchor protein AKAP7. This protein anchors PKA for its role in regulating PKA-mediated gene transcription in both somatic cells and oocytes. AKAP7_NLS carries the nuclear localization signal (NLS) KKRKK, that indicates the cellular destiny of this anchor protein. Binding to the regulatory subunits RI and RII of PKA is mediated via the family AKAP7_RIRII_bdg. at the C-terminus. This family represents a region that contains two 2'5' RNA ligase like domains pfam02834. Presumably this domain carried out some as yet unknown enzymatic function.¡€0€ª€0€ €CDD¡€ €É6¢€0€0€ €‚%pfam10470, AKAP7_RIRII_bdg, PKA-RI-RII subunit binding domain of A-kinase anchor protein. AKAP7_RIRII_bdg is the C-terminal domain of the cyclic AMP-dependent protein kinase A, PKA, anchor protein AKAP7. This protein anchors PKA, for its role in regulating PKA-mediated gene transcription in both somatic cells and oocytes, by binding to its regulatory subunits, RI and RII, hence being known as a dual-specific AKAP. The 25 crucial amino acids of RII-binding domains in general form structurally conserved amphipathic helices with unrelated sequences; hydrophobic amino acid residues form the backbone of the interaction and hydrogen bond- and salt-bridge-forming amino acid residues increase the affinity of the interaction. The N-terminus, of family AKAP7_NLS, carries the nuclear localization signal.¡€0€ª€0€ €CDD¡€ €É7¢€0€0€ €‚Cpfam10471, ANAPC_CDC26, Anaphase-promoting complex APC subunit CDC26. The anaphase-promoting complex (APC) or cyclosome is a cell cycle-regulated ubiquitin-protein ligase that regulates important events in mitosis such as the initiation of anaphase and exit from telophase. The APC, in conjunction with other enzymes, assembles multi-ubiquitin chains on a variety of regulatory proteins thereby targeting them for proteolysis by the 26S proteasome. CDC26 is one of the nine or so subunits identified within APC but its exact function is not known. The APC/C becomes active at the metaphase/anaphase transition and remains active during G1 phase. One mechanism linked to activation of the APC/C is phosphorylation. The yeast APC/C is composed of at least 13 subunits, but the function of many of the subunits is unknown. Hcn1 is the smallest subunit of the S. pombe APC/C, and is found to be essential for cell viability, APC/C integrity, and proper APC/C regulation. In addition, Hcn1 phosphorylation indicates a specific role for the phosphorylation of this subunit late in the cell cycle.¡€0€ª€0€ €CDD¡€ €É8¢€0€0€ €‚Mpfam10472, CReP_N, eIF2-alpha phosphatase phosphorylation constitutive repressor. This is the conserved N-terminal domain of CReP, constitutive repressor of eIF2-alpha phosphorylation/protein phosphatase 1, catalytic subunit. It functions in the dephosphorylation of eIF2-alpha under basal conditions in the absence of stress. In response to translation inhibition, there is reduced synthesis of the labile CReP that contributes to elevated levels of eIF2-alpha phosphorylation. The C-terminus, family PP1c, is shared with the apoptosis-associated protein Gadd34 and herpes simplex virus.¡€0€ª€0€ €CDD¡€ €É9¢€0€0€ €‚·pfam10473, CENP-F_leu_zip, Leucine-rich repeats of kinetochore protein Cenp-F/LEK1. Cenp-F, a centromeric kinetochore, microtubule-binding protein consisting of two 1,600-amino acid-long coils, is essential for the full functioning of the mitotic checkpoint pathway. There are several leucine-rich repeats along the sequence of LEK1 that are considered to be zippers, though they do not appear to be binding DNA directly in this instance.¡€0€ª€0€ €CDD¡€ €É:¢€0€0€ €¹pfam10474, DUF2451, Protein of unknown function C-terminus (DUF2451). This protein is found in eukaryotes but its function is not known. The C-terminal part of some members is DUF2450.¡€0€ª€0€ €CDD¡€ €É;¢€0€0€ €‚òpfam10475, Vps54_N, Vacuolar-sorting protein 54, of GARP complex. This is a family of vacuolar-sorting proteins 54, from eukaryotes. Along with VPS52 and VPS53 this forms the Golgi-associated retrograde protein complex GARP. VPS54 is separated into N- and C-terminal regions each of which has a different function. This N-terminal family of is important for GARP complex assembly and stability, whereas the C-terminal domain, pfam07928, brings about localization to an early endocytic compartment.¡€0€ª€0€ €CDD¡€ €É<¢€0€0€ €Ãpfam10476, DUF2448, Protein of unknown function C-terminus (DUF2448). The family DUF2349 is the N-terminal part of this family. This protein is found in eukaryotes but its function is not known.¡€0€ª€0€ €CDD¡€ €É=¢€0€0€ €‚Äpfam10477, EIF4E-T, Nucleocytoplasmic shuttling protein for mRNA cap-binding EIF4E. EIF4E-T is the transporter protein for shuttling the mRNA cap-binding protein EIF4E protein, targeting it for nuclear import. EIF4E-T contains several key binding domains including two functional leucine-rich NESs (nuclear export signals) between residues 438-447 and 613-638 in the human protein. The other two binding domains are an EIF4E-binding site, between residues 27-42 in Q9EST3, and a bipartite NLS (nuclear localization signals) between 194-211, and these lie in family EIF4E-T_N. EIF4E is the eukaryotic translation initiation factor 4E that is the rate-limiting factor for cap-dependent translation initiation.¡€0€ª€0€ €CDD¡€ €É>¢€0€0€ €‚Ypfam10479, FSA_C, Fragile site-associated protein C-terminus. This is the conserved C-terminal half of the protein KIAA1109 which is the fragile site-associated protein FSA. Genome-wide-association studies showed this protein to linked to the susceptibility to coeliac disease. The protein may also be associated with polycystic kidney disease.¡€0€ª€0€ €CDD¡€ €É?¢€0€0€ €‚‰pfam10480, ICAP-1_inte_bdg, Beta-1 integrin binding protein. ICAP-1 is a serine/threonine-rich protein that binds to the cytoplasmic domains of beta-1 integrins in a highly specific manner, binding to a NPXY sequence motif on the beta-1 integrin. The cytoplasmic domains of integrins are essential for cell adhesion, and the fact that phosphorylation of ICAP-1 by interaction with the cell-matrix implies an important role of ICAP-1 during integrin-dependent cell adhesion. Overexpression of ICAP-1 strongly reduces the integrin-mediated cell spreading on extracellular matrix and inhibits both Cdc42 and Rac1. In addition, ICAP-1 induces release of Cdc42 from cellular membranes and prevents the dissociation of GDP from this GTPase. An additional function of ICAP-1 is to promote differentiation of osteoprogenitors by supporting their condensation through modulating the integrin high affinity state,.¡€0€ª€0€ €CDD¡€ €ÐØ¢€0€0€ €‚±pfam10481, CENP-F_N, Cenp-F N-terminal domain. Mitosin or centromere-associated protein-F (Cenp-F) is found bound across the centromere as one of the proteins of the outer layer of the kinetochore. Most of the kinetochore/centromere functions appear to depend upon binding of the C-terminal par to f the molecule, whereas the N-terminal part, here, may be a cytoplasmic player in controlling the function of microtubules and dynein.¡€0€ª€0€ €CDD¡€ €É@¢€0€0€ €‚pfam10482, CtIP_N, tumor-suppressor protein CtIP N-terminal domain. CtIP is predominantly a nuclear protein that complexes with both BRCA1 and the BRCA1-associated RING domain protein (BARD1). At the protein level, CtIP expression varies with cell cycle progression in a pattern identical to that of BRCA1. Thus, the steady-state levels of CtIP polypeptides, which remain low in resting cells and G1 cycling cells, increase dramatically as Dividing cells traverse the G1/S boundary. CtIP can potentially modulate the functions ascribed to BRCA1 in transcriptional regulation, DNA repair, and/or cell cycle checkpoint control. This N-terminal domain carries a coiled-coil region and is essential for homodimerisation of the protein. The C-terminal domain is family pfam08573.¡€0€ª€0€ €CDD¡€ €ÉA¢€0€0€ €ópfam10483, Elong_Iki1, Elongator subunit Iki1. This family is a component of the RNA polymerase II elongator complex. This complex is involved in elongation of RNA polymerase II transcription and in modification of wobble nucleosides in tRNA.¡€0€ª€0€ €CDD¡€ €ÉB¢€0€0€ €‚¸pfam10484, MRP-S23, Mitochondrial ribosomal protein S23. MRP-S23 is one of the proteins that makes up the 55S ribosome in eukaryotes from nematodes to humans. It does not appear to carry any common motifs, either RNA binding or ribosomal protein motifs. All of the mammalian MRPs are encoded in nuclear genes that are evolving more rapidly than those encoding cytoplasmic ribosomal proteins. The MRPs are imported into mitochondria where they assemble coordinately with mitochondrially transcribed rRNAs into ribosomes that are responsible for translating the 13 mRNAs for essential proteins of the oxidative phosphorylation system. MRP-S23 is significantly up-regulated in uterine cancer cells.¡€0€ª€0€ €CDD¡€ €ÉC¢€0€0€ €‚ypfam10486, PI3K_1B_p101, Phosphoinositide 3-kinase gamma adapter protein p101 subunit. Class I PI3Ks are dual-specific lipid and protein kinases involved in numerous intracellular signaling pathways. Class IB PI3K, p110gamma, is mainly activated by seven-transmembrane G-protein-coupled receptors (GPCRs), through its regulatory subunit p101 and G-protein beta-gamma subunits.¡€0€ª€0€ €CDD¡€ €ÉD¢€0€0€ €‚pfam10487, Nup188, Nucleoporin subcomplex protein binding to Pom34. This is one of the many peptides that make up the nucleoporin complex (NPC), and is found across eukaryotes. The Nup188 subcomplex (Nic96p-Nup188p-Nup192p-Pom152p) is one of at least six that make up the NPC, and as such is symmetrically localized on both faces of the NPC at the nuclear end, being integrally bound to the C-terminus of Pom34p.¡€0€ª€0€ €CDD¡€ €ÉE¢€0€0€ €‚$pfam10488, PP1c_bdg, Phosphatase-1 catalytic subunit binding region. This conserved C-terminus appears to be a protein phosphatase-1 catalytic subunit (PP1C) binding region, which may in some circumstances also be retroviral in origin since it is found in both herpes simplex virus and in mouse and man. This domain is found in Gadd-34 apoptosis-associated proteins as well as the constitutive repressor of eIF2-alpha phosphorylation/protein phosphatase 1, regulatory (inhibitor) subunit 15b, otherwise known as CReP. Diverse stressful conditions are associated with phosphorylation of the {alpha} subunit of eukaryotic translation initiation factor 2 (eIF2{alpha}) on serine 51. This signaling event, which is conserved from yeast to mammals, negatively regulates the guanine nucleotide exchange factor, eIF2-B and inhibits the recycling of eIF2 to its active GTP bound form. In mammalian cells eIF2{alpha} phosphorylation emerges as an important event in stress signaling that impacts on gene expression at both the translational and transcriptional levels.¡€0€ª€0€ €CDD¡€ €ÉF¢€0€0€ €‚xpfam10490, CENP-F_C_Rb_bdg, Rb-binding domain of kinetochore protein Cenp-F/LEK1. Cenp-F, a centromeric kinetochore, microtubule-binding protein consisting of two 1,600-amino acid-long coils, is essential for the full functioning of the mitotic checkpoint pathway. This domain is at the very C-terminus of the C-terminal coiled-coil, and is one of the key Rb-binding domains.¡€0€ª€0€ €CDD¡€ €ÉG¢€0€0€ €‚ôpfam10491, Nrf1_DNA-bind, NLS-binding and DNA-binding and dimerisation domains of Nrf1. In Drosophila, the erect wing (ewg) protein is required for proper development of the central nervous system and the indirect flight muscles. The fly ewg gene encodes a novel DNA-binding domain that is also found in four genes previously identified in sea urchin, chicken, zebrafish, and human. Nuclear respiratory factor-1 is a transcriptional activator that has been implicated in the nuclear control of respiratory chain expression in vertebrates. The first 26 amino acids of nuclear respiratory factor-1 are required for the binding of dynein light chain. The interaction with dynein light chain is observed for both ewg and Nrf-1, transcription factors that are structurally and functionally similar between humans and Drosophila. The highest level of expression of both ewg and Nrf-1 was found in the central nervous system, somites, first branchial arch, optic vesicle, and otic vesicle. In the mouse Nrf-1 protein there is also an NLS domain at 88-116, and a DNA binding and dimerisation domain at 127-282. Ewg is a site-specific transcriptional activator, and evolutionarily conserved regions of ewg contribute both positively and negatively to transcriptional activity.¡€0€ª€0€ €CDD¡€ €ÉH¢€0€0€ €‚lpfam10492, Nrf1_activ_bdg, Nrf1 activator activation site binding domain. In Drosophila, the erect wing (ewg) protein is required for proper development of the central nervous system and the indirect flight muscles. The fly ewg gene encodes a novel DNA-binding domain that is also found in four genes previously identified in sea urchin, chicken, zebrafish, and human. Nuclear respiratory factor-1 is a transcriptional activator that has been implicated in the nuclear control of respiratory chain expression in vertebrates. The first 26 amino acids of nuclear respiratory factor-1 are required for the binding of dynein light chain. The interaction with dynein light chain is observed for both ewg and Nrf-1, transcription factors that are structurally and functionally similar between humans and Drosophila. The highest level of expression of both ewg and Nrf-1 was found in the central nervous system, somites, first branchial arch, optic vesicle, and otic vesicle. In the mouse Nrf-1 protein, there is an activation domain at 303-469, the most conserved part of which is this domain 446-469. Ewg is a site-specific transcriptional activator, and evolutionarily conserved regions of ewg contribute both positively and negatively to transcriptional activity. The family Nrf1_DNA-bind is associated with this domain towards the N-terminal, as is the N terminal of the activation domain.¡€0€ª€0€ €CDD¡€ €ÉI¢€0€0€ €‚pfam10493, Rod_C, Rough deal protein C-terminal region. Rod, the Rough deal protein, displays a dynamic intracellular staining pattern, localising first to kinetochores in pro-metaphase, but moving to kinetochore microtubules at metaphase. Early in anaphase the protein is once again restricted to the kinetochores, where it persists until the end of telophase. This behaviour is in all respects similar to that described for ZW10, and indeed the two proteins function together, localization of each depending upon the other. These two proteins are found at the kinetochore in complex with a third, Zwilch, in both flies and humans. The C-terminus is the most conserved part of the protein. During pro-metaphase, the ZW10-Rod complex, dynein/dynactin, and Mad2 all accumulate on unattached kinetochores; microtubule capture leads to Mad2 depletion as it is carried off by dynein/dynactin; ZW10-Rod complex accumulation continues, replenishing kinetochore dynein. The continuing recruitment of the ZW10-Rod complex during metaphase may serve to maintain adequate dynein/dynactin complex on kinetochores for assisting chromatid movement during anaphase. The ZW10-Rod complex acts as a bridge whose association with Zwint-1 links Mad1 and Mad2, components that are directly responsible for generating the diffusible 'wait anaphase' signal, to a structural, inner kinetochore complex containing Mis12 and KNL-1AF15q14, the last of which has been proved to be essential for kinetochore assembly in C. elegans. Removal of ZW10 or Rod inactivates the mitotic checkpoint.¡€0€ª€0€ €CDD¡€ €ÉJ¢€0€0€ €‚¿pfam10494, Stk19, Serine-threonine protein kinase 19. This serine-threonine protein kinase number 19 is expressed from the MHC and predominantly in the nucleus. Protein kinases are involved in signal transduction pathways and play fundamental roles in the regulation of cell functions. This is a novel Ser/Thr protein kinase, that has Mn2+-dependent protein kinase activity that phosphorylates alpha -casein at Ser/Thr residues and histone at Ser residues. It can be covalently modified by the reactive ATP analogue 5'-p-fluorosulfonylbenzoyladenosine in the absence of ATP, and this modification is prevented in the presence of 1 mM ATP, indicating that the kinase domain of is capable of binding ATP.¡€0€ª€0€ €CDD¡€ €ÉK¢€0€0€ €‚tpfam10495, PACT_coil_coil, Pericentrin-AKAP-450 domain of centrosomal targeting protein. This domain is a coiled-coil region close to the C-terminus of centrosomal proteins that is directly responsible for recruiting AKAP-450 and pericentrin to the centrosome. Hence the suggested name for this region is a PACT domain (pericentrin-AKAP-450 centrosomal targeting). This domain is also present at the C-terminus of coiled-coil proteins from Drosophila and S. pombe, and that from the Drosophila protein is sufficient for targeting to the centrosome in mammalian cells. The function of these proteins is unknown but they seem good candidates for having a centrosomal or spindle pole body location. The final 22 residues of this domain in AKAP-450 appear specifically to be a calmodulin-binding domain indicating that this member at least is likely to contribute to centrosome assembly.¡€0€ª€0€ €CDD¡€ €ÉL¢€0€0€ €‚@pfam10496, Syntaxin-18_N, SNARE-complex protein Syntaxin-18 N-terminus. This is the conserved N-terminal of Syntaxin-18. Syntaxin-18 is found in the SNARE complex of the endoplasmic reticulum and functions in the trafficking between the ER intermediate compartment and the cis-Golgi vesicle. In particular, the N-terminal region is important for the formation of ER aggregates. More specifically, syntaxin-18 is involved in endoplasmic reticulum-mediated phagocytosis, presumably by regulating the specific and direct fusion of the ER with the plasma or phagosomal membranes.¡€0€ª€0€ €CDD¡€ €ÉM¢€0€0€ €‚Zpfam10497, zf-4CXXC_R1, Zinc-finger domain of monoamine-oxidase A repressor R1. R1 is a transcription factor repressor that inhibits monoamine oxidase A gene expression. This domain is a four-CXXC zinc finger putative DNA-binding domain found at the C-terminal end of R1. The domain carries 12 cysteines of which four pairs are of the CXXC type.¡€0€ª€0€ €CDD¡€ €ÉN¢€0€0€ €‚¾pfam10498, IFT57, Intra-flagellar transport protein 57. Eukaryotic cilia and flagella are specialized organelles found at the periphery of cells of diverse organisms. Intra-flagellar transport (IFT) is required for the assembly and maintenance of eukaryotic cilia and flagella, and consists of the bidirectional movement of large protein particles between the base and the distal tip of the organelle. IFT particles contain multiple copies of two distinct protein complexes, A and B, which contain at least 6 and 11 protein subunits. IFT57 is part of complex B but is not, however, required for the core subunits to stay associated. This protein is known as Huntington-interacting protein-1 in humans.¡€0€ª€0€ €CDD¡€ €ÉO¢€0€0€ €‚…pfam10500, SR-25, Nuclear RNA-splicing-associated protein. SR-25, otherwise known as ADP-ribosylation factor-like factor 6-interacting protein 4, is expressed in virtually all tissues. At the N-terminus there is a repeat of serine-arginine (SR repeat), and towards the middle of the protein there are clusters of both serines and of basic amino acids. The presence of many nuclear localization signals strongly implies that this is a nuclear protein that may contribute to RNA splicing. SR-25 is also implicated, along with heat-shock-protein-27, as a mediator in the Rac1 (GTPase ras-related C3 botulinum toxin substrate 1) signalling pathway.¡€0€ª€0€ €CDD¡€ €ÉP¢€0€0€ €‚ìpfam10501, Ribosomal_L50, Ribosomal subunit 39S. The 39S ribosomal protein appears to be a subunit of one of the larger mitochondrial 66S or 70S units. Under conditions of ethanol-stress in rats the larger subunit is largely dissociated into its smaller components. In E. coli, in the absence of the enzyme pseudouridine synthase (RluD) synthase, there is an accumulation of 50S and 30S subunits and the appearance of abnormal particles (62S and 39S), with concomitant loss of 70S ribosomes.¡€0€ª€0€ €CDD¡€ €ÉQ¢€0€0€ €‚Ùpfam10502, Peptidase_S26, Signal peptidase, peptidase S26. This is a family of membrane signal serine endopeptidases which function in the processing of newly-synthesized secreted proteins. Peptidase S26 removes the hydrophobic, N-terminal, signal peptides as proteins are translocated across membranes. The active site residues take the form of a catalytic dyad that is Ser, Lys in subfamily S26A; the Ser is the nucleophile in catalysis, and the Lys is the general base.¡€0€ª€0€ €CDD¡€ €ÉR¢€0€0€ €½pfam10503, Esterase_phd, Esterase PHB depolymerase. This family of proteins include acetyl xylan esterases (AXE), feruloyl esterases (FAE), and poly(3-hydroxybutyrate) (PHB) depolymerases.¡€0€ª€0€ €CDD¡€ €ÉS¢€0€0€ €|pfam10504, DUF2452, Protein of unknown function (DUF2452). This protein is found in eukaryotes but its function is unknown.¡€0€ª€0€ €CDD¡€ €ÉT¢€0€0€ €‚Êpfam10505, NARG2_C, NMDA receptor-regulated gene protein 2 C-terminus. The transition of neuronal cells from pre-cursor to mature state is regulated by the N-methyl-d-aspartate (NMDA) receptor, a glutamate-gated ion channel that is permeable to Ca2+. NMDA receptors probably mediate this activity by permitting expression of NARG2. NARG2 is transiently expressed, being a regulatory protein that is present in the nucleus of dividing cells and then down-regulated as progenitors exit the cell cycle and begin to differentiate. NARG2 contains repeats of (S/T)PXX, (11 in mouse, six in human), a putative DNA-binding motif that is found in many gene-regulatory proteins including Kruppel, Hunchback and Antennapedi.¡€0€ª€0€ €CDD¡€ €ÉU¢€0€0€ €‚pfam10506, MCC-bdg_PDZ, PDZ domain of MCC-2 bdg protein for Usher syndrome. The protein has a high homology to the tumor suppressor MCC (mutated in colon cancer; or MCC1 hereafter) and was named MCC2. MCC2 protein binds the first PDZ domain of AIE-75 with its C-terminal amino acids -DTFL. A possible role of MCC2 as a tumor suppressor has been put forward. The carboxyl terminus of the predicted protein was DTFL which matched the consensus motif X-S/T-X-phi (phi: hydrophobic amino acid residue) for binding to the PDZ domain of AIE-75.¡€0€ª€0€ €CDD¡€ €ÉV¢€0€0€ €ýpfam10507, TMEM65, Transmembrane protein 65. MEM65 is an intercalated disc protein that interacts with with connexin 43 (Cx43) and is required for correct localization of Cx43 to the intercalated disc. It is essential for cardiac function in zebrafish.¡€0€ª€0€ €CDD¡€ €ÉW¢€0€0€ €‚Èpfam10508, Proteasom_PSMB, Proteasome non-ATPase 26S subunit. The 26S proteasome, a eukaryotic ATP-dependent, dumb-bell shaped, protease complex with a molecular mass of approx 20kDa consists of a central 20S proteasome,functioning as a catalytic machine, and two large V-shaped terminal modules, having possible regulatory roles,composed of multiple subunits of 25- 110 kDa attached to the central portion in opposite orientations. It is responsible for degradation of abnormal intracellular proteins, including oxidatively damaged proteins, and may play a role as a component of a cellular anti-oxidative system. Expression of catalytic core subunits including PSMB5 and peptidase activities of the proteasome were elevated following incubation with 3-methylcholanthrene. The 20S proteasome comprises a cylindrical stack of four rings, two outer rings formed by seven alpha-subunits (alpha1-alpha7) and two inner rings of seven beta-subunits (beta1-beta7). Two outer rings of alpha subunits maintain structure, while the central beta rings contain the proteolytic active core subunits beta1 (PSMB6), beta2 (PSMB7), and beta5 (PSMB5). Expression of PSMB5 can be altered by chemical reactants, such as 3-methylcholanthrene.¡€0€ª€0€ €CDD¡€ €ÉX¢€0€0€ €‚épfam10509, GalKase_gal_bdg, Galactokinase galactose-binding signature. This is the highly conserved galactokinase signature sequence which appears to be present in all galactokinases irrespective of how many other ATP binding sites, etc that they carry. The function of this domain appears to be to bind galactose, and the domain is normally at the N-terminus of the enzymes, EC:2.7.1.6. This domain is associated with the families GHMP_kinases_C, pfam08544 and GHMP_kinases_N, pfam00288.¡€0€ª€0€ €CDD¡€ €ÉY¢€0€0€ €‚špfam10510, PIG-S, Phosphatidylinositol-glycan biosynthesis class S protein. PIG-S is one of several key, core, components of the glycosylphosphatidylinositol (GPI) trans-amidase complex that mediates GPI anchoring in the endoplasmic reticulum. Anchoring occurs when a protein's C-terminal GPI attachment signal peptide is replaced with a pre-assembled GPI. Mammalian GPITransamidase consists of at least five components: Gaa1, Gpi8, PIG-S, PIG-T, and PIG-U, all five of which are required for function. It is possible that Gaa1, Gpi8, PIG-S, and PIG-T form a tightly associated core that is only weakly associated with PIG-U. The exact function of PIG-S is unclear.¡€0€ª€0€ €CDD¡€ €ÉZ¢€0€0€ €‚)pfam10511, Cementoin, Trappin protein transglutaminase binding domain. Trappin-2, itself a protease inhibitor, has this unique N-terminal domain that enables it to become cross-linked to extracellular matrix proteins by transglutaminase. This domain contains several repeated motifs with the the consensus sequence Gly-Gln-Asp-Pro-Val-Lys, and these together can anchor the whole molecule to extracellular matrix proteins, such as laminin, fibronectin, beta-crystallin, collagen IV, fibrinogen, and elastin, by transglutaminase-catalyzed cross-links. The whole domain is rich in glutamine and lysine, thus allowing and transglutaminase(s) to catalyze the formation of an intermolecular epsilon-(gamma-glutamyl)lysine isopeptide bond. Cementoin is associated with the WAP family, pfam00095, at the C-terminus.¡€0€ª€0€ €CDD¡€ €bú¢€0€0€ €‚$pfam10512, Borealin, Cell division cycle-associated protein 8. The chromosomal passenger complex of Aurora B kinase, INCENP, and Survivin has essential regulatory roles at centromeres and the central spindle in mitosis. Borealin is also a member of the complex. Approximately half of Aurora B in mitotic cells is complexed with INCENP, Borealin, and Survivin. Depletion of Borealin by RNA interference delays mitotic progression and results in kinetochore-spindle mis-attachments and an increase in bipolar spindles associated with ectopic asters.¡€0€ª€0€ €CDD¡€ €É[¢€0€0€ €òpfam10513, EPL1, Enhancer of polycomb-like. This is a family of EPL1 (Enhancer of polycomb-like) proteins. The EPL1 protein is a member of a histone acetyltransferase complex which is involved in transcriptional activation of selected genes.¡€0€ª€0€ €CDD¡€ €É\¢€0€0€ €‚cpfam10514, Bcl-2_BAD, Pro-apoptotic Bcl-2 protein, BAD. BAD is a Bcl-2 homology domain 3 (BH3)-only pro-apoptotic member of the Bcl-2 protein family that is regulated by phosphorylation in response to survival factors. Binding of BAD to mitochondria is thought to be exclusively mediated by its BH3 domain. Membrane localization of BAD mediates membrane translocation of Bcl-XL. The C-terminal part of BAD is sufficient for membrane binding. There are two segments with differing lipid-binding preferences, LBD1 and LBD2, that are responsible for this binding: (i) LBD1 located in the proximity of the BH3 domain (amino acids 122-131) and (ii) LBD2, the putative C-terminal alpha-helix-5. Phosphorylation-regulated 14-3-3 protein binding may expose the cholesterol-preferring LBD1 and bury the LBD2, thereby mediating translocation of BAD to raft-like micro-domains.¡€0€ª€0€ €CDD¡€ €É]¢€0€0€ €‚Fpfam10515, APP_amyloid, beta-amyloid precursor protein C-terminus. This is the amyloid, C-terminal, protein of the beta-Amyloid precursor protein (APP) which is a conserved and ubiquitous transmembrane glycoprotein strongly implicated in the pathogenesis of Alzheimer's disease but whose normal biological function is unknown. The C-terminal 100 residues are released and aggregate into amyloid deposits which are strongly implicated in the pathology of Alzheimer's disease plaque-formation. The domain is associated with family A4_EXTRA, pfam02177, further towards the N-terminus.¡€0€ª€0€ €CDD¡€ €É^¢€0€0€ €†pfam10516, SHNi-TPR, SHNi-TPR. SHNi-TPR family members contain a reiterated sequence motif that is an interrupted form of TPR repeat.¡€0€ª€0€ €CDD¡€ €bÿ¢€0€0€ €‚lpfam10517, DM13, Electron transfer DM13. The DM13 domain is a component of a novel electron-transfer system potentially involved in oxidative modification of animal cell-surface proteins. It contains a nearly absolutely conserved cysteine, which could be involved in a redox reaction, either as a naked thiol group or through binding a prosthetic group like heme.¡€0€ª€0€ €CDD¡€ €É_¢€0€0€ €Spfam10518, TAT_signal, TAT (twin-arginine translocation) pathway signal sequence. ¡€0€ª€0€ €CDD¡€ €É`¢€0€0€ €‚¢€0€0€ €spfam10826, DUF2551, Protein of unknown function (DUF2551). This Archaeal family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê?¢€0€0€ €tpfam10827, DUF2552, Protein of unknown function (DUF2552). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €d¢€0€0€ €opfam10828, DUF2570, Protein of unknown function (DUF2570). This is a family of proteins with unknown function.¡€0€ª€0€ €CDD¡€ €Ê@¢€0€0€ €™pfam10829, DUF2554, Protein of unknown function (DUF2554). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.¡€0€ª€0€ €CDD¡€ €ÊA¢€0€0€ €tpfam10830, DUF2553, Protein of unknown function (DUF2553). This family of bacterial proteins has no known function.¡€0€ª€0€ €CDD¡€ €ÊB¢€0€0€ €™pfam10831, DUF2556, Protein of unknown function (DUF2556). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.¡€0€ª€0€ €CDD¡€ €ÊC¢€0€0€ €Îpfam10832, DUF2559, Protein of unknown function (DUF2559). This family of proteins appear to be restricted to Enterobacteriaceae. The sequences are annotated as yhfG however currently no function is known.¡€0€ª€0€ €CDD¡€ €ÊD¢€0€0€ €tpfam10833, DUF2572, Protein of unknown function (DUF2572). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €d¢€0€0€ €jpfam10834, DUF2560, Protein of unknown function (DUF2560). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €d¢€0€0€ €ípfam10835, DUF2573, Protein of unknown function (DUF2573). Some members in this bacterial family of proteins are annotated as YusU however no function is currently known. This family of proteins appears to be restricted to Bacillus spp.¡€0€ª€0€ €CDD¡€ €ÊE¢€0€0€ €×pfam10836, DUF2574, Protein of unknown function (DUF2574). This family of proteins appears to be restricted to Enterobacteriaceae. Members of the family are annotated as yehE however currently no function is known.¡€0€ª€0€ €CDD¡€ €d¢€0€0€ €Ùpfam10837, DUF2575, Protein of unknown function (DUF2575). This family of proteins appears to be restricted to Enterobacteriaceae. Members in the family are annotated as yaaY but currently there is no known function.¡€0€ª€0€ €CDD¡€ €d¢€0€0€ €pfam10838, DUF2677, Protein of unknown function (DUF2677). Members in this family of proteins are annotated as UL121 however currently no function is known.¡€0€ª€0€ €CDD¡€ €ÊF¢€0€0€ €‘pfam10839, DUF2647, Protein of unknown function (DUF2647). This eukaryotic family of proteins are annotated as ycf68 but have no known function.¡€0€ª€0€ €CDD¡€ €d¢€0€0€ €ìpfam10840, DUF2645, Protein of unknown function (DUF2645). This family of proteins appear to be restricted to Enterobacteriaceae. Some members in the family are annotated as YjeO however no function for this protein is currently known.¡€0€ª€0€ €CDD¡€ €ÊG¢€0€0€ €•pfam10841, DUF2644, Protein of unknown function (DUF2644). This family of proteins with unknown function appear to be restricted to Pasteurellaceae.¡€0€ª€0€ €CDD¡€ €d ¢€0€0€ €’pfam10842, DUF2642, Protein of unknown function (DUF2642). This family of proteins with unknown function appear to be restricted to Bacillus spp.¡€0€ª€0€ €CDD¡€ €ÊH¢€0€0€ €Âpfam10843, RGI1, Respiratory growth induced protein 1. This family of fungal proteins includes RGI1, standing for respiratory growth induced 1. RGI1 is involved in aerobic energetic metabolism.¡€0€ª€0€ €CDD¡€ €ÊI¢€0€0€ €jpfam10844, DUF2577, Protein of unknown function (DUF2577). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ÊJ¢€0€0€ €upfam10845, DUF2576, Protein of unknown function (DUF2576). The function of this viral family of proteins is unknown.¡€0€ª€0€ €CDD¡€ €ÊK¢€0€0€ €upfam10846, DUF2722, Protein of unknown function (DUF2722). This eukaryotic family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ÊL¢€0€0€ €tpfam10847, DUF2656, Protein of unknown function (DUF2656). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €d%¢€0€0€ €™pfam10848, DUF2655, Protein of unknown function (DUF2655). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.¡€0€ª€0€ €CDD¡€ €d&¢€0€0€ €£pfam10849, DUF2654, Protein of unknown function (DUF2654). Some members in this family of proteins are annotated as a-gt.4 however currently no function is known.¡€0€ª€0€ €CDD¡€ €d'¢€0€0€ €“pfam10850, DUF2653, Protein of unknown function (DUF2653). This family of proteins with unknown function appears to be restricted to Bacillus spp.¡€0€ª€0€ €CDD¡€ €ÊM¢€0€0€ €jpfam10851, DUF2652, Protein of unknown function (DUF2652). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ÊN¢€0€0€ €“pfam10852, DUF2651, Protein of unknown function (DUF2651). This family of proteins with unknown function appears to be restricted to Bacillus spp.¡€0€ª€0€ €CDD¡€ €ÊO¢€0€0€ €œpfam10853, DUF2650, Protein of unknown function (DUF2650). This family of proteins with unknown function appear to be restricted to Caenorhabditis elegans.¡€0€ª€0€ €CDD¡€ €ÊP¢€0€0€ €Âpfam10854, DUF2649, Protein of unknown function (DUF2649). Members in this family of proteins are annotated as Plectrovirus orf 10 transmembrane proteins however currently no function is known.¡€0€ª€0€ €CDD¡€ €d,¢€0€0€ € pfam10855, DUF2648, Protein of unknown function (DUF2648). This family of proteins with unknown function appears to be restricted to Bacillales Staphylococcus.¡€0€ª€0€ €CDD¡€ €d-¢€0€0€ €jpfam10856, DUF2678, Protein of unknown function (DUF2678). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €d.¢€0€0€ €ppfam10857, DUF2701, Protein of unknown function (DUF2701). This viral family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €d/¢€0€0€ €tpfam10858, DUF2659, Protein of unknown function (DUF2659). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ÊQ¢€0€0€ €opfam10859, DUF2660, Protein of unknown function (DUF2660). This is a family of proteins with unknown function.¡€0€ª€0€ €CDD¡€ €ÊR¢€0€0€ €qpfam10860, DUF2661, Protein of unknown function (DUF2661). This viral family of proteins have no known function.¡€0€ª€0€ €CDD¡€ €d2¢€0€0€ €§pfam10861, DUF2784, Protein of Unknown function (DUF2784). This is a family of uncharacterized protein. The function is not known however it is conserved in Bacteria.¡€0€ª€0€ €CDD¡€ €ÊS¢€0€0€ €‚pfam10862, FcoT, FcoT-like thioesterase domain. Proteins in this family have a HotDog fold. This family was formerly known as domain of unknown function 2662 (DUF2662). The structure of Rv0098 from M. tuberculosis suggested a thioesterase function. Assays showed that this protein was a thioesterase with a preference for long chain fatty acyl groups. The maximal Kcat was observed for palmitoyl-CoA although longer and shorter molecules were also cleaved. In solution this protein forms a homo-hexameric complex.¡€0€ª€0€ €CDD¡€ €ÊT¢€0€0€ €zpfam10863, NOP19, Nucleolar protein 19. Nucleolar protein 19 plays an essential role in 40S ribosomal subunit biogenesis.¡€0€ª€0€ €CDD¡€ €ÊU¢€0€0€ €¡pfam10864, DUF2663, Protein of unknown function (DUF2663). Some members in this family of proteins are annotated as YpbF however currently no function is known.¡€0€ª€0€ €CDD¡€ €ÊV¢€0€0€ €äpfam10865, DUF2703, Domain of unknown function (DUF2703). This family of protein has no known function, but it may be distantly related to the thioredoxin fold. It contains the CXXC motif that is characteristic of thioredoxins.¡€0€ª€0€ €CDD¡€ €ÊW¢€0€0€ €ppfam10866, DUF2704, Protein of unknown function (DUF2704). This viral family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ÊX¢€0€0€ €™pfam10867, DUF2664, Protein of unknown function (DUF2664). This family of proteins is a viral family, annotated as UL96. Currently no function is known.¡€0€ª€0€ €CDD¡€ €d8¢€0€0€ €Épfam10868, Defensin_like, Cysteine-rich antifungal protein 2, defensin-like. This is a family of plant antifungal proteins. It has insecticidal and antifungal activity against certain plant pathogens.¡€0€ª€0€ €CDD¡€ €ÊY¢€0€0€ €spfam10869, DUF2666, Protein of unknown function (DUF2666). This Archaeal family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ÊZ¢€0€0€ €ppfam10870, DUF2729, Protein of unknown function (DUF2729). This viral family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €d:¢€0€0€ €ypfam10871, DUF2748, Protein of unknown function (DUF2748). This is a bacterial family of proteins with unknown function.¡€0€ª€0€ €CDD¡€ €Ê[¢€0€0€ €Špfam10872, DUF2740, Protein of unknown function (DUF2740). This family of proteins with unknown function has a highly conserved sequence.¡€0€ª€0€ €CDD¡€ €d<¢€0€0€ €ºpfam10873, CYYR1, Cysteine and tyrosine-rich protein 1. Members in this family of proteins are annotated as Cysteine and tyrosine-rich protein 1, however currently no function is known.¡€0€ª€0€ €CDD¡€ €Ê\¢€0€0€ €jpfam10874, DUF2746, Protein of unknown function (DUF2746). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê]¢€0€0€ €tpfam10875, DUF2670, Protein of unknown function (DUF2670). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €d?¢€0€0€ €¶pfam10876, Phage_TAC_9, Phage tail assemb.y chaperone protein, TAC. This is a family of putative phage tail assembly chaperone proteins largely from Haemophilus and Xylella species.¡€0€ª€0€ €CDD¡€ €Ê^¢€0€0€ €•pfam10877, DUF2671, Protein of unknown function (DUF2671). This family of proteins with unknown function appears to be restricted to Rickettsia spp.¡€0€ª€0€ €CDD¡€ €Ê_¢€0€0€ €‘pfam10878, DUF2672, Protein of unknown function (DUF2672). This family of proteins with unknown function appear to be restricted to Rickettsiae.¡€0€ª€0€ €CDD¡€ €Ê`¢€0€0€ €”pfam10879, DUF2674, Protein of unknown function (DUF2674). This family of proteins with unknown function appears to be conserved to Rickettsia spp.¡€0€ª€0€ €CDD¡€ €dC¢€0€0€ €–pfam10880, DUF2673, Protein of unknown function (DUF2673). This family of proteins with unknown function appears to be restricted to Rickettsiae spp.¡€0€ª€0€ €CDD¡€ €Êa¢€0€0€ €tpfam10881, DUF2726, Protein of unknown function (DUF2726). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Êb¢€0€0€ €®pfam10882, bPH_5, Bacterial PH domain. This family of proteins with unknown function appear to be related to bacterial PH domains. This family was formerly known as DUF2679.¡€0€ª€0€ €CDD¡€ €Êc¢€0€0€ €¹pfam10883, DUF2681, Protein of unknown function (DUF2681). This family of proteins is found in bacteria. Proteins in this family are typically between 81 and 117 amino acids in length.¡€0€ª€0€ €CDD¡€ €Êd¢€0€0€ €™pfam10884, DUF2683, Protein of unknown function (DUF2683). This family of proteins with unknown function appears to be restricted to Methanosarcinaceae.¡€0€ª€0€ €CDD¡€ €dG¢€0€0€ €œpfam10885, DUF2684, Protein of unknown function (DUF2684). Members in this family of proteins are annotated as yqgD however currently no function is known.¡€0€ª€0€ €CDD¡€ €dH¢€0€0€ €Ñpfam10886, DUF2685, Protein of unknown function (DUF2685). Members in this family of proteins are annotated as uvdY.-2 which is an open reading frame within uvsY. However currently there is no known function.¡€0€ª€0€ €CDD¡€ €dI¢€0€0€ €¡pfam10887, DUF2686, Protein of unknown function (DUF2686). Some members in this family of proteins are annotated as yjfZ however currently no function is known.¡€0€ª€0€ €CDD¡€ €dJ¢€0€0€ €¦pfam10888, DUF2742, Protein of unknown function (DUF2742). Members in this family of phage proteins are the product of the gene phiRv1, however no function is known.¡€0€ª€0€ €CDD¡€ €dK¢€0€0€ €zpfam10890, Cyt_b-c1_8, Cytochrome b-c1 complex subunit 8. This entry represents subunit 8 of the Cytochrome b-c1 complex.¡€0€ª€0€ €CDD¡€ €Êe¢€0€0€ €›pfam10891, DUF2719, Protein of unknown function (DUF2719). This family of proteins with unknown function appears to be restricted to Nucleopolyhedrovirus.¡€0€ª€0€ €CDD¡€ €O+¢€0€0€ €œpfam10892, DUF2688, Protein of unknown function (DUF2688). Members in this family of proteins are annotated as KleB however currently no function is known.¡€0€ª€0€ €CDD¡€ €dM¢€0€0€ €opfam10893, DUF2724, Protein of unknown function (DUF2724). This is a family of proteins with unknown function.¡€0€ª€0€ €CDD¡€ €dN¢€0€0€ €œpfam10894, DUF2689, Protein of unknown function (DUF2689). Members in this family of proteins are annotated as TrbD however currently no function is known.¡€0€ª€0€ €CDD¡€ €Êf¢€0€0€ €™pfam10895, DUF2715, Protein of unknown function (DUF2715). This family of proteins with unknown function appears to be restricted to Treponema pallidum.¡€0€ª€0€ €CDD¡€ €èÁ¢€0€0€ €—pfam10896, DUF2714, Protein of unknown function (DUF2714). This family of proteins with unknown function appears to be restricted to Mycoplasmataceae.¡€0€ª€0€ €CDD¡€ €Êg¢€0€0€ €™pfam10897, DUF2713, Protein of unknown function (DUF2713). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.¡€0€ª€0€ €CDD¡€ €Êh¢€0€0€ €tpfam10898, DUF2716, Protein of unknown function (DUF2716). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Êi¢€0€0€ €‚pfam10899, AbiGi, Putative abortive phage resistance protein AbiGi, antitoxin. This is a bacterial family of proteins with unknown function. AbiGi is a family of putative type IV toxin-antitoxin system antitoxins. The AbiG abortive phage resistance system affects lactococcal bacteriophages phiP335 and phiQ30 but not the other P335 phage species. AbiGii toxin appears to confer resistance to phages by a mechanism of abortive infection that acts by interfering with phage RNA synthesis. The cognate toxin is found in pfam16873.¡€0€ª€0€ €CDD¡€ €Êj¢€0€0€ €tpfam10901, DUF2690, Protein of unknown function (DUF2690). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Êk¢€0€0€ €ppfam10902, DUF2693, Protein of unknown function (DUF2693). This viral family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Êl¢€0€0€ €tpfam10903, DUF2691, Protein of unknown function (DUF2691). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Êm¢€0€0€ €˜pfam10904, DUF2694, Protein of unknown function (DUF2694). This family of proteins with unknown function appears to be restricted to Mycobacterium spp.¡€0€ª€0€ €CDD¡€ €dS¢€0€0€ €tpfam10905, DUF2695, Protein of unknown function (DUF2695). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ên¢€0€0€ €»pfam10906, Mrx7, MIOREX complex component 7. This entry includes budding yeast MIOREX complex component 7 (Mrx7), which associates with mitochondrial ribosome. Its function is not clear.¡€0€ª€0€ €CDD¡€ €Êo¢€0€0€ €¨pfam10907, DUF2749, Protein of unknown function (DUF2749). This bacterial family of proteins appear to come from the Trb operon however currently no function is known.¡€0€ª€0€ €CDD¡€ €Êp¢€0€0€ €spfam10908, DUF2778, Protein of unknown function (DUF2778). This is a bacterial family of uncharacterized proteins.¡€0€ª€0€ €CDD¡€ €Êq¢€0€0€ €ppfam10909, DUF2682, Protein of unknown function (DUF2682). This viral family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €O<¢€0€0€ €upfam10910, DUF2744, Protein of unknown function (DUF2744). This is a viral family of proteins with unknown function.¡€0€ª€0€ €CDD¡€ €dV¢€0€0€ €®pfam10911, DUF2717, Protein of unknown function (DUF2717). Members in this family of proteins are annotated as gene 6.5 protein however currently there is no known function.¡€0€ª€0€ €CDD¡€ €O>¢€0€0€ €pfam10912, DUF2700, Protein of unknown function (DUF2700). This family of proteins with unknown function appears to be restricted to Caenorhabditis elegans.¡€0€ª€0€ €CDD¡€ €Êr¢€0€0€ €•pfam10913, DUF2706, Protein of unknown function (DUF2706). This family of proteins with unknown function appears to be restricted to Rickettsia spp.¡€0€ª€0€ €CDD¡€ €Ês¢€0€0€ €¼pfam10914, DUF2781, Protein of unknown function (DUF2781). This is a eukaryotic family of uncharacterized proteins. Some of the proteins in this family are annotated as membrane proteins.¡€0€ª€0€ €CDD¡€ €Êt¢€0€0€ €tpfam10915, DUF2709, Protein of unknown function (DUF2709). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Êu¢€0€0€ €pfam10916, DUF2712, Protein of unknown function (DUF2712). This family of proteins with unknown function appear to be restricted to Bacillales.¡€0€ª€0€ €CDD¡€ €Êv¢€0€0€ €Špfam10917, Fungus-induced, Fungus-induced protein. This entry represents fungus-induced proteins which may have role in hypoxia response.¡€0€ª€0€ €CDD¡€ €d[¢€0€0€ €ppfam10918, DUF2718, Protein of unknown function (DUF2718). This viral family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Êw¢€0€0€ €tpfam10920, DUF2705, Protein of unknown function (DUF2705). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €d\¢€0€0€ €—pfam10921, DUF2710, Protein of unknown function (DUF2710). This family of proteins with unknown function appears to be restricted to Mycobacteriaceae.¡€0€ª€0€ €CDD¡€ €d]¢€0€0€ €upfam10922, DUF2745, Protein of unknown function (DUF2745). This is a viral family of proteins with unknown function.¡€0€ª€0€ €CDD¡€ €Êx¢€0€0€ €Òpfam10923, DUF2791, P-loop Domain of unknown function (DUF2791). This is a family of proteins found in archaea and bacteria. This domain contains a P-loop motif suggesting it binds to a nucleotide such as ATP.¡€0€ª€0€ €CDD¡€ €Êy¢€0€0€ €§pfam10924, DUF2711, Protein of unknown function (DUF2711). Some members in this family of proteins are annotated as ywbB however currently there is no known function.¡€0€ª€0€ €CDD¡€ €d_¢€0€0€ €œpfam10925, DUF2680, Protein of unknown function (DUF2680). Members in this family of proteins are annotated as yckD however currently no function is known.¡€0€ª€0€ €CDD¡€ €Êz¢€0€0€ €Öpfam10926, DUF2800, Protein of unknown function (DUF2800). This is a family of uncharacterized proteins found in bacteria and viruses. Some members of this family are annotated as being Phi APSE P51-like proteins.¡€0€ª€0€ €CDD¡€ €Ê{¢€0€0€ €upfam10927, DUF2738, Protein of unknown function (DUF2738). This is a viral family of proteins with unknown function.¡€0€ª€0€ €CDD¡€ €db¢€0€0€ €spfam10928, DUF2810, Protein of unknown function (DUF2810). This is a bacterial family of uncharacterized proteins.¡€0€ª€0€ €CDD¡€ €dc¢€0€0€ €spfam10929, DUF2811, Protein of unknown function (DUF2811). This is a bacterial family of uncharacterized proteins.¡€0€ª€0€ €CDD¡€ €Ê|¢€0€0€ €jpfam10930, DUF2737, Protein of unknown function (DUF2737). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €de¢€0€0€ €Ëpfam10931, DUF2735, Protein of unknown function (DUF2735). Some members in this family of proteins are annotated as glutamine synthetase translation inhibitor however this function can not be confirmed.¡€0€ª€0€ €CDD¡€ €Ê}¢€0€0€ €rpfam10932, DUF2783, Protein of unknown function (DUF2783). This is a bacterial family of uncharacterized protein.¡€0€ª€0€ €CDD¡€ €Ê~¢€0€0€ €pfam10933, DUF2827, Protein of unknown function (DUF2827). This is a family of uncharacterized proteins found in Burkholderia.¡€0€ª€0€ €CDD¡€ €Ê¢€0€0€ €¶pfam10934, DUF2634, Protein of unknown function (DUF2634). Some members in this family of proteins are annotated as phage related, xkdS however currently there is no known function.¡€0€ª€0€ €CDD¡€ €Ê€¢€0€0€ €jpfam10935, DUF2637, Protein of unknown function (DUF2637). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê¢€0€0€ €rpfam10936, DUF2617, Protein of unknown function DUF2617. This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê‚¢€0€0€ €Õpfam10937, S36_mt, Ribosomal protein S36, mitochondrial. This entry is represented by a mitochondrial ribosomal protein of the small subunit, which has similarity to human mitochondrial ribosomal protein MRP-S36.¡€0€ª€0€ €CDD¡€ €ʃ¢€0€0€ € pfam10938, YfdX, YfdX protein. YfdX is a protein found in Proteobacteria of unknown function. The protein coding for this gene is regulated by EvgA in E. coli.¡€0€ª€0€ €CDD¡€ €Ê„¢€0€0€ €ypfam10939, DUF2631, Protein of unknown function (DUF2631). This is s bacterial family of proteins with unknown function.¡€0€ª€0€ €CDD¡€ €Ê…¢€0€0€ €ªpfam10940, DUF2618, Protein of unknown function (DUF2618). This bacterial family of proteins has no known function. The sequences within the family are highly conserved.¡€0€ª€0€ €CDD¡€ €do¢€0€0€ €wpfam10941, DUF2620, Protein of unknown function DUF2620. This is a bacterial family of proteins with unknown function.¡€0€ª€0€ €CDD¡€ €ʆ¢€0€0€ €tpfam10942, DUF2619, Protein of unknown function (DUF2619). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ʇ¢€0€0€ €xpfam10943, DUF2632, Protein of unknown function (DUF2632). This is a family of membrane proteins with unknown function.¡€0€ª€0€ €CDD¡€ €dr¢€0€0€ €upfam10944, DUF2630, Protein of unknown function (DUF2630). This bacterial family of proteins have no known function.¡€0€ª€0€ €CDD¡€ €ʈ¢€0€0€ €‚8pfam10945, CBP_BcsR, Cellulose biosynthesis protein BcsR. CBP_BcsR is a family of bacterial cellulose biosynthesis proteins. Cellulose is necessary for biofilm formation in bacteria. Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation).¡€0€ª€0€ €CDD¡€ €dt¢€0€0€ €Ÿpfam10946, DUF2625, Protein of unknown function DUF2625. Some members in this family of proteins are annotated as ybfG however currently no function is known.¡€0€ª€0€ €CDD¡€ €ʉ¢€0€0€ €¡pfam10947, DUF2628, Protein of unknown function (DUF2628). Some members in this family of proteins are annotated as yigF however currently no function is known.¡€0€ª€0€ €CDD¡€ €ÊŠ¢€0€0€ €upfam10948, DUF2635, Protein of unknown function (DUF2635). This is a family of phage proteins with unknown function.¡€0€ª€0€ €CDD¡€ €Ê‹¢€0€0€ €–pfam10949, DUF2777, Protein of unknown function (DUF2777). This family of proteins with unknown function appears to be restricted to Bacillus cereus.¡€0€ª€0€ €CDD¡€ €ÊŒ¢€0€0€ €‚pfam10950, Organ_specific, Organ specific protein. This eukaryotic family includes a number of plant organ-specific proteins. While their function is unknown, their predicted amino acid sequence suggests that these proteins could be exported and glycosylated.¡€0€ª€0€ €CDD¡€ €Ê¢€0€0€ €tpfam10951, DUF2776, Protein of unknown function (DUF2776). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ÊŽ¢€0€0€ €tpfam10952, DUF2753, Protein of unknown function (DUF2753). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €d{¢€0€0€ €˜pfam10953, DUF2754, Protein of unknown function (DUF2754). This family of proteins with unknown function appear to be restricted to Enterobacteriaceae.¡€0€ª€0€ €CDD¡€ €Ê¢€0€0€ €Òpfam10954, DUF2755, Protein of unknown function (DUF2755). Some members in this family of proteins are annotated as YaiY however no function is known. The family appears to be restricted to Enterobacteriaceae.¡€0€ª€0€ €CDD¡€ €Ê¢€0€0€ €œpfam10955, DUF2757, Protein of unknown function (DUF2757). Members in this family of proteins are annotated as YabK however currently no function is known.¡€0€ª€0€ €CDD¡€ €Ê‘¢€0€0€ €Ùpfam10956, DUF2756, Protein of unknown function (DUF2756). Some members in this family of proteins are annotated yhhA however currently no function is known. The family appears to be restricted to Enterobacteriaceae.¡€0€ª€0€ €CDD¡€ €Ê’¢€0€0€ €‚Bpfam10957, Spore_Cse60, Sporulation protein Cse60. Cse60 is expressed during sporulation in Bacillus subtilis. Transcription commences around 2h after the start of sporulation and had an absolute requirement for the transcription factor sigmaE. Cse60 is an acidic product of only 60 residues, whose function is not known.¡€0€ª€0€ €CDD¡€ €Ê“¢€0€0€ €‘pfam10958, DUF2759, Protein of unknown function (DUF2759). This family of proteins with unknown function appear to be restricted to Bacillaceae.¡€0€ª€0€ €CDD¡€ €Ê”¢€0€0€ €’pfam10959, DUF2761, Protein of unknown function (DUF2761). Members in this family of proteins are annotated as KleF however no function is known.¡€0€ª€0€ €CDD¡€ €d¢€0€0€ €‚úpfam10960, Holin_BhlA, BhlA holin family. The Phage_holin_BhlA family is a family of holin-like proteins from both bacteriophages and bacterial chromosomes. In bacteriophage, holins are small membrane proteins that accumulate and oligomerise to form non-specific lesions in the cytoplasmic membrane allowing the release of the second protein, endolysins, to access the peptidoglycan. Most holins share common structural features: two or three transmembrane domains separated by a beta-turn, a short hydrophilic N-terminus, a highly charged C-terminus and a dual translational start motif. The BhlA holin of Bacillus is found to be toxic to the host cell where the site of action of is on the cell membrane and causes bacterial death by cell membrane disruption.¡€0€ª€0€ €CDD¡€ €Ê•¢€0€0€ €‚-pfam10961, SelK_SelG, Selenoprotein SelK_SelG. This entry inclues a group of eukaryotic selenoproteins, such as SelK and SelG. SelK seems to play an important role in protecting cells from endoplasmic reticulum stress induced apoptosis. SelG may be involved in regulating the redox state of the cell.¡€0€ª€0€ €CDD¡€ €Ê–¢€0€0€ €tpfam10962, DUF2764, Protein of unknown function (DUF2764). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê—¢€0€0€ €tpfam10963, Phage_TAC_10, Phage tail assembly chaperone. This is a family of phage tail assembly chaperone proteins.¡€0€ª€0€ €CDD¡€ €ʘ¢€0€0€ €™pfam10964, DUF2766, Protein of unknown function (DUF2766). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.¡€0€ª€0€ €CDD¡€ €d†¢€0€0€ €™pfam10965, DUF2767, Protein of unknown function (DUF2767). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.¡€0€ª€0€ €CDD¡€ €Ê™¢€0€0€ €’pfam10966, DUF2768, Protein of unknown function (DUF2768). This family of proteins with unknown function appear to be restricted to Bacillus spp.¡€0€ª€0€ €CDD¡€ €Êš¢€0€0€ €kpfam10967, DUF2769, Protein of unknown function (DUF2769). This family of proteins have no known function.¡€0€ª€0€ €CDD¡€ €Ê›¢€0€0€ €œpfam10968, DUF2770, Protein of unknown function (DUF2770). Members in this family of proteins are annotated as yceO however currently no function is known.¡€0€ª€0€ €CDD¡€ €Êœ¢€0€0€ €tpfam10969, DUF2771, Protein of unknown function (DUF2771). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê¢€0€0€ €‚pfam10970, GerPE, Spore germination protein GerPE. GerPE is required for the formation of functionally normal spores. It could be involved in the establishment of a normal spore coat structure and (or) permeability, which allows the access of germinants to their receptor.¡€0€ª€0€ €CDD¡€ €dŒ¢€0€0€ €™pfam10971, DUF2773, Protein of unknown function (DUF2773). This family of proteins with unknown function appears to be restricted to Enterobacteriaceae.¡€0€ª€0€ €CDD¡€ €d¢€0€0€ €‚Rpfam10972, CsiV, Peptidoglycan-binding protein, CsiV. CsiV, a small periplasmic protein (cell-shape integrity in Vibrio), is essential for growth of Vibrio cholerae in the presence of DAA, non-canonical amino-acids, the typical components of peptidoglycan side-chains in Vibrio cholerae. CsiV interacts with LpoA, the lipoprotein activator of penicillin-binding-protein1A that is necessary for mediating the assembly of peptidoglycan. CsiV acts through LpoA to promote peptidoglycan biogenesis in V. cholerae and other vibrio species as well as in the other genera where this protein is found.¡€0€ª€0€ €CDD¡€ €Êž¢€0€0€ €–pfam10973, DUF2799, Protein of unknown function (DUF2799). Some members in this family of proteins are annotated as yfiL which has no known function.¡€0€ª€0€ €CDD¡€ €ÊŸ¢€0€0€ €opfam10974, DUF2804, Protein of unknown function (DUF2804). This is a family of proteins with unknown function.¡€0€ª€0€ €CDD¡€ €Ê ¢€0€0€ €tpfam10975, DUF2802, Protein of unknown function (DUF2802). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê¡¢€0€0€ €–pfam10976, DUF2790, Protein of unknown function (DUF2790). This family of proteins with unknown function appear to be restricted to Pseudomonadaceae.¡€0€ª€0€ €CDD¡€ €Ê¢¢€0€0€ €jpfam10977, DUF2797, Protein of unknown function (DUF2797). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê£¢€0€0€ €Ôpfam10978, DUF2785, Protein of unknown function (DUF2785). Some members in this family are annotated as hypothetical membrane spanning proteins however this cannot be confirmed. The family has no known function.¡€0€ª€0€ €CDD¡€ €ʤ¢€0€0€ €jpfam10979, DUF2786, Protein of unknown function (DUF2786). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê¥¢€0€0€ €tpfam10980, DUF2787, Protein of unknown function (DUF2787). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ʦ¢€0€0€ €upfam10981, DUF2788, Protein of unknown function (DUF2788). This bacterial family of proteins have no known function.¡€0€ª€0€ €CDD¡€ €ʧ¢€0€0€ €tpfam10982, DUF2789, Protein of unknown function (DUF2789). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ʨ¢€0€0€ €ypfam10983, DUF2793, Protein of unknown function (DUF2793). This is a bacterial family of proteins with unknown function.¡€0€ª€0€ €CDD¡€ €Ê©¢€0€0€ €ypfam10984, DUF2794, Protein of unknown function (DUF2794). This is a bacterial family of proteins with unknown function.¡€0€ª€0€ €CDD¡€ €ʪ¢€0€0€ €ypfam10985, DUF2805, Protein of unknown function (DUF2805). This is a bacterial family of proteins with unknown function.¡€0€ª€0€ €CDD¡€ €Ê«¢€0€0€ €tpfam10986, DUF2796, Protein of unknown function (DUF2796). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ʬ¢€0€0€ €tpfam10987, DUF2806, Protein of unknown function (DUF2806). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê­¢€0€0€ €‚pfam10988, DUF2807, Putative auto-transporter adhesin, head GIN domain. This bacterial family of proteins shows structural similarity to other pectin lyase families. Although structures from this family align with acetyl-transferases, there is no conservation of catalytic residues found. It is likely that the function is one of cell-adhesion. In Structure 3jx8, it is interesting to note that the sequence of contains several well defined sequence repeats, centred around GSG motifs defining the tight beta turn between the two sheets of the super-helix; there are 8 such repeats in the C-terminal half of the protein, which could be grouped into 4 repeats of two. It seems likely that this family belongs to the superfamily of trimeric auto-transporter adhesins (TAAs), which are important virulence factors in Gram-negative pathogens. In the case of Parabacteroides distasonis, which is a component of the normal distal human gut microbiota, TAA-like complexes probably modulate adherence to the host (information derived from TOPSAN).¡€0€ª€0€ €CDD¡€ €Ê®¢€0€0€ €”pfam10989, DUF2808, Protein of unknown function (DUF2808). This family of proteins with unknown function appears to be restricted to Cyanobacteria.¡€0€ª€0€ €CDD¡€ €ʯ¢€0€0€ €±pfam10990, DUF2809, Protein of unknown function (DUF2809). Some members in this family of proteins are annotated as yjgA however currently no function for the protein is known.¡€0€ª€0€ €CDD¡€ €ʰ¢€0€0€ €}pfam10991, DUF2815, Protein of unknown function (DUF2815). This is a phage related family of proteins with unknown function.¡€0€ª€0€ €CDD¡€ €ʱ¢€0€0€ €upfam10992, DUF2816, Protein of unknown function (DUF2816). This eukaryotic family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ʲ¢€0€0€ €tpfam10993, DUF2818, Protein of unknown function (DUF2818). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ʳ¢€0€0€ €jpfam10994, DUF2817, Protein of unknown function (DUF2817). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê´¢€0€0€ €‚­pfam10995, CBP_GIL, GGDEF I-site like or GIL domain. The GIL domain, for GGDEF I-site like domain, is a c-di-GMP binding domain on the BcsE proteins of enterobacteria. It is not essentail for cellulose synthesis but is critical for maximal cellulose production. Cellulose production in enterobacteria is controlled by a two-tiered c-di-GMP-dependent system involving BcsE and the PilZ domain containing glycosyltransferase BcsA.¡€0€ª€0€ €CDD¡€ €ʵ¢€0€0€ €ëpfam10996, Beta-Casp, Beta-Casp domain. The beta-CASP domain is found C terminal to the beta-lactamase domain in pre-mRNA 3'-end-processing endonuclease. The active site of this enzyme is located at the interface of these two domains.¡€0€ª€0€ €CDD¡€ €ʶ¢€0€0€ €tpfam10997, DUF2837, Protein of unknown function (DUF2837). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê·¢€0€0€ €tpfam10998, DUF2838, Protein of unknown function (DUF2838). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ʸ¢€0€0€ €pfam10999, DUF2839, Protein of unknown function (DUF2839). This bacterial family of unknown function appear to be restricted to Cyanobacteria.¡€0€ª€0€ €CDD¡€ €ʹ¢€0€0€ €upfam11000, DUF2840, Protein of unknown function (DUF2840). This bacterial family of proteins have no known function.¡€0€ª€0€ €CDD¡€ €ʺ¢€0€0€ €ƒpfam11001, DUF2841, Protein of unknown function (DUF2841). This family of proteins with unknown function are all present in yeast.¡€0€ª€0€ €CDD¡€ €Ê»¢€0€0€ €‚’pfam11002, RDM, RFPL defining motif (RDM). The RDM domain is found on RFPL (Ret finger protein like) proteins. In humans, RFPL transcripts can be detected at the onset of neurogenesis in differentiating human embryonic stem cells, and in the developing human neocortex. The RDM domain is thought to have emerged from a neofunctionalisation event. It is found N terminal to the SPRY domain (pfam00622).¡€0€ª€0€ €CDD¡€ €ʼ¢€0€0€ €upfam11003, DUF2842, Protein of unknown function (DUF2842). This bacterial family of proteins have no known function.¡€0€ª€0€ €CDD¡€ €ʽ¢€0€0€ €‚wpfam11004, Kdo_hydroxy, 3-deoxy-D-manno-oct-2-ulosonic acid (Kdo) hydroxylase. This is a family of 3-deoxy-D-manno-oct-2-ulosonic acid 3-hydroxylases, which catalyze the conversion of 3-deoxy-D-manno-oct-2-ulosonic acid (Kdo) to D-glycero-D-talo-oct-2-ulosonic acid (Ko). It contains a potential iron-binding motif, HXDX(n)H (n>40). Hydroxylation activity is iron-dependent.¡€0€ª€0€ €CDD¡€ €ʾ¢€0€0€ €tpfam11005, DUF2844, Protein of unknown function (DUF2844). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê¿¢€0€0€ €tpfam11006, DUF2845, Protein of unknown function (DUF2845). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ÊÀ¢€0€0€ €‚pfam11007, CotJA, Spore coat associated protein JA (CotJA). CotJA is part of the CotJ operon which contains CotJA and CotJC. The operon encodes spore coat proteins. Interaction of CotJA with CotJC is required for the assembly of both CotJA and CotJC into the spore coat.¡€0€ª€0€ €CDD¡€ €ÊÁ¢€0€0€ €¹pfam11008, DUF2846, Protein of unknown function (DUF2846). Some members in this family of proteins with unknown function are annotated as lipoproteins however this cannot be confirmed.¡€0€ª€0€ €CDD¡€ €Ê¢€0€0€ €×pfam11009, DUF2847, Protein of unknown function (DUF2847). Some members in this bacterial family of proteins with unknown function are annotated as YtxJ, a putative general stress protein. This cannot be confirmed.¡€0€ª€0€ €CDD¡€ €Êâ€0€0€ €tpfam11010, DUF2848, Protein of unknown function (DUF2848). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ÊÄ¢€0€0€ €tpfam11011, DUF2849, Protein of unknown function (DUF2849). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ÊÅ¢€0€0€ €’pfam11012, DUF2850, Protein of unknown function (DUF2850). This family of proteins with unknown function appear to be restricted to Vibrionaceae.¡€0€ª€0€ €CDD¡€ €ÊÆ¢€0€0€ €tpfam11013, DUF2851, Protein of unknown function (DUF2851). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ÊÇ¢€0€0€ €tpfam11014, DUF2852, Protein of unknown function (DUF2852). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ÊÈ¢€0€0€ €tpfam11015, DUF2853, Protein of unknown function (DUF2853). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ÊÉ¢€0€0€ €jpfam11016, DUF2854, Protein of unknown function (DUF2854). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ÊÊ¢€0€0€ €jpfam11017, DUF2855, Protein of unknown function (DUF2855). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ÊË¢€0€0€ €‚pfam11018, Cuticle_3, Pupal cuticle protein C1. Insect cuticles are composite structures whose mechanical properties are optimized for biological function. The major components are the chitin filament system and the cuticular proteins, and the cuticle's properties are determined largely by the interactions between these two sets of molecules. The proteins can be ordered by species.¡€0€ª€0€ €CDD¡€ €ÊÌ¢€0€0€ €|pfam11019, DUF2608, Protein of unknown function (DUF2608). This family is conserved in Bacteria. The function is not known.¡€0€ª€0€ €CDD¡€ €ÊÍ¢€0€0€ €‚pfam11020, DUF2610, Domain of unknown function (DUF2610). This family is conserved in Proteobacteria. One member is annotated as being elongation factor P but this could not be confirmed. This domain is related to the Ribbon-helix-helix superfamily so may be a DNA-binding protein.¡€0€ª€0€ €CDD¡€ €Ê΢€0€0€ €¨pfam11021, DUF2613, Protein of unknown function (DUF2613). This is a family of putative small secreted proteins expressed by Actinobacteria. The function is not known.¡€0€ª€0€ €CDD¡€ €ÊÏ¢€0€0€ €ˆpfam11022, DUF2611, Protein of unknown function (DUF2611). This family is conserved in the Dikarya of Fungi. The function is not known.¡€0€ª€0€ €CDD¡€ €ÊТ€0€0€ €Âpfam11023, DUF2614, Zinc-ribbon containing domain. This is a family of proteins conserved in the Bacillaceae family. Some members are annotated as being protein YgzB. The function is not known.¡€0€ª€0€ €CDD¡€ €ÊÑ¢€0€0€ €‚²pfam11024, DGF-1_4, Dispersed gene family protein 1 of Trypanosoma cruzi region 4. This protein is likely to be highly expressed, and is expressed from the sub-telomeric region. However, the function is not known. Other domains on this protein include DGF-1_N, DGF-1_2, and DGF-1_5. This domain is just downstream from the C-terminus, but not the C-terminus of proteins, also annotated as being DGF-1, that constitute family DGF-1_C.¡€0€ª€0€ €CDD¡€ €ÊÒ¢€0€0€ €ªpfam11025, GP40, Glycoprotein GP40 of Cryptosporidium. This family is highly conserved in Cryptosporidium spp. Many members are annotated as being a 60 kDa glycoprotein.¡€0€ª€0€ €CDD¡€ €ÊÓ¢€0€0€ €|pfam11026, DUF2721, Protein of unknown function (DUF2721). This family is conserved in bacteria. The function is not known.¡€0€ª€0€ €CDD¡€ €ÊÔ¢€0€0€ €èpfam11027, DUF2615, Protein of unknown function (DUF2615). This small. approximately 100 residue, family is conserved from worms to humans. It is cysteine-rich with a characteristic FDxCEC sequence motif. The function is not known.¡€0€ª€0€ €CDD¡€ €ÊÕ¢€0€0€ €|pfam11028, DUF2723, Protein of unknown function (DUF2723). This family is conserved in bacteria. The function is not known.¡€0€ª€0€ €CDD¡€ €ÊÖ¢€0€0€ €‚‹pfam11029, DAZAP2, DAZ associated protein 2 (DAZAP2). DAZ associated protein 2 has a highly conserved sequence throughout evolution including a conserved polyproline region and several SH2/SH3 binding sites. It occurs as a single copy gene with a four-exon organisation and is located on chromosome 12. It encodes a ubiquitously expressed protein and binds to DAZ and DAZL1 through DAZ repeats.¡€0€ª€0€ €CDD¡€ €Ê×¢€0€0€ €‚€pfam11030, Nucleocapsid-N, Nucleocapsid protein N. This is the N protein of the nucleocapsid. The nucleocapsid functions to protect the RNA against nuclease degradation and to promote it's reverse transcription. The NC protein promotes viral RNA dimerisation and encapsidation and initiates reverse transcription by activating the annealing of the primer tRNA to the initiation site.¡€0€ª€0€ €CDD¡€ €dÈ¢€0€0€ €‚Rpfam11031, Phage_holin_T, Bacteriophage T holin. Bacteriophage effects host lysis with T holin along with an endolysin. T disrupts the membrane allowing sequential events which lead to the attack of the peptidoglycan. T has an usual periplasmic domain which transduces environmental information for the real-time control of lysis timing.¡€0€ª€0€ €CDD¡€ €dÉ¢€0€0€ €‚£pfam11032, ApoM, ApoM domain. ApoM is a 25 kDa plasma protein associated with high-density lipoproteins (HDLs). ApoM is important in the formation of pre-ss-HDL and also in increasing cholesterol efflux from macrophage foam cells. Lipoproteins consist of lipids solubilized by apolipoproteins. ApoM lacks an external amphipathic motif and is uniquely secreted to plasma without cleavage of its terminal signal peptide.¡€0€ª€0€ €CDD¡€ €ÊØ¢€0€0€ €Upfam11033, ComJ, Competence protein J (ComJ). ComJ is a competence specific protein.¡€0€ª€0€ €CDD¡€ €ÊÙ¢€0€0€ €†pfam11034, Grg1, Glucose-repressible protein Grg1. This fungal protein increases during glucose deprivation. Its function is unknown.¡€0€ª€0€ €CDD¡€ €ÊÚ¢€0€0€ €‚&pfam11035, SnAPC_2_like, Small nuclear RNA activating complex subunit 2, SNAP190 Myb. This family of proteins is snRNA-activating protein complex subunit 2 (SnAPC subunit 2). SnAPC complex allows the transcription of human small nuclear RNA genes to occur by recognition of the proximal sequence element, the TATA box. The family functions both to specifically recognize the proximal sequence element present in the core promoters of human snRNA genes and to stimulate TBP recognition of the neighboring TATA box present in human U6 snRNA promoters.¡€0€ª€0€ €CDD¡€ €ÊÛ¢€0€0€ €×pfam11036, YqgB, Virulence promoting factor. YqgB encodes adaptive factors that acts in synergy with vqfZ, enabling the bacteria to cope with the physical environment in vivo, facilitating colonisation of the host.¡€0€ª€0€ €CDD¡€ €O»¢€0€0€ €‚Kpfam11037, Musclin, Insulin-resistance promoting peptide in skeletal muscle. Musclin is a muscle derived secretory peptide which induces insulin resistance in vitro. It encodes a 130 amino acid sequence including a NH(2) terminal 30 amino acid signal sequence. Musclin expression level is tightly regulated by nutritional changes.¡€0€ª€0€ €CDD¡€ €ÊÜ¢€0€0€ €‚²pfam11038, DGF-1_5, Dispersed gene family protein 1 of Trypanosoma cruzi region 5. This protein is likely to be highly expressed, and is expressed from the sub-telomeric region. However, the function is not known. Other domains on this protein include DGF-1_N, DGF-1_2, and DGF-1_4. This domain is just downstream from the C-terminus, but not the C-terminus of proteins, also annotated as being DGF-1, that constitute family DGF-1_C.¡€0€ª€0€ €CDD¡€ €dÏ¢€0€0€ €Ûpfam11039, DUF2824, Protein of unknown function (DUF2824). This family of proteins has no known function. Some members in the family are annotated as the P22 head assembly protein gp14 however this cannot be confirmed.¡€0€ª€0€ €CDD¡€ €ÊÝ¢€0€0€ €‚ pfam11040, DGF-1_C, Dispersed gene family protein 1 of Trypanosoma cruzi C-terminus. This protein is likely to be highly expressed, and is expressed from the sub-telomeric region. However, the function is not known. This is the very C-terminal part of the protein.¡€0€ª€0€ €CDD¡€ €dÑ¢€0€0€ €§pfam11041, DUF2612, Protein of unknown function (DUF2612). This is a phage protein family expressed from a range of Proteobacteria species. The function is not known.¡€0€ª€0€ €CDD¡€ €ÊÞ¢€0€0€ €‚pfam11042, DUF2750, Protein of unknown function (DUF2750). This family is conserved in Proteobacteria. The function is not known.¡€0€ª€0€ €CDD¡€ €Êߢ€0€0€ €·pfam11043, DUF2856, Protein of unknown function (DUF2856). Some members in this viral family of proteins with unknown function are annotated as Abc2 however this cannot be confirmed.¡€0€ª€0€ €CDD¡€ €dÔ¢€0€0€ €‚(pfam11044, TMEMspv1-c74-12, Plectrovirus spv1-c74 ORF 12 transmembrane protein. This is a family of proteins expressed by Plectroviruses. The plectroviruses are single-stranded DNA viruses belonging to the Inoviridae. Except that it is a putative transmembrane protein the function is not known.¡€0€ª€0€ €CDD¡€ €dÕ¢€0€0€ €Ðpfam11045, YbjM, Putative inner membrane protein of Enterobacteriaceae. This family is conserved in the Enterobacteriaceae. It is a putative inner membrane protein, named YbjM, but the function is not known.¡€0€ª€0€ €CDD¡€ €Êࢀ0€0€ €‚åpfam11046, HycA_repressor, Transcriptional repressor of hyc and hyp operons. This family is conserved in Proteobacteria. It is likely to be the transcriptional repressor molecule for the hyc and hyp operons, which express, amongst others, the protein HycA. This protein may be harnessed for the reduction of technetium oxide, an unwelcome product of radio-nucleotide bioaccumulation. HycA produces formate hydrogenlyase, one of the key proteins necessary for metal compound reduction.¡€0€ª€0€ €CDD¡€ €Êᢀ0€0€ €œpfam11047, SopD, Salmonella outer protein D. SopD is a type III virulence effector protein whose structure consists of 38% alpha-helix and 26% beta-strand.¡€0€ª€0€ €CDD¡€ €dØ¢€0€0€ €‚npfam11049, KSHV_K1, Glycoprotein K1 of Kaposi's sarcoma-associated herpes virus. This is a highly glycosylated cytoplasmic and membrane protein similar to the immunoglobulin receptor family that is expressed as an inducible early-lytic-cycle gene product in primary effusion lymphoma cell-lines. This domain would appear to be the cytoplasmic region of the protein.¡€0€ª€0€ €CDD¡€ €dÙ¢€0€0€ €Øpfam11050, Viral_env_E26, Virus envelope protein E26. E26 is a multifunctional protein. One form of E26 associates with viral DNA or DNA binding proteins, while a second form associates with intracellular membranes.¡€0€ª€0€ €CDD¡€ €dÚ¢€0€0€ €Ãpfam11051, Mannosyl_trans3, Mannosyltransferase putative. This family is conserved in fungi. Several members are annotated as being alpha-1,3-mannosyltransferase but this could not be confirmed.¡€0€ª€0€ €CDD¡€ €Ê⢀0€0€ €‚Ipfam11052, Tr-sialidase_C, Trans-sialidase of Trypanosoma hydrophobic C-terminal. This is a highly conserved sequence motif that is the very C-terminus of a number of more diverse proteins from Trypanosoma cruzi. All members of the family are annotated putatively as being trans-sialidase but this appears to be a diverse group.¡€0€ª€0€ €CDD¡€ €Ê㢀0€0€ €‚Gpfam11053, DNA_Packaging, Terminase DNA packaging enzyme. Phage T4 terminase functions in packaging concatemeric DNA. The T4 terminase is composed of a large subunit, gp17 ad a small subunit, gp16. The role of gp16 is not well characterized however it is known that it binds to double-stranded DNA but not single stranded DNA.¡€0€ª€0€ €CDD¡€ €dÝ¢€0€0€ €‚5pfam11054, Surface_antigen, Sporozoite TA4 surface antigen. This family of proteins is a Eukaryotic family of surface antigens. One of the better characterized members of the family is the sporulated TA4 antigen. The TA4 gene encodes a single polypeptide of 25 kDa which contains a 17 and a 8 kD polypeptide.¡€0€ª€0€ €CDD¡€ €dÞ¢€0€0€ €‘pfam11055, Gsf2, Glucose signalling factor 2. Gsf2 is localized to the ER and functions to promote the secretion of certain hexose transporters.¡€0€ª€0€ €CDD¡€ €Ê䢀0€0€ €‚Opfam11056, UvsY, Recombination, repair and ssDNA binding protein UvsY. UvsY protein enhances the rate of single-stranded-DNA-dependant ATP hydrolysis by UvsX protein. The enhancement of ATP hydrolysis by UvsY protein is shown to result from the ability of UvsY protein to increase the affinity of UvsX protein for single-stranded DNA.¡€0€ª€0€ €CDD¡€ €Ê墀0€0€ €‚×pfam11057, Cortexin, Cortexin of kidney. In the middle of cortexin protein there is a single membrane-spanning domain which indicates that this protein may be a membrane protein involved in intracellular or extracellular signalling of the kidney or brain, since it is expressed specifically in the kidneys and brain only. The protein is highly conserved among species. Cortexin is also thought to be important to neurons of both the developing and adult cerebral cortex.¡€0€ª€0€ €CDD¡€ €Ê梀0€0€ €•pfam11058, Ral, Antirestriction protein Ral. Ral alleviates restriction and enhances modification by the E.Coli restriction and modification system.¡€0€ª€0€ €CDD¡€ €d⢀0€0€ €tpfam11059, DUF2860, Protein of unknown function (DUF2860). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê碀0€0€ €tpfam11060, DUF2861, Protein of unknown function (DUF2861). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê袀0€0€ €jpfam11061, DUF2862, Protein of unknown function (DUF2862). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê颀0€0€ €upfam11062, DUF2863, Protein of unknown function (DUF2863). This bacterial family of proteins have no known function.¡€0€ª€0€ €CDD¡€ €Êꢀ0€0€ €tpfam11064, DUF2865, Protein of unknown function (DUF2865). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ê뢀0€0€ €upfam11065, DUF2866, Protein of unknown function (DUF2866). This bacterial family of proteins have no known function.¡€0€ª€0€ €CDD¡€ €Ê좀0€0€ €upfam11066, DUF2867, Protein of unknown function (DUF2867). This bacterial family of proteins have no known function.¡€0€ª€0€ €CDD¡€ €Êí¢€0€0€ €Épfam11067, DUF2868, Protein of unknown function (DUF2868). Some members in this family of proteins with unknown function are annotated as putative membrane proteins. However, this cannot be confirmed.¡€0€ª€0€ €CDD¡€ €Ê0€0€ €‚6pfam11068, YlqD, YlqD protein. The structure of a representative of this family has been solved (Structure 4dci) and found to form a tetrameric structure of prefoldin-like architecture with the beta-barrel core and helical coiled coil tentacles. This suggests that this family may act as molecular chaperones.¡€0€ª€0€ €CDD¡€ €Ê0€0€ €zpfam11069, DUF2870, Protein of unknown function (DUF2870). This is a eukaryotic family of proteins with unknown function.¡€0€ª€0€ €CDD¡€ €Êð¢€0€0€ €jpfam11070, DUF2871, Protein of unknown function (DUF2871). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Êñ¢€0€0€ €Hpfam11071, Nuc_deoxyri_tr3, Nucleoside 2-deoxyribosyltransferase YtoQ. ¡€0€ª€0€ €CDD¡€ €Êò¢€0€0€ €spfam11072, DUF2859, Protein of unknown function (DUF2859). This is a bacterial family of uncharacterized proteins.¡€0€ª€0€ €CDD¡€ €Êó¢€0€0€ €ëpfam11073, NSs, Rift valley fever virus non structural protein (NSs) like. This family contains several Phlebovirus non structural proteins which act as a major determinant of virulence by antagonising interferon beta gene expression.¡€0€ª€0€ €CDD¡€ €Êô¢€0€0€ €zpfam11074, DUF2779, Domain of unknown function(DUF2779). This domain is conserved in bacteria. The function is not known.¡€0€ª€0€ €CDD¡€ €Êõ¢€0€0€ €}pfam11075, DUF2780, Protein of unknown function VcgC/VcgE (DUF2780). This is a bacterial family of uncharacterized proteins.¡€0€ª€0€ €CDD¡€ €Êö¢€0€0€ €Âpfam11076, YbhQ, Putative inner membrane protein YbhQ. This family is conserved in Proteobacteria. The function is not known but most members are annotated as being inner membrane protein YbhQ.¡€0€ª€0€ €CDD¡€ €Ê÷¢€0€0€ €ápfam11077, DUF2616, Protein of unknown function (DUF2616). This cysteine-rich family is expressed by the double-stranded Nucleopolyhedrovirus, a member of the Baculoviridae family of dsDNA viruses. The function is not known.¡€0€ª€0€ €CDD¡€ €Êø¢€0€0€ €‚;pfam11078, Optomotor-blind, Optomotor-blind protein N-terminal region. This family is conserved in Drosophila spp. Optomotor-blind is one of the essential toolkit proteins for coordinating development in diverse animal taxa, and in Drosophila it plays a key role in establishing the abdominal pigmentation pattern, in development of the central nervous system and leg and wing imaginal disc-formation of Drosophila melanogaster. This is the N-terminal region of the protein and does not include the T-box-containing transcription factor that plays a part in DNA-binding.¡€0€ª€0€ €CDD¡€ €Êù¢€0€0€ €³pfam11079, YqhG, Bacterial protein YqhG of unknown function. This family of putative proteins is conserved in the Bacillaceae family of the Firmicutes. The function is not known.¡€0€ª€0€ €CDD¡€ €Êú¢€0€0€ €´pfam11080, GhoS, Endoribonuclease GhoS. GhoS is part of the GhoT-GhoS type V toxin-antitoxin (TA) system. GhoT is inhibited by antitoxin GhoS, which specifically cleaves its mRNA.¡€0€ª€0€ €CDD¡€ €Êû¢€0€0€ €•pfam11081, DUF2890, Protein of unknown function (DUF2890). This family is conserved in dsDNA adenoviruses of vertebrates. The function is not known.¡€0€ª€0€ €CDD¡€ €Êü¢€0€0€ €tpfam11082, DUF2880, Protein of unknown function (DUF2880). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €dø¢€0€0€ €‚Ôpfam11083, Streptin-Immun, Lantibiotic streptin immunity protein. Streptococcal species produce a lantibiotic, streptin, in a similar manner to the production of nisin and subtilin by other lactic acid bacteria, in order to compete against competing bacteria within the environment. The immunity protein protects the bacterium from destruction by its own lantibiotic. In general, there is little homology between the immunity proteins of different genera of bacteria.¡€0€ª€0€ €CDD¡€ €Êý¢€0€0€ €­pfam11084, DUF2621, Protein of unknown function (DUF2621). This family is conserved in the Bacillaceae family. Several members are named as YneK. The function is not known.¡€0€ª€0€ €CDD¡€ €Êþ¢€0€0€ €“pfam11085, YqhR, Conserved membrane protein YqhR. This family is conserved in the Bacillaceae family of the Firmicutes. The function is not known.¡€0€ª€0€ €CDD¡€ €Êÿ¢€0€0€ €ópfam11086, DUF2878, Protein of unknown function (DUF2878). This bacterial family of proteins has no known function. Some members annotate the proteins as the permease component of a Mn2+/Zn2+ transport system however this cannot be confirmed.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €µpfam11087, PRD1_DD, PRD1 phage membrane DNA delivery. This small family of phage proteins are bound in the viral membrane and assist, along with P11 and P18 in the delivery of DNA.¡€0€ª€0€ €CDD¡€ €O좀0€0€ €‚!pfam11088, RL11D, Glycoprotein encoding membrane proteins RL5A and RL6. RL5A and RL6 are part of the RL11 family which are predicted to encode membrane glycoproteins. Two adjacent open reading frames potentially encode a domain that is the hallmark of proteins encoded by the RL11 family.¡€0€ª€0€ €CDD¡€ €dý¢€0€0€ €‚tpfam11089, SyrA, Exopolysaccharide production repressor. SyrA is a small protein located in the cytoplasmic membrane that lacks an apparent DNA binding domain. SyrA mediates the transcriptional up-regulation of exo genes involved in the biosynthesis of the symbiotic exopolysaccharide succinoglycan. It does this through a mechanism which requires a two component system.¡€0€ª€0€ €CDD¡€ €dþ¢€0€0€ €Ñpfam11090, DUF2833, Protein of unknown function (DUF2833). This family of proteins with unknown function are found in the bacteriophage T7. Some of the members of this family are annotated as gene 13 protein.¡€0€ª€0€ €CDD¡€ €O0€0€ €Äpfam11091, T4_tail_cap, Tail-tube assembly protein. This tail tube protein is also referred to as Gp48. It is required for the assembly and length regulation of the tail tube of bacteriophage T4.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €‚,pfam11092, Alveol-reg_P311, Neuronal protein 3.1 (p311). P311 has several PEST-like motifs and is found in neuron and muscle cells. P311 could have some function in myo-fibroblast transformation and prevention of fibrosis. It has also been identified as a potential regulator of alveolar generation.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €‚úpfam11093, Mitochondr_Som1, Mitochondrial export protein Som1. Som1 is a component of the mitochondrial protein export system. The various Som1 proteins exhibit a highly conserved region and a pattern of cysteine residues. Stabilisation of Som1 occurs through an interaction between Som1 and Imp1, a peptidase required for proteolytic processing of certain proteins during their transport across the mitochondrial membrane. This suggests that Som1 represents a third subunit of the Imp1 peptidase complex.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €‚pfam11094, UL11, Membrane-associated tegument protein. The UL11 gene product of herpes simplex virus is a membrane-associated tegument protein that is incorporated into the HSV virion and functions in viral envelopment. UL11 is acylated which is crucial for lipid raft association.¡€0€ª€0€ €CDD¡€ €e¢€0€0€ €‚8pfam11095, Gemin7, Gem-associated protein 7 (Gemin7). Gemin7 is a novel component of the survival of motor neuron complex which functions in the assembly of spliceosomal small nuclear ribonucleoproteins. Gemin7 interacts with several Sm proteins of spliceosomal small nuclear ribonucleoproteins, especially SmE.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €pfam11097, DUF2883, Protein of unknown function (DUF2883). This family of proteins have no known function but appear to be restricted to phage.¡€0€ª€0€ €CDD¡€ €e¢€0€0€ €‚[pfam11098, Chlorosome_CsmC, Chlorosome envelope protein C. Chlorosomes are light-harvesting antennae found in green bacteria. CsmC is one of the proteins that exists in the chlorosome envelope. CsmC has been shown to exist as a homomultimer with CsmD in the chlorosome envelope. CsmC is thought to be important in chlorosome elongation and shape.¡€0€ª€0€ €CDD¡€ €e¢€0€0€ €‚Gpfam11099, M11L, Apoptosis regulator M11L like. Apoptosis regulators function to modulate the apoptotic cascades and thereby favour productive viral replication. M11L inhibits mitochondrial-dependant apoptosis by mimicking and competing with host proteins for the binding and blocking of Bak and Bax, two executioner proteins.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €¢pfam11100, TrbE, Conjugal transfer protein TrbE. TrbE is essential for conjugation and phage adsorption. It contains four common motifs and one conserved domain.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €ªpfam11101, DUF2884, Protein of unknown function (DUF2884). Some members in this bacterial family of proteins are annotated as YggN which currently has no known function.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €‚_pfam11102, YjbF, Group 4 capsule polysaccharide lipoprotein gfcB, YjbF. This family includes lipoprotein GfcB (YmcC), involved in group 4 capsule polysaccharide formation. YjbF is a family of Gram-negative bacterial outer-membrane lipoproteins, predicted to be a beta-barrel and possibly a porin that is one of four gene-products expressed from an operon, yjbEFGH, which is regulated by the Rcs phosphorelay in a RcsA-dependent manner, similar to that of other exopolysaccharide biosynthetic pathways. It is highly possible that the yjbEFGH operon encodes a system involved in EPS secretion since none of the products is predicted to have enzymic activity, the products are all secreted and YbjF and H are predicted to be beta-barrel lipoproteins similar to porins. It may be that the operon products play some role in biofilm formation and/or matrix production.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €ºpfam11103, DUF2887, Protein of unknown function (DUF2887). This bacterial family of proteins has no known function. These proteins may be distantly related to the PD(D/E)XK superfamily.¡€0€ª€0€ €CDD¡€ €Ë ¢€0€0€ €¬pfam11104, PilM_2, Type IV pilus assembly protein PilM;. The type IV pilus assembly protein PilM is required for competency and pilus biogenesis. It binds to PilN and ATP.¡€0€ª€0€ €CDD¡€ €Ë ¢€0€0€ €‚pfam11105, CCAP, Arthropod cardioacceleratory peptide 2a. CCAP exerts a reversible and dose-dependant cardio-stimulatory effect on the semi-isolated heart of experimental beetles. CCAP also increases free hemolymph sugar concentration in young larvae and adults of the meal-worm beetle.¡€0€ª€0€ €CDD¡€ €Ë ¢€0€0€ €‚Bpfam11106, YjbE, Exopolysaccharide production protein YjbE. YjbE is part of a four gene operon which is involved in exopolysaccharide production. The expression of YjbE is higher than the rest of the operon yjbEFGH. It appears to be restricted to Enterobacteriaceae. YbjE is one of four gene-products expressed from an operon, yjbEFGH, which is regulated by the Rcs phosphorelay in a RcsA-dependent manner, similar to that of other exopolysaccharide biosynthetic pathways. It is highly possible that the yjbEFGH operon encodes a system involved in EPS secretion since none of the products is predicted to have enzymic activity, the products are all secreted and YbjH and F are predicted to be beta-barrel lipoproteins similar to porins. It may be that the operon products play some role in biofilm formation and/or matrix production.¡€0€ª€0€ €CDD¡€ €Ë ¢€0€0€ €‚pfam11107, FANCF, Fanconi anemia group F protein (FANCF). FANCF regulates its own expression by methylation at both mRNA and protein levels. Methylation-induced inactivation of FANCF has an important role on the occurrence of ovarian cancers by disrupting the FA-BRCA pathway.¡€0€ª€0€ €CDD¡€ €Ë ¢€0€0€ €‚Ipfam11108, Phage_glycop_gL, Viral glycoprotein L. GL forms a complex with gH, a glycoprotein known to be essential for entry of HSV-1 into cells and virus-induced cell fusion. It is a hetero-oligomer of gH and gL which is incorporated into virions and transported to the cell surface which acts during entry of virus into cells.¡€0€ª€0€ €CDD¡€ €e¢€0€0€ €‚$pfam11109, RFamide_26RFa, Orexigenic neuropeptide Qrfp/P518. Qrfp/P518 has a direct role in maintaining bone mineral density. Qrfp has also found to be important in energy homeostasis by regulating appetite and energy expenditure in mice. The c-terminal 28 residues are the functional 26RFa.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €‚[pfam11110, Phage_hub_GP28, Baseplate hub distal subunit. These baseplate proteins are also referred to as Gp28. Gp28 is the structural component of the central part of the bacteriophage T4 baseplate, which possesses a hydrophobic region and is membrane bound. Gp28 forms a complex with gp27 which is another structural component of the baseplate.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €‚Ûpfam11111, CENP-M, Centromere protein M (CENP-M). The prime candidate for specifying centromere identity is the array of nucleosomes assembles with CENP-A. CENP-A recruits a nucleosome associated complex (NAC) comprised of CENP-M along with two other proteins. Assembly of the CENP-A NAC at centromeres is partly dependant on CENP-M. The CENP-A NAC is essential, as disruption of the complex causes errors of chromosome alignment and segregation that preclude cell survival.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €‚pfam11112, PyocinActivator, Pyocin activator protein PrtN. PrtN is a transcriptional activator for pyocin synthesis genes. It activates the expression of various pyocin genes by interaction with the DNA sequences conserved in the 5' noncoding regions of the pyocin genes.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €‚pfam11113, Phage_head_chap, Head assembly gene product. This head assembly protein is also refereed to as gene product 40 (Gp40). A specific gp20-gp40 membrane insertion structure constitutes the T4 prohead assembly initiation complex. This protein in T4 stimulates head formation.¡€0€ª€0€ €CDD¡€ €e¢€0€0€ €‚pfam11114, Minor_capsid_2, Minor capsid protein. Most of the members of this family are annotated as being minor capsid proteins. The genomes carrying the genes usually have three similar proteins adjacent to each other, hence this one being named as No.2.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €´pfam11115, DUF2623, Protein of unknown function (DUF2623). This family is conserved in the Enterobacteriaceae family. Several members are named as YghW. The function is not known.¡€0€ª€0€ €CDD¡€ €e¢€0€0€ €­pfam11116, DUF2624, Protein of unknown function (DUF2624). This family is conserved in the Bacillaceae family. Several members are named as YqfT. The function is not known.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €­pfam11117, DUF2626, Protein of unknown function (DUF2626). This family is conserved in the Bacillaceae family. Several members are named as YqgY. The function is not known.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €­pfam11118, DUF2627, Protein of unknown function (DUF2627). This family is conserved in the Bacillaceae family. Several members are named as YqzF. The function is not known.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €µpfam11119, DUF2633, Protein of unknown function (DUF2633). This family is conserved largely in the Bacillaceae family. Several members are named as YfgG. The function is not known.¡€0€ª€0€ €CDD¡€ €e¢€0€0€ €‚:pfam11120, CBP_BcsF, Cellulose biosynthesis protein BcsF. CBP_BcsF is a family of bacterial cellulose biosynthesis proteins. Cellulose is necessary for biofilm formation in bacteria. (Roemling U. and Galperin M.Y. "Bacterial cellulose biosynthesis. Diversity of operons and subunits" (manuscript in preparation)).¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €·pfam11121, DUF2639, Protein of unknown function (DUF2639). This family is conserved in the Bacillaceae family. Several members are named as being YflJ, but the function is not known.¡€0€ª€0€ €CDD¡€ €e¢€0€0€ €‚pfam11122, Spore-coat_CotD, Inner spore coat protein D. This family is conserved in the Enterobacteriaceae family. CotD is an inner spore coat protein that is expressed in the middle phase of mother cell gene expression. Along with CotD, CotH, CotS and CotT it is assumed to assemble into the loose skeleton of the matrix, between the shells of SpoIVA and CotE. Coat proteins do not share much sequence similarity between species, but this does not imply they do not share secondary, tertiary, or quaternary features.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €Îpfam11123, DNA_Packaging_2, DNA packaging protein. This DNA packaging protein is also referred to as gene 18 product (gp18). This protein is required for DNA packaging and functions in a complex with gp19.¡€0€ª€0€ €CDD¡€ €e¢€0€0€ €‚ýpfam11124, Pho86, Inorganic phosphate transporter Pho86. Pho86p is an ER protein which is produced in response to phosphate starvation. It is essential for growth when phosphate levels are limiting. Pho86p is also involved in the regulation of Pho84p, a high-affinity phosphate transporter which is localized to the endoplasmic reticulum (ER) in low phosphate medium. When the level of phosphate increases Pho84p is transported to the vacuole. Pho86p is required for packaging of Pho84p in to COPII vesicles.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €pfam11125, DUF2830, Protein of unknown function (DUF2830). Several members in this viral family of proteins are annotated as lysis proteins.¡€0€ª€0€ €CDD¡€ €e¢€0€0€ €‚“pfam11126, Phage_DsbA, Transcriptional regulator DsbA. DsbA is a double stranded binding protein found in bacteriophage T4 which is involved in transcriptional regulation. DsbA, along with other viral proteins, interacts with the host RNA polymerase core enzyme enabling initiation of transcription. DsbA acts as an enhancer protein of late genes in vitro. The protein consists of mainly alpha helices.¡€0€ª€0€ €CDD¡€ €e¢€0€0€ €|pfam11127, DUF2892, Protein of unknown function (DUF2892). This family is conserved in bacteria. The function is not known.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €Ÿpfam11128, Nucleocap_ssRNA, Plant viral coat protein nucleocapsid. This family of nucleocapsid proteins is from ssRNA negative-strand viruses of plant origin.¡€0€ª€0€ €CDD¡€ €e!¢€0€0€ €‚Çpfam11129, EIAV_Rev, Rev protein of equine infectious anaemia virus. The sequence of this family is highly conserved and carries a nuclear export signal from residues 31-55, and RNA binding/nuclear localization signals of RRDR at residue 76 and KRRRK at residue 159. Rev is an essential regulatory protein required for nucleocytoplasmic transport of incompletely spliced viral mRNAs that encode structural proteins. Rev has been shown to down-regulate the expression of viral late genes and alter sensitivity to Gag-specific cytotoxic-T-lymphocytes (CTL). Equine infectious anaemia virus (EIAV) exhibits a high rate of genetic variation in vivo, and results in a clinically variable disease in infected horses.¡€0€ª€0€ €CDD¡€ €P¢€0€0€ €‚ÿpfam11130, TraC_F_IV, F pilus assembly Type-IV secretion system for plasmid transfer. This family of TraC proteins is conserved in Proteobacteria. TraC is a cytoplasmic, peripheral membrane protein and is one of the proteins encoded by the F transfer region of the conjugative plasmid that is required for the assembly of F pilin into the mature F pilus structure. F pili are filamentous appendages that help establish the physical contact between donor and recipient cells involved in the conjugation process.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €‚Epfam11131, PhrC_PhrF, Rap-phr extracellular signalling. PhrC and PhrF stimulate ComA-dependent gene expression to different levels and are both required for full expression of genes activated by ComA, which activates the expression of genes involved in competence development and the production of several secreted products.¡€0€ª€0€ €CDD¡€ €e#¢€0€0€ €¹pfam11132, SplA, Transcriptional regulator protein (SplA). The SplA protein functions in trans as a negative regulator of the level of splB-lacZ expression in the developing forespore.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €°pfam11133, Phage_head_fibr, Head fiber protein. This head fiber protein is also refereed to as Gp8.5. Gp8.5 is a structural protein in phage. It is a dispensable head protein.¡€0€ª€0€ €CDD¡€ €é4¢€0€0€ €´pfam11134, Phage_stabilize, Phage stabilisation protein. Members of this family are phage proteins that are probably involved with stabilizing the condensed DNA within the capsid.¡€0€ª€0€ €CDD¡€ €e%¢€0€0€ €Ëpfam11135, DUF2888, Protein of unknown function (DUF2888). Some members in this family of proteins with unknown function are annotated as immediate early protein ICP-18 however this cannot be confirmed.¡€0€ª€0€ €CDD¡€ €e&¢€0€0€ €tpfam11136, DUF2889, Protein of unknown function (DUF2889). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €‰pfam11137, DUF2909, Protein of unknown function (DUF2909). This is a family of proteins conserved in Proteobacteria of unknown function.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €tpfam11138, DUF2911, Protein of unknown function (DUF2911). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €‚špfam11139, SfLAP, Sap, sulfolipid-1-addressing protein. SAP is a transmembrane transport protein with six predicted transmembrane helices, with a hydrophilic domain between helices 3 and 4. This hyrodphobic region is highly variable among identified Gap-like (GPL, peptidoglycolipid, addressing protein) proteins and may be involved in substrate recognition. SAP also belongs to the LysE protein superfamily (pfam01810), whose members have been implicated in small molecule transport in bacteria. Other Gap proteins export metabolites across the cell membrane so it is possible that Sap specifically may be involved in transport of sulfolipid-1 across the membrane.¡€0€ª€0€ €CDD¡€ €Ë¢€0€0€ €™pfam11140, DUF2913, Protein of unknown function (DUF2913). This family of proteins with unknown function appear to be restricted to Gammaproteobacteria.¡€0€ª€0€ €CDD¡€ €Ë ¢€0€0€ €tpfam11141, DUF2914, Protein of unknown function (DUF2914). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ë!¢€0€0€ €‰pfam11142, DUF2917, Protein of unknown function (DUF2917). This bacterial family of proteins appears to be restricted to Proteobacteria.¡€0€ª€0€ €CDD¡€ €Ë"¢€0€0€ €¹pfam11143, DUF2919, Protein of unknown function (DUF2919). This bacterial family of proteins has no known function. Some members are annotated as YfeZ however this cannot be confirmed.¡€0€ª€0€ €CDD¡€ €Ë#¢€0€0€ €tpfam11144, DUF2920, Protein of unknown function (DUF2920). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ë$¢€0€0€ €upfam11145, DUF2921, Protein of unknown function (DUF2921). This eukaryotic family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ë%¢€0€0€ €pfam11146, DUF2905, Protein of unknown function (DUF2905). This is a family of bacterial proteins conserved of unknown function.¡€0€ª€0€ €CDD¡€ €Ë&¢€0€0€ €tpfam11148, DUF2922, Protein of unknown function (DUF2922). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ë'¢€0€0€ €tpfam11149, DUF2924, Protein of unknown function (DUF2924). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ë(¢€0€0€ €Îpfam11150, DUF2927, Protein of unknown function (DUF2927). This family is conserved in Proteobacteria. Several members are described as being putative lipoproteins, but otherwise the function is not known.¡€0€ª€0€ €CDD¡€ €Ë)¢€0€0€ €‘pfam11151, DUF2929, Protein of unknown function (DUF2929). This family of proteins with unknown function appears to be restricted to Firmicutes.¡€0€ª€0€ €CDD¡€ €Ë*¢€0€0€ €‚™pfam11152, CCB2_CCB4, Cofactor assembly of complex C subunit B, CCB2/CCB4. Cofactor maturation pathways such as the CCB system (system IV) for cytochrome c-heme attachment are conserved in all organisms performing oxygenic photosynthesis. The CCB system consists of four proteins: CCB1-4. CCB2 and CCB4 are paralogues derived from a unique cyanobacterial ancestor. Orthologues are conserved in higher plants.¡€0€ª€0€ €CDD¡€ €Ë+¢€0€0€ €Êpfam11153, DUF2931, Protein of unknown function (DUF2931). Some members in this family of proteins are annotated as lipoproteins however this cannot be confirmed. Currently, there is no known function.¡€0€ª€0€ €CDD¡€ €Ë,¢€0€0€ €tpfam11154, DUF2934, Protein of unknown function (DUF2934). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ë-¢€0€0€ €‚>pfam11155, DUF2935, Domain of unknown function (DUF2935). This family of proteins with unknown function appears to be restricted to Firmicutes. The structure of this protein has been solved and each domain is composed of four alpha helices. A metal cluster composed of iron and magnesium lies between the two domains.¡€0€ª€0€ €CDD¡€ €Ë.¢€0€0€ €•pfam11157, DUF2937, Protein of unknown function (DUF2937). This family of proteins with unknown function appears to be restricted to Proteobacteria.¡€0€ª€0€ €CDD¡€ €Ë/¢€0€0€ €Çpfam11158, DUF2938, Protein of unknown function (DUF2938). This bacterial family of proteins has no known function. Some members are thought to be membrane proteins however this cannot be confirmed.¡€0€ª€0€ €CDD¡€ €Ë0¢€0€0€ €tpfam11159, DUF2939, Protein of unknown function (DUF2939). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ë1¢€0€0€ €jpfam11160, DUF2945, Protein of unknown function (DUF2945). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ë2¢€0€0€ €”pfam11161, DUF2944, Protein of unknown function (DUF2946). This family of proteins with unknown function appear to be restricted to Proteobacteria.¡€0€ª€0€ €CDD¡€ €Ë3¢€0€0€ €jpfam11162, DUF2946, Protein of unknown function (DUF2946). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ë4¢€0€0€ €špfam11163, DUF2947, Protein of unknown function (DUF2947). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.¡€0€ª€0€ €CDD¡€ €Ë5¢€0€0€ €”pfam11164, DUF2948, Protein of unknown function (DUF2948). This family of proteins with unknown function appear to be restricted to Proteobacteria.¡€0€ª€0€ €CDD¡€ €Ë6¢€0€0€ €“pfam11165, DUF2949, Protein of unknown function (DUF2949). This family of proteins with unknown function appear to be restricted to Cyanobacteria.¡€0€ª€0€ €CDD¡€ €Ë7¢€0€0€ €Žpfam11166, DUF2951, Protein of unknown function (DUF2951). This family of proteins has no known function. It has a highly conserved sequence.¡€0€ª€0€ €CDD¡€ €eC¢€0€0€ €jpfam11167, DUF2953, Protein of unknown function (DUF2953). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ë8¢€0€0€ €Çpfam11168, DUF2955, Protein of unknown function (DUF2955). Some members in this family of proteins with unknown function annotate the proteins as membrane protein. However, this cannot be confirmed.¡€0€ª€0€ €CDD¡€ €Ë9¢€0€0€ €špfam11169, DUF2956, Protein of unknown function (DUF2956). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.¡€0€ª€0€ €CDD¡€ €Ë:¢€0€0€ €Øpfam11170, DUF2957, Protein of unknown function (DUF2957). Some members annotate the proteins to be putative lipoproteins however this cannot be confirmed. Currently no function is known for this family of proteins.¡€0€ª€0€ €CDD¡€ €Ë;¢€0€0€ €·pfam11171, DUF2958, Protein of unknown function (DUF2958). Some members are annotated as lipoproteins however this cannot be confirmed. This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ë<¢€0€0€ €špfam11172, DUF2959, Protein of unknown function (DUF2959). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.¡€0€ª€0€ €CDD¡€ €Ë=¢€0€0€ €špfam11173, DUF2960, Protein of unknown function (DUF2960). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.¡€0€ª€0€ €CDD¡€ €Ë>¢€0€0€ €ˆpfam11174, DUF2970, Protein of unknown function (DUF2970). This short family is conserved in Proteobacteria. The function is not known.¡€0€ª€0€ €CDD¡€ €Ë?¢€0€0€ €jpfam11175, DUF2961, Protein of unknown function (DUF2961). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ë@¢€0€0€ €pfam11176, Tma16, Translation machinery-associated protein 16. Proteins in this family localize to the nucleus. Their function is not clear.¡€0€ª€0€ €CDD¡€ €ËA¢€0€0€ €•pfam11177, DUF2964, Protein of unknown function (DUF2964). This family of proteins with unknown function appears to be restricted to Proteobacteria.¡€0€ª€0€ €CDD¡€ €ËB¢€0€0€ €‘pfam11178, DUF2963, Protein of unknown function (DUF2963). This family of proteins with unknown function appears to be restricted to Mollicutes.¡€0€ª€0€ €CDD¡€ €ËC¢€0€0€ €‘pfam11179, DUF2967, Protein of unknown function (DUF2967). This family of proteins with unknown function appears to be restricted to Drosophila.¡€0€ª€0€ €CDD¡€ €ËD¢€0€0€ €jpfam11180, DUF2968, Protein of unknown function (DUF2968). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ËE¢€0€0€ €Spfam11181, YflT, Heat induced stress protein YflT. YflT is a heat induced protein.¡€0€ª€0€ €CDD¡€ €ËF¢€0€0€ €‚ pfam11182, AlgF, Alginate O-acetyl transferase AlgF. AlgF is essential for the addition of O-acetyl groups to alginate, an extracellular polysaccharide. The presence of O-acetyl groups plays an important role in the ability of the polymer to act as a virulence factor.¡€0€ª€0€ €CDD¡€ €ËG¢€0€0€ €‚!pfam11183, PmrD, Polymyxin resistance protein PmrD. PmrB forms a two-component system (TCS) with PmrA that allows Gram-negative bacteria to survive the cationic antimicrobial peptide polymyxin G. The TCS is linked to another one via the polymyxin resistance protein PmrD. PmrD is the first protein identified to mediate the connectivity between the two TCSs. It binds to the N terminal domain of the PmrA response regulator which prevents its dephosphorylation, thereby promoting the the transcription of genes involved in polymyxin resistance.¡€0€ª€0€ €CDD¡€ €ËH¢€0€0€ €–pfam11184, DUF2969, Protein of unknown function (DUF2969). This family of proteins with unknown function appears to be restricted to Lactobacillales.¡€0€ª€0€ €CDD¡€ €ËI¢€0€0€ €tpfam11185, DUF2971, Protein of unknown function (DUF2971). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ËJ¢€0€0€ €Èpfam11186, DUF2972, Protein of unknown function (DUF2972). Some members in this family of proteins with unknown function are annotated as sugar transferase proteins, however this cannot be confirmed.¡€0€ª€0€ €CDD¡€ €eV¢€0€0€ €tpfam11187, DUF2974, Protein of unknown function (DUF2974). This bacterial family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ËK¢€0€0€ €‚pfam11188, DUF2975, Protein of unknown function (DUF2975). This family of bacterial proteins have no known function. These proteins are likely to be integral membrane proteins. The proteins contain a highly conserved glutamic acid close to their C-terminus.¡€0€ª€0€ €CDD¡€ €ËL¢€0€0€ €Ïpfam11189, DUF2973, Protein of unknown function (DUF2973). Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently they have no known function.¡€0€ª€0€ €CDD¡€ €ËM¢€0€0€ €¼pfam11190, DUF2976, Protein of unknown function (DUF2976). This family of proteins has no known function. Some members are annotated as membrane proteins however this cannot be confirmed.¡€0€ª€0€ €CDD¡€ €ËN¢€0€0€ €}pfam11191, DUF2782, Protein of unknown function (DUF2782). This is a bacterial family of proteins whose function is unknown.¡€0€ª€0€ €CDD¡€ €ËO¢€0€0€ €jpfam11192, DUF2977, Protein of unknown function (DUF2977). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €e\¢€0€0€ €»pfam11193, DUF2812, Protein of unknown function (DUF2812). This is a bacterial family of uncharacterized proteins, however some members of this family are annotated as membrane proteins.¡€0€ª€0€ €CDD¡€ €ËP¢€0€0€ €pfam11195, DUF2829, Protein of unknown function (DUF2829). This is a uncharacterized family of proteins found in bacteria and bacteriphages.¡€0€ª€0€ €CDD¡€ €ËQ¢€0€0€ €spfam11196, DUF2834, Protein of unknown function (DUF2834). This is a bacterial family of uncharacterized proteins.¡€0€ª€0€ €CDD¡€ €ËR¢€0€0€ €ípfam11197, DUF2835, Protein of unknown function (DUF2835). This is a bacterial family of uncharacterized proteins. One member of this family is annotated as the A subunit of Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV).¡€0€ª€0€ €CDD¡€ €ËS¢€0€0€ €spfam11198, DUF2857, Protein of unknown function (DUF2857). This is a bacterial family of uncharacterized proteins.¡€0€ª€0€ €CDD¡€ €ËT¢€0€0€ €spfam11199, DUF2891, Protein of unknown function (DUF2891). This is a bacterial family of uncharacterized proteins.¡€0€ª€0€ €CDD¡€ €ËU¢€0€0€ €upfam11200, DUF2981, Protein of unknown function (DUF2981). This eukaryotic family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ec¢€0€0€ €špfam11201, DUF2982, Protein of unknown function (DUF2982). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.¡€0€ª€0€ €CDD¡€ €ËV¢€0€0€ €‚Spfam11202, PRTase_1, Phosphoribosyl transferase (PRTase). This PRTase family is fused to a C-terminal RNA binding Pelota domain, pfam01248. These genes are found in the biosynthetic operon associated with the Ter stress response operon and are predicted to be involved in the biosynthesis of a ribo-nucleoside involved in stress response.¡€0€ª€0€ €CDD¡€ €ËW¢€0€0€ €‚Üpfam11203, EccE, Putative type VII ESX secretion system translocon, EccE. EccE is a family of largely Gram-positive bacterial transmembrane componenets of the type VII secretion system characterized in Mycobacterium tuberculosis, systems ESX1-5. Translocation of virulent peptides through the membranes is thought to be mediated via a complex that includes EccB, EccC, EccD, EccE, and MycP. EccB, EccC, EccD, and EccE form a stable complex in the mycobacterial cell envelope.¡€0€ª€0€ €CDD¡€ €ËX¢€0€0€ €upfam11204, DUF2985, Protein of unknown function (DUF2985). This eukaryotic family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €ËY¢€0€0€ €špfam11205, DUF2987, Protein of unknown function (DUF2987). This family of proteins with unknown function appears to be restricted to Gammaproteobacteria.¡€0€ª€0€ €CDD¡€ €ËZ¢€0€0€ €­pfam11207, DUF2989, Protein of unknown function (DUF2989). Some members in this bacterial family of proteins are annotated as lipoproteins however this cannot be confirmed.¡€0€ª€0€ €CDD¡€ €Ë[¢€0€0€ €‚pfam11208, DUF2992, Protein of unknown function (DUF2992). This bacterial family of proteins has no known function. However, the cis-regulatory yjdF motif, just upstream from the gene encoding the proteins for this family, is a small non-coding RNA, Rfam:RF01764. The yjdF motif is found in many Firmicutes, including Bacillus subtilis. In most cases, it resides in potential 5' UTRs of homologs of the yjdF gene whose function is unknown. However, in Streptococcus thermophilus, a yjdF RNA motif is associated with an operon whose protein products synthesize nicotinamide adenine dinucleotide (NAD+). Also, the S. thermophilus yjdF RNA lacks typical yjdF motif consensus features downstream of and including the P4 stem. Thus, if yjdF RNAs are riboswitch aptamers, the S. thermophilus RNAs might sense a distinct compound that structurally resembles the ligand bound by other yjdF RNAs. On the ohter hand, perhaps these RNAs have an alternative solution forming a similar binding site, as is observed with some SAM riboswitches.¡€0€ª€0€ €CDD¡€ €Ë\¢€0€0€ €”pfam11209, DUF2993, Protein of unknown function (DUF2993). This family of proteins with unknown function appears to be restricted to Cyanobacteria.¡€0€ª€0€ €CDD¡€ €Ë]¢€0€0€ €jpfam11210, DUF2996, Protein of unknown function (DUF2996). This family of proteins has no known function.¡€0€ª€0€ €CDD¡€ €Ë^¢€