Gene sequence and properties of CelI, a family E endoglucanase from Clostridium thermocellum. Hazlewood, Geoffrey P. and Davidson, Keith and Laurie, Judith I. and Huskisson, Neville S. and Gilbert, Harry J., 139, 307-316 (1993), doi = https://doi.org/10.1099/00221287-139-2-307, publicationName = Microbiology Society, issn = 1350-0872, abstract = The Clostridium thermocellum celI gene, coding for endoglucanase I (CelI), consists of an open reading frame (ORF) of 2640 nucleotides and codes for a protein of Mr 98531. The ORF was confirmed as celI by comparing the N-terminal sequence of purified recombinant CelI with that deduced from the nucleotide sequence. CelI hydrolysed lichenan and carboxymethylcellulose, but was principally active against barley β-glucan. It exhibited significant sequence identity with subfamily E2 endoglucanases, and by analogy with others in this group contains a catalytic domain of around 500 residues located in the N-terminal half of the protein. The C-terminal region of CelI was highly homologous with the cellulose-binding domain of the non-catalytic cellulosome subunit, S1. A repeated segment, previously shown to be highly conserved in xylanase Z and in other endoglucanases from C. thermocellum, was absent from CelI. Antiserum raised against purified recombinant CelI cross-reacted with proteins contained in the cellulosomes of two strains of C. thermocellum, suggesting that CelI is either a component of the cellulosome or is homologous to other cellulosome proteins. A second gene, located upstream of celI, consisted of an ORF of 1671 nucleotides, coding for a protein of Mr 61042. Based on its homology with the Escherichia coli tar gene product, the polypeptide encoded by the second gene is tentatively identified as a sensory transducer.