Abstract

SUMMARY

In the framework of the international collaborative project aiming to sequence the whole Bacillus subtilis chromosome, we have created a relational database for managing and analysing information associated with the molecular genetics of this bacterium. It allows recovery of non-redundant DNA sequences of the genome, as well as related information, i.e. genes, proteins, etc. A logical structure has been designed with appropriate links between the different objects, and a set of procedures has been implemented for data updating and management. The database is organized around a core constituted by all known contigs of B. subtilis, i.e. sets of non-redundant sequences created from original entries in the EMBL data library. A user-friendly interface has been developed to make the database easy to consult. Sequence analysis tools have been integrated into the database, such as a program for rapid similarity searching of protein data banks, and a powerful DNA pattern searching program. Thanks to the consistency of the database, we have performed a codon usage analysis by Factorial Correspondence Analysis, and a study of the distribution of the isoelectric points of known proteins of B. subtilis. The database is available through anonymous ftp (address ‘ftp.pasteur.fr’ or IP number 157.99.64.12, directory ‘/pub/GenomeDB/SubtiList’).
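As a rough illustration of the raw material a codon usage analysis starts from, the minimal Python sketch below tabulates codon counts and relative frequencies for a single coding sequence. A correspondence analysis would then operate on a genes × codons table built from such counts. The function names and the toy sequence are hypothetical and are not part of the database or its tools; the sketch assumes the sequence is read in frame from its first base.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count codons in a coding sequence, read in frame from position 0.

    Trailing bases that do not form a complete codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def codon_frequencies(cds: str) -> dict:
    """Relative frequency of each codon observed in the sequence."""
    counts = codon_counts(cds)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


# Hypothetical toy sequence, not taken from any database entry.
toy_cds = "ATGAAAAAAGGCTAA"
toy_counts = codon_counts(toy_cds)
toy_freqs = codon_frequencies(toy_cds)
```

Stacking one such count vector per gene gives the contingency table that a correspondence analysis takes as input; the analysis itself is not shown here.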

/content/journal/micro/10.1099/13500872-141-2-261
1995-02-01