The main goal of large-scale genome sequencing projects is to obtain new insights into the physiological and biological processes underlying the very organization of life. An essential step in this quest is gene identification, followed by functional annotation of the corresponding gene products. Gene recognition in bacteria is far from straightforward, even though bacterial genes usually lack introns. Extracting all possible Open Reading Frames (ORFs) above a given length from a given DNA sequence is a trivial procedure; it is much harder to decide which of them contain genes that are actually expressed and code for proteins (CoDing Sequences, CDSs). The widespread (and unfortunate) confusion between ORFs and CDSs is a sign of how poorly many annotation systems handle gene identification. Gene-finding methods are traditionally divided into two broad categories. "Intrinsic" methods, which deal with the DNA sequence only, use statistics or pattern recognition algorithms to find genes through the detection of specific motifs or global statistical patterns. A typical example is the GeneMark software, a deservedly popular gene prediction program for prokaryotes, which uses periodic Markov models to find DNA regions that code for proteins. "Extrinsic" methods take into account information derived from similarity searches, using as queries either the genome sequence itself or the putative proteins derived from the list of ORFs. In the first case, the query DNA must be translated in all six reading frames so that the resulting amino acid sequences can be compared to known proteins (BLASTX program). Although this method has been shown to be relatively effective for gene finding, it is too time-consuming to be used as a routine procedure.
In addition, the predictions of such extrinsic methods rely entirely on the presence of closely related protein sequences in databanks, a severe limitation for gene discovery. Finally, it has recently been shown that a great many spurious short genes are annotated in genomes, and that the number of potential errors in functional annotation is higher than is usually believed, mainly because such annotation is based on relatively weak sequence identities and/or partial alignments.
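The ORF extraction step described above as trivial can be illustrated with a minimal Python sketch. It is not taken from any of the programs cited here, and it rests on several assumptions: an ORF is taken to run from an ATG codon to the first in-frame stop codon, the standard stop codons TAA, TAG and TGA are used, and the function names and the `min_codons` threshold are hypothetical. The sketch only enumerates ORFs on the six reading frames; it makes no attempt to decide which of them are CDSs, which is precisely the hard part.

```python
# Minimal six-frame ORF enumeration (illustrative sketch, not a gene finder).
# Assumptions: ORF = ATG ... first in-frame stop; alternative bacterial start
# codons such as GTG and TTG are deliberately ignored to keep the example short.

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """List ORFs as (strand, frame, start, end) tuples.

    Coordinates are 0-based, end-exclusive, include the stop codon, and are
    expressed on the strand being scanned ("+" is the input sequence,
    "-" is its reverse complement).
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            # Walk the frame codon by codon.
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in START_CODONS:
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    # Length is counted in codons, stop codon included.
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATGACC"):
        print(orf)
```

Every tuple returned by such a procedure is only a candidate. Intrinsic or extrinsic evidence of the kind discussed above is still needed before any of them can be called a CDS.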
Practical experience in genome analysis shows that reliable results require incorporating as much biologically derived evidence as possible. Computer-assisted annotation of DNA sequences therefore calls for integrating several sequence analysis methods into a coherent and efficient prediction system. Several platforms addressing some of these goals have recently been developed. Our own effort in this direction resulted in the creation of an integrated computer environment, Imagene, dedicated to genome sequence annotation and analysis. In this system, both the biological data and the sequence analysis tools are uniformly represented in an object-based model, with a user interface that displays the results produced by a variety of methods simultaneously. This makes it easy to annotate interesting features of the sequence. In contrast to the approach followed by "genome crunchers" such as GeneQuiz or Magpie, no automatic postprocessing of results has been defined in Imagene. The final synthesis and decisions are the responsibility of the annotator, who is guided by the graphical cues presented by the system. With the growing number of microbial genome sequences, it has become important to carry out an efficient first pass of automatic gene annotation before proceeding to in-depth manual annotation.

