International Conference on Bioinformatics and its Applications (ICBA'04), Fort Lauderdale, Florida, USA
Rapid progress in genome research has produced a huge amount of nucleotide sequence data, including hundreds of complete genomes (Entrez Genomes). Gene prediction must therefore be automated so that the sheer volume of data does not itself become an obstacle. Many prediction engines are available for this task, but their scope of application and predictive capability vary. This paper investigates the potential to improve gene prediction by integrating three available engines, GrailEXP, Genscan, and MZEF, by means of a modular mixture of experts (MoE) neural network, whose modular architecture directly supports the partitioned nature of the original data sources. The three selected engines represent three different categories of prediction software. We demonstrate that the integrated system has markedly better recovery (the proportion of actual exons that are predicted) than any of the individual prediction engines alone. After integration, we also achieved a higher correlation coefficient for exon prediction and thus more accurate results.
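As a rough illustration of the idea, the following minimal Python sketch combines per-exon scores from three engines with softmax gate weights and then computes recovery as defined above. It is not the authors' implementation: the function names, the fixed gate logits, the score values, the exact-boundary matching rule, and the 0.35 threshold are all assumptions made for this example. In a trained MoE the gate weights would come from a gating network rather than being fixed by hand.

```python
import math


def softmax(values):
    """Convert arbitrary gate logits into weights that sum to 1."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def moe_combine(expert_scores, gate_logits):
    """Gate-weighted sum of the experts' scores for one candidate exon."""
    weights = softmax(gate_logits)
    return sum(w * s for w, s in zip(weights, expert_scores))


def recovery(actual_exons, predicted_exons):
    """Proportion of actual exons that are predicted.

    Assumption: an exon counts as recovered only when both of its
    boundaries match exactly.
    """
    actual = set(actual_exons)
    if not actual:
        return 0.0
    return len(actual & set(predicted_exons)) / len(actual)


# Hypothetical candidate exons as (start, end), each with made-up scores
# standing in for the outputs of GrailEXP, Genscan, and MZEF.
candidates = {
    (100, 200): [0.9, 0.2, 0.1],
    (300, 450): [0.4, 0.8, 0.7],
    (500, 560): [0.1, 0.2, 0.1],
}
gate_logits = [0.0, 0.0, 0.0]  # assumption: equal trust in each engine
threshold = 0.35               # assumption: arbitrary decision threshold

predicted = [
    exon for exon, scores in candidates.items()
    if moe_combine(scores, gate_logits) >= threshold
]
actual = [(100, 200), (300, 450), (700, 800)]  # hypothetical annotation
print(predicted, recovery(actual, predicted))
```

With equal gate logits the combination reduces to a plain average of the three scores; unequal logits shift weight toward whichever engine the gate favors for a given input.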
Proceedings of the International Conference on Bioinformatics and its Applications (ICBA'04).