This paper presents an analysis of microarray gene expression data from patients with and without scleroderma skin disease, using computational intelligence and visual data mining techniques. Virtual reality spaces are used to provide unsupervised insight into the information content of the original set of genes describing the objects. These spaces are constructed by two hybrid optimization algorithms, which combine Differential Evolution (DE) and Particle Swarm Optimization (PSO), respectively, with deterministic Fletcher-Reeves optimization. A distributed, pipelined data mining algorithm composed of clustering and cross-validated rough sets analysis is applied to find subsets of relevant attributes with high classification capability. Finally, genetic programming (GP) is applied to find explicit analytic expressions for the characteristic functions of the scleroderma and normal classes. The virtual reality spaces associated with the set of function arguments (genes) are also computed. Several small subsets of genes are discovered that classify the data with complete accuracy. They represent genes potentially relevant to the understanding of scleroderma.
The Genetic and Evolutionary Computation Conference (GECCO-2007) (2007).
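The hybrid optimization named in the abstract (a population-based global search followed by deterministic Fletcher-Reeves refinement) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Sammon-style stress used as the mapping error, the 2-D target space, the DE/rand/1/bin variant and its parameters, the finite-difference gradient, the backtracking line search, and the toy expression values. Only the DE variant of the hybrid is shown.

```python
import math
import random

def sammon_stress(flat, high_d):
    """Sammon-style error between original distances and 2-D map distances."""
    n = len(high_d)
    pts = [(flat[2 * i], flat[2 * i + 1]) for i in range(n)]
    total, err = 0.0, 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d_hi = high_d[i][j]              # assumed > 0 (distinct samples)
            d_lo = math.dist(pts[i], pts[j])
            total += d_hi
            err += (d_hi - d_lo) ** 2 / d_hi
    return err / total

def differential_evolution(f, dim, rng, pop_size=30, gens=100, F=0.6, CR=0.9):
    """Plain DE/rand/1/bin; returns the best vector found and its cost."""
    pop = [[rng.uniform(-2.0, 2.0) for _ in range(dim)] for _ in range(pop_size)]
    cost = [f(x) for x in pop]
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            forced = rng.randrange(dim)      # at least one mutated coordinate
            trial = [
                pop[a][k] + F * (pop[b][k] - pop[c][k])
                if (rng.random() < CR or k == forced) else pop[i][k]
                for k in range(dim)
            ]
            t_cost = f(trial)
            if t_cost <= cost[i]:            # greedy one-to-one selection
                pop[i], cost[i] = trial, t_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]

def num_grad(f, x, h=1e-6):
    """Central finite-difference gradient (stand-in for an analytic one)."""
    g = []
    for k in range(len(x)):
        xp, xm = x[:], x[:]
        xp[k] += h
        xm[k] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

def fletcher_reeves(f, x, iters=100, tol=1e-10):
    """Nonlinear conjugate gradient with the Fletcher-Reeves beta."""
    x = x[:]
    fx = f(x)
    g = num_grad(f, x)
    d = [-gk for gk in g]
    for _ in range(iters):
        gg = sum(gk * gk for gk in g)
        if gg < tol:
            break
        slope = sum(gk * dk for gk, dk in zip(g, d))
        if slope >= 0:                       # not a descent direction: restart
            d = [-gk for gk in g]
            slope = -gg
        step = 1.0
        while step > 1e-12:                  # backtracking (Armijo) line search
            cand = [xk + step * dk for xk, dk in zip(x, d)]
            f_cand = f(cand)
            if f_cand <= fx + 1e-4 * step * slope:
                break
            step *= 0.5
        else:
            break                            # no acceptable step found
        x, fx = cand, f_cand
        g_new = num_grad(f, x)
        beta = sum(gk * gk for gk in g_new) / gg   # Fletcher-Reeves formula
        d = [-gn + beta * dk for gn, dk in zip(g_new, d)]
        g = g_new
    return x, fx

# Toy, made-up expression profiles (5 samples x 4 genes); not data from the paper.
samples = [
    [0.1, 0.2, 0.9, 0.4],
    [0.2, 0.1, 0.8, 0.5],
    [0.9, 0.8, 0.1, 0.3],
    [0.8, 0.9, 0.2, 0.2],
    [0.5, 0.5, 0.5, 0.9],
]
high_d = [[math.dist(a, b) for b in samples] for a in samples]

def objective(flat):
    return sammon_stress(flat, high_d)

rng = random.Random(0)
de_x, de_cost = differential_evolution(objective, 2 * len(samples), rng)
refined_x, refined_cost = fletcher_reeves(objective, de_x)
print(f"stress after DE: {de_cost:.6f}, after Fletcher-Reeves: {refined_cost:.6f}")
```

In this sketch the stochastic stage supplies a starting point and the deterministic stage only accepts steps that lower the error, so the refined stress can never exceed the DE result. A PSO-based hybrid would replace `differential_evolution` with a particle swarm search and keep the refinement stage unchanged.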