Abstract | This paper presents a parallel implementation of a hybrid data mining technique for multivariate, heterogeneous, time-varying processes, based on a combination of neuro-fuzzy techniques and genetic algorithms. The purpose is to discover patterns of dependency in general multivariate time-varying systems and to construct a suitable representation of the function expressing those dependencies. The patterns of dependency are represented by multivariate, non-linear, autoregressive models. Given a set of time series, the models relate future values of one target series to past values of all the series, including the target itself. The model space is explored with a genetic algorithm, whereas the functional approximation is constructed with a similarity-based neuro-fuzzy heterogeneous network. This approach allows rapid prototyping of interesting interdependencies, especially in poorly known complex multivariate processes. The method contains a high degree of parallelism at different levels of granularity, which can be exploited when designing distributed implementations, such as workcrew computation in a master-slave paradigm. In the present paper, a first implementation at the highest granularity level is presented. The implementation was tested for performance and portability on different homogeneous and heterogeneous Beowulf clusters, with satisfactory results. An application example with a known time series problem is presented. |
---|
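
To make the model form concrete, the following is a minimal sketch of how a single candidate autoregressive model could be turned into training samples. It assumes, purely for illustration, that a candidate is described by a list of (series index, lag) pairs; this encoding, the function name `build_lagged_samples`, and the toy data are hypothetical and are not taken from the paper. The function approximator itself (the neuro-fuzzy network) is not shown.

```python
from typing import List, Sequence, Tuple


def build_lagged_samples(
    series: Sequence[Sequence[float]],
    lag_spec: Sequence[Tuple[int, int]],
    target: int,
) -> List[Tuple[List[float], float]]:
    """Build (inputs, output) pairs for one candidate dependency model.

    series   -- equally long time series, one per variable
    lag_spec -- hypothetical model encoding: (series index, lag) pairs, lag >= 1
    target   -- index of the series whose value at time t is to be predicted
    """
    if not lag_spec:
        raise ValueError("lag_spec must contain at least one (series, lag) pair")
    if any(lag < 1 for _, lag in lag_spec):
        raise ValueError("lags must be >= 1 so that only past values are used")
    length = len(series[0])
    if any(len(s) != length for s in series):
        raise ValueError("all series must have the same length")

    max_lag = max(lag for _, lag in lag_spec)
    samples = []
    # The first usable time index is max_lag, so every lagged value exists.
    for t in range(max_lag, length):
        inputs = [series[i][t - lag] for i, lag in lag_spec]
        samples.append((inputs, series[target][t]))
    return samples


# Toy usage: predict series 0 at time t from series 0 at t-1 and series 1 at t-2.
toy_series = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]]
toy_samples = build_lagged_samples(toy_series, [(0, 1), (1, 2)], target=0)
```

In a search over the model space, each candidate `lag_spec` would yield a different sample set, and the quality of the approximator trained on it could serve as the candidate's fitness. Because candidates are evaluated independently, this is the kind of coarse-grained work that can be distributed across worker processes.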