data pre-processing; semiconductor manufacturing; intelligent data analysis
This paper first provides an overview of data pre-processing, focusing on the problems of real-world data. These problems must be carefully understood and solved before any data analysis begins. The paper discusses in detail the two main reasons for performing data pre-processing: (i) problems with the data and (ii) preparation for data analysis. It then describes the data pre-processing techniques used to achieve each of these objectives; a total of 14 techniques are discussed. Two examples of data pre-processing applications follow, drawn from two of the most data-rich domains: semiconductor manufacturing and aerospace, where large amounts of fairly reliable data are available. The paper closes with a discussion of future directions and some challenges.
International Journal on Intelligent Data Analysis 1, no. 1 (1997).