The bootstrap procedure for selecting the number of principal components in PCA

International Journal of Informatics and Communication Technology

Abstract

The first step in choosing the number of principal components for classification or regression is to evaluate how much each component contributes to the total variance in the data. A subset of components that together explains a high percentage of the variance is then typically selected. However, several valid combinations may exist, and the final choice is often made manually by the researcher. This study introduces a novel yet straightforward algorithm for selecting the number of principal components automatically. By integrating ANOVA and bootstrapping with principal component analysis (PCA), the proposed method enables automatic component selection in classification tasks. The algorithm is evaluated on three publicly available datasets with both decision tree and support vector machine (SVM) classifiers. The results indicate that the automated procedure not only eliminates researcher bias in selecting components but also improves classification accuracy. Unlike traditional methods, it selects a single optimal combination of principal components without manual intervention, offering a new and efficient approach to PCA-based model development.
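
The abstract does not spell out how ANOVA and bootstrapping are combined with PCA, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes that components are kept when a one-way ANOVA F statistic of their scores across classes stays above a cutoff in most stratified bootstrap resamples. The function names (`pca_scores`, `anova_f`, `bootstrap_select`), the cutoff `f_threshold`, the frequency `min_freq`, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def pca_scores(X):
    """Return PC scores and explained-variance ratios via SVD of centered data."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    keep = s > 1e-10 * s[0]                      # drop numerically null directions
    ratios = s[keep] ** 2 / np.sum(s[keep] ** 2)
    return (U * s)[:, keep], ratios

def anova_f(scores, y):
    """One-way ANOVA F statistic of each score column across the classes in y."""
    classes = np.unique(y)
    n, k = len(y), len(classes)
    grand = scores.mean(axis=0)
    ss_between = np.zeros(scores.shape[1])
    ss_within = np.zeros(scores.shape[1])
    for c in classes:
        grp = scores[y == c]
        ss_between += len(grp) * (grp.mean(axis=0) - grand) ** 2
        ss_within += ((grp - grp.mean(axis=0)) ** 2).sum(axis=0)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def bootstrap_select(X, y, n_boot=200, f_threshold=10.0, min_freq=0.9, seed=0):
    """Keep components whose F exceeds f_threshold in >= min_freq of resamples.

    Assumption: PCA is fitted once on the full data and only the rows of the
    score matrix are resampled, which avoids re-aligning components per resample.
    """
    rng = np.random.default_rng(seed)
    scores, ratios = pca_scores(X)
    groups = [np.flatnonzero(y == c) for c in np.unique(y)]
    hits = np.zeros(scores.shape[1])
    for _ in range(n_boot):
        # Stratified bootstrap: resample with replacement inside each class.
        idx = np.concatenate(
            [rng.choice(g, size=len(g), replace=True) for g in groups]
        )
        hits += anova_f(scores[idx], y[idx]) > f_threshold
    selected = np.flatnonzero(hits / n_boot >= min_freq)
    return selected.tolist(), ratios

# Synthetic two-class data: only the first feature separates the classes.
rng = np.random.default_rng(42)
y = np.repeat([0, 1], 60)
X = rng.normal(scale=[1.0, 0.5, 0.2], size=(120, 3))
X[:, 0] += np.where(y == 1, 3.0, -3.0)

selected, ratios = bootstrap_select(X, y, f_threshold=10.0)
print("selected component indices:", selected)
print("explained-variance ratios:", np.round(ratios, 3))
```

In this sketch the explained-variance ratios correspond to the variance analysis the abstract describes as the usual first step, while the bootstrap frequency replaces the manual choice with a fixed rule. The selected scores could then be passed to a decision tree or SVM classifier.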
