An integration clustering and multi-target classification approach to explore employability and career linearity
International Journal of Informatics and Communication Technology
Abstract
This study analyzes job waiting time and job linearity among female science, technology, engineering, and mathematics (STEM) graduates using clustering and multi-target classification (MTC) models. The K-means least trimmed square (LTS) algorithm, chosen for its robustness against outliers, was used for clustering. With k = 2 and a trimming percentage of 30%, the model achieved a silhouette score of 77% and produced two distinct clusters: ideal and non-ideal. To enlarge the dataset for classification, synthetic data were generated using the adaptive synthetic (ADASYN)-Gaussian method. Principal component analysis (PCA) projections and overlapping histograms were used to show that the distribution of the synthetic data closely resembled that of the original data. For classification, a random forest (RF) model was used to predict both job waiting time and job linearity. Hyperparameter tuning produced an optimal model with a classification accuracy of 92%. Cross-validation (CV) confirmed the model's robustness, with F1-micro and F1-macro scores of 94% and 93%, respectively. The results show that, although women are underrepresented in STEM, 73% of the female alumni analyzed belonged to the group with a short job waiting time. Furthermore, a strong negative correlation between GPA and job waiting time suggests that graduates with a higher GPA tend to secure employment more quickly.
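The clustering step can be illustrated with a minimal Python sketch of a generic trimmed k-means, in which the centroids are refitted only on the points closest to them and the remaining share is set aside as outliers. This is not the authors' K-means LTS implementation: the function `trimmed_kmeans`, its `init_centers` parameter, and the toy data are hypothetical and serve only to show how k = 2 and a 30% trimming percentage interact with the silhouette score.

```python
import numpy as np
from sklearn.metrics import silhouette_score


def trimmed_kmeans(X, init_centers, trim=0.30, n_iter=100):
    """Illustrative trimmed k-means (hypothetical helper, not the paper's code)."""
    centers = np.asarray(init_centers, dtype=float).copy()
    # Number of points kept after discarding the `trim` share farthest away.
    n_keep = len(X) - int(round(trim * len(X)))
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        nearest = d2.min(axis=1)
        # Keep the n_keep points closest to their centroid; trim the rest.
        keep = np.zeros(len(X), dtype=bool)
        keep[np.argsort(nearest)[:n_keep]] = True
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X[keep & (labels == j)]
            if len(members) > 0:
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break  # centroids stopped moving
        centers = new_centers
    return labels, keep, centers


# Toy data (an assumption for illustration): two compact groups plus scattered outliers.
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2))
blob_b = rng.normal(loc=[6.0, 6.0], scale=0.5, size=(100, 2))
outliers = rng.uniform(low=-20.0, high=20.0, size=(20, 2))
X = np.vstack([blob_a, blob_b, outliers])

# Start from one point of each group so the sketch is deterministic.
labels, keep, centers = trimmed_kmeans(X, init_centers=X[[0, 100]], trim=0.30)

# Silhouette score computed on the retained (untrimmed) points only.
score = silhouette_score(X[keep], labels[keep])
print(f"kept {keep.sum()} of {len(X)} points, silhouette = {score:.2f}")
```

In this sketch the trimmed points still receive a label from their nearest centroid; whether they are reported as cluster members or excluded as outliers is a design choice that the abstract does not specify.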





