An Explainable PSO-XGBoost Algorithm Framework for Concentration Prediction with High-Dimensional Data
DOI: 10.25236/iwmecs.2025.005
Corresponding Author
Ya’nan Gao
Abstract
Noninvasive prenatal testing (NIPT) is a key tool for prenatal screening. Because NIPT data are high-dimensional, heterogeneous, and nonlinear, their analysis calls for a framework that balances accuracy with interpretability. We propose an interpretable PSO-XGBoost framework for predicting fetal Y-chromosome concentration that combines Spearman rank correlation for feature screening, particle swarm optimization (PSO) of the XGBoost hyperparameters, and SHAP analysis of the fitted model. Experimental results show a significant positive correlation between fetal Y-chromosome concentration and gestational age. The PSO-XGBoost model achieved an R² of 0.958, indicating high predictive accuracy and stable performance. SHAP analysis further shows that model predictions are driven mainly by core features such as X-chromosome concentration, Y-chromosome Z-score, and gestational age, with significant nonlinear interactions and individual variation. Future integration of multimodal data could further improve the precision of prenatal diagnosis and clinical decision-making.
Keywords
NIPT; Spearman's rank correlation; PSO-XGBoost; SHAP; Prediction of fetal chromosomal concentration
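The first two stages named in the abstract, Spearman-based feature screening and PSO-based hyperparameter search, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the feature names, the screening threshold of 0.3, the PSO coefficients, and the quadratic `toy_cv_error` (a stand-in for the cross-validated error of an XGBoost model) are all assumptions made for illustration, and the XGBoost fitting and SHAP steps are omitted.

```python
import random

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def screen_features(columns, target, threshold=0.3):
    """Keep features whose |rho| with the target meets the (assumed) threshold."""
    return [name for name, col in columns.items()
            if abs(spearman(col, target)) >= threshold]

def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO over a box; returns (best_position, best_value)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * len(bounds) for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))  # clamp to box
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def toy_cv_error(params):
    """Hypothetical stand-in for cross-validated error of a boosted model."""
    learning_rate, max_depth = params
    return (learning_rate - 0.1) ** 2 + ((max_depth - 6) / 10) ** 2

# Made-up data: one monotone feature and one feature unrelated to the target.
columns = {"gestational_age": [10, 12, 14, 16, 18], "noise": [3, 1, 3, 1, 3]}
target = [0.02, 0.03, 0.05, 0.06, 0.09]
kept = screen_features(columns, target)

# Search (learning_rate, max_depth) within assumed bounds.
bounds = [(0.01, 0.3), (2.0, 10.0)]
best, best_val = pso_minimize(toy_cv_error, bounds)
print(kept, best, best_val)
```

In a real pipeline the objective would train and cross-validate a model on the screened features, and integer hyperparameters such as tree depth would be rounded before use.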