In this study, five machine learning models were trained and evaluated to predict student depression from academic, personal, and psychological factors: Logistic Regression, Random Forest, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Gradient Boosting. Gradient Boosting achieved the highest accuracy at 83.73%, followed closely by Logistic Regression (83.64%) and SVM (83.43%).
The results indicate that the dataset contains enough predictive signal for effective classification. Gradient Boosting, an ensemble method, performed best, although the top three models fall within 0.30 percentage points of one another. KNN, by contrast, reached only 74.65% accuracy, possibly because distance-based methods are sensitive to high-dimensional, mixed-type data.
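The five-model comparison summarized above can be sketched roughly as follows. This is an illustrative outline only, not the project's actual code: it assumes scikit-learn, uses synthetic data from `make_classification` in place of the student depression dataset, and the split ratio, scaling steps, and model settings are all assumptions. The accuracies it prints will therefore not match the figures reported here.

```python
# Illustrative sketch: synthetic data stands in for the real dataset,
# and every hyperparameter below is an assumed default, not the study's setting.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical stand-in for the feature matrix and depression labels.
X, y = make_classification(
    n_samples=600, n_features=15, n_informative=6, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scale features for the models that depend on distances or gradients;
# tree-based models are left unscaled.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}

# Fit each model on the training split and score it on the held-out split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

# Report models from highest to lowest test accuracy.
for name, acc in sorted(accuracies.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.2%}")
```

Wrapping the scaler and estimator in one pipeline keeps the scaling statistics fitted on the training split only, which avoids leaking test data into the comparison.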
Overall, this project demonstrates that machine learning models can help identify students at risk of depression. Future work could include feature engineering, hyperparameter tuning, and the addition of more psychological or behavioral factors to further improve model performance.
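As one possible starting point for the hyperparameter tuning mentioned above, the sketch below runs a small cross-validated grid search over a Gradient Boosting classifier. It assumes scikit-learn and again uses synthetic data. The parameter grid, fold count, and scoring metric are illustrative choices, not settings taken from this study.

```python
# Illustrative sketch: the grid values and 3-fold setting are assumptions
# chosen to keep the example fast, not recommendations from the study.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Hypothetical stand-in for the student depression dataset.
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Small candidate grid: 2 x 2 x 2 = 8 parameter combinations.
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}

# Cross-validate every combination on the training split only,
# then refit the best one on the full training split.
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print(f"Cross-validated accuracy: {search.best_score_:.2%}")
print(f"Held-out accuracy: {search.score(X_test, y_test):.2%}")
```

Keeping the test split out of the search means the final held-out score remains an unbiased estimate of the tuned model's accuracy.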