A Comparative Analysis of Non-Linear Machine Learning Models for Predicting Permeability Across the Blood-Brain Barrier

Authors

  • Akash Asthana, Adarsh Tripathi

Keywords:

Blood-Brain Barrier (BBB), XGBoost, Drug Discovery, Support Vector Regression, Neural Networks, Random Forest, QSAR, Machine Learning

Abstract

Accurate prediction of blood–brain barrier (BBB) permeability remains a critical challenge in the early stages of central nervous system (CNS) drug discovery. This study presents a comparative evaluation of non-linear machine learning models, namely Support Vector Regression (SVR), XGBoost, Random Forest (RF), and Artificial Neural Networks (ANN), for predicting BBB permeability using quantitative structure–activity relationship (QSAR) approaches. Molecular descriptors derived from cheminformatics platforms such as PaDEL, Mordred, and MOE were used as input features, following standard preprocessing techniques including imputation, normalization, and dimensionality reduction via principal component analysis (PCA).
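As a rough illustration of the first two preprocessing steps named above, the following minimal sketch imputes missing descriptor values and then normalizes them. The abstract does not state which imputation or normalization scheme was used, so median imputation and z-score standardization are assumptions here, as are the function names and the example logP values. PCA is omitted for brevity.

```python
import statistics


def impute_median(column):
    """Replace missing values (None) with the median of the observed values.

    Median imputation is an assumption; the abstract only says "imputation".
    """
    observed = [v for v in column if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in column]


def standardize(column):
    """Scale a column to zero mean and unit (population) standard deviation.

    Z-score scaling is an assumption; the abstract only says "normalization".
    """
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)
    if sd == 0:
        # A constant descriptor carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - mean) / sd for v in column]


# Hypothetical logP values for four compounds, one of them missing.
logp = [1.2, None, 3.4, 2.0]
imputed = impute_median(logp)
cleaned = standardize(imputed)
```

Each descriptor column would be processed this way before any dimensionality reduction, so that descriptors on very different scales contribute comparably.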

The dataset of Burns et al. (2004), comprising 80 compounds with experimentally determined BBB permeability, served as the foundation for model development. Performance was assessed using regression metrics: the coefficient of determination (R²), mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE). Among all models, XGBoost showed the highest predictive performance, with an R² of 0.9430 and an RMSE of 0.0057, outperforming SVR (R² = 0.8660), RF (R² = 0.9087), and ANN (R² = 0.7459). Feature importance analysis indicated that lipophilicity (logP), molecular weight, hydrogen bond donors (HBD), polar surface area (PSA), and the number of rotatable bonds were the most influential predictors of BBB permeability.
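For readers unfamiliar with the four regression metrics, the sketch below computes them from paired observed and predicted values using their standard definitions. The function name and the example numbers are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R², MSE, RMSE, and MAE for paired observed/predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)           # residual sum of squares
    mse = ss_res / n
    mae = sum(abs(r) for r in residuals) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R² is undefined when the observed values are all identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"r2": r2, "mse": mse, "rmse": math.sqrt(mse), "mae": mae}


# Hypothetical observed and predicted permeability values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
metrics = regression_metrics(observed, predicted)
```

RMSE and MAE are in the units of the response variable, whereas R² is unitless, which is why the abstract reports R² for every model when comparing them.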

The study shows that XGBoost, because of its ability to capture complex non-linear interactions and its built-in regularization, is the most effective model among those tested. While the SVR and RF models offered competitive results, the ANN models underperformed due to the limited dataset size. This comparative analysis supports the integration of tree-based ensemble methods in predictive modeling for BBB permeability and highlights the potential of data-driven approaches to inform CNS-targeted drug design. Further research should explore larger datasets and integrated architectures that combine deep learning and ensemble learning approaches for improved generalization.

Published

2025-06-08

How to Cite

Akash Asthana, Adarsh Tripathi. (2025). A Comparative Analysis of Non-Linear Machine Learning Models for Predicting Permeability Across the Blood-Brain Barrier. Journal of Computational Analysis and Applications (JoCAAA), 34(6), 105–121. Retrieved from https://eudoxuspress.com/index.php/pub/article/view/3045

Section

Articles