Strengthening Intrusion Detection Systems utilizing Distributed Deep Learning: A Model Parallelism Approach for Efficient Training on High-Dimensional Data
Keywords:
Intrusion Detection System, Distributed Deep Learning, Model Parallelism, Data Parallelism.

Abstract
Intrusion detection systems (IDSs) are among the most widespread and renowned detection models for safeguarding network traffic. An IDS can discover malware and abnormal behavior during or after data processing, in either online or offline mode. In the literature, IDSs have been approached with various deep learning (DL) techniques. The main problem is the high feature dimensionality of the large volumes of data in intrusion detection datasets; training large DL models with huge numbers of hyperparameters on such data is a major challenge, because such models are too big to be accommodated on one device. An efficient approach with low communication overhead, faster training, higher testing accuracy, and fewer false alerts is therefore strongly required. Deploying a distributed deep learning (DDL) approach can address these challenges. DDL supports working on bigger datasets, especially those with high-dimensional data and large numbers of hidden layers, with better performance in less time, whether by using data parallelism to split datasets or model parallelism to split large models. In addition, deploying the DeepSpeed optimization library released by Microsoft Research (latest version v.5.10) makes distributed training and inference easier, more efficient, and more effective. We overcome the limitations of training a large deep learning model by proposing a novel deep convolutional neural network model that uses a parallelization (splitting) approach: reducing the communication volume and balancing the computational load of the resulting partition speeds up training by utilizing several graphics processing units (GPUs) instead of a single processor for training the whole deep convolutional neural network model.
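The layer-splitting idea behind model parallelism can be illustrated with a minimal sketch. All layer sizes here are hypothetical, and both stages run on the CPU to keep the example self-contained; in an actual DeepSpeed/PyTorch deployment each stage would be placed on its own GPU (e.g. with `.to('cuda:0')` and `.to('cuda:1')`):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_layer(in_dim, out_dim):
    """One linear layer + ReLU, standing in for a block of hidden layers."""
    return {"W": rng.standard_normal((in_dim, out_dim)) * 0.01,
            "b": np.zeros(out_dim)}

def forward_layer(layer, x):
    return np.maximum(x @ layer["W"] + layer["b"], 0.0)

# Partition the model between two devices: the first two layers are
# resident on "GPU 0", the last two on "GPU 1" (hypothetical sizes).
stage0 = [make_layer(196, 128), make_layer(128, 64)]   # device 0
stage1 = [make_layer(64, 32), make_layer(32, 10)]      # device 1

def model_parallel_forward(x):
    # Device 0 computes its layers; only the (small) activation tensor
    # then crosses the inter-GPU link -- this transfer is the
    # communication volume that a good partition tries to minimise.
    for layer in stage0:
        x = forward_layer(layer, x)
    # <-- activation transfer from device 0 to device 1 happens here
    for layer in stage1:
        x = forward_layer(layer, x)
    return x

batch = rng.standard_normal((8, 196))   # 8 samples, 196 features each
out = model_parallel_forward(batch)
print(out.shape)   # (8, 10)
```

Because each stage holds only its own weights, a model too large for one device's memory can still be trained once its layers are distributed this way.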
The proposed model improves accuracy, detects unknown attacks, and reduces false alert rates by improving classification performance through an optimal architecture for the deep convolutional neural network model. In this paper, an enhanced IDS is proposed by applying model parallelism, which splits the model's hidden layers and distributes them across more than one GPU. This approach has been applied to the high-dimensional UNSW-NB15 dataset. Although convolutional neural networks (CNNs) have been applied successfully in numerous image analysis use cases where there is a spatial relationship between features, high-dimensional data with no relationship between its features is inappropriate for CNN modeling. To address this challenge, the high-dimensional data was converted into images, because such a conversion can enhance the representation of the correlation between features, which CNNs can then learn from to assist in prediction. Our proposed model achieved more than a 5X speedup in training time compared to conventional deep learning models and a high detection rate of 99%. In addition, the proposed model appreciably reduces the false alert rate by approximately 80%, outperforming conventional deep learning models. Comparison with cutting-edge methods demonstrates that the proposed strategy is more efficient and more appropriate for large models and complex data.
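The tabular-to-image conversion can be sketched as follows. The image side length, the zero-padding, and the min-max scaling are illustrative assumptions rather than the paper's exact preprocessing; an 8x8 grid is used here simply because its 64 cells comfortably hold the few dozen numeric features of a preprocessed UNSW-NB15 flow record:

```python
import numpy as np

def tabular_to_image(x, side=8):
    """Turn a 1-D feature vector into a side x side single-channel
    'image' so a CNN can be applied to tabular data.  Features are
    laid out row by row; unused cells are zero-padded."""
    x = np.asarray(x, dtype=np.float32).ravel()
    if x.size > side * side:
        raise ValueError("feature vector does not fit in the image grid")
    img = np.zeros(side * side, dtype=np.float32)
    img[: x.size] = x                       # remaining cells stay zero
    # Min-max scale to [0, 1] so cell values behave like pixel intensities.
    lo, hi = img.min(), img.max()
    if hi > lo:
        img = (img - lo) / (hi - lo)
    return img.reshape(side, side)

# Stand-in for one network-flow record with 42 numeric features.
record = np.arange(42, dtype=np.float32)
image = tabular_to_image(record)
print(image.shape)   # (8, 8)
```

The resulting 2-D array can be stacked into a batch of shape `(N, 1, 8, 8)` and fed to a standard CNN input layer.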