Visualizing Realistic Benchmarked IDS Dataset: CIRA-CIC-DoHBrw-2020
Date
2022
Publisher
Ieee-Inst Electrical Electronics Engineers Inc
Access Rights
info:eu-repo/semantics/openAccess
Abstract
An Intrusion Detection System (IDS) dataset is crucial for detecting the lateral movement of cyber-attacks, because it is used to train the IDS classifier model to achieve the earliest possible detection. A good, near-realistic public dataset is therefore essential to the development of advanced IDS classifier models. However, the available public IDS datasets have long been under scrutiny on several counts: how well they reflect real low-footprint cyber threats, render real-time network scenarios, and reflect recent malware attacks over the newly developed DNS over HTTPS (DoH) protocol; their disregard of layer 3 information; and the contradictory classification and analysis results published across studies, which make the findings non-reproducible and hard to share. This problem can be addressed by visualizing a new realistic, real-time, low-footprint, and up-to-date benchmarked dataset. Visualization helps to detect data deformation before an optimized, highly accurate classifier model is designed. This study therefore reviews a new realistic benchmarked IDS dataset and applies sophisticated techniques to visualize it. The review starts by carefully examining production network features, which are then compared with those of various well-established public IDS datasets. Many of these datasets consist of static, unrealistic meta-features and disregard source and destination Internet Protocol (IP) information; the CIRA-CIC-DoHBrw-2020 dataset is the exception. The study then applies the Eigen Centrality (EC) technique from graph theory to visualize this layer 3 (L3) information. Finally, it further analyzes and visualizes the data using techniques such as Principal Component Analysis (PCA) and the Gaussian Mixture Model (GMM). Results show that CIRA-CIC-DoHBrw-2020 simulates recent malware attacks and is highly imbalanced, which reflects realistic low-footprint cyber-attacks. The centrality graph clearly visualizes the IPs compromised by a recent DoH attack in real time, and the study concludes decisively that smaller packet lengths of 1000 to 2000 bytes fit an attack trait.
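As an illustration of the centrality step mentioned in the abstract, the following is a minimal sketch of eigenvector centrality computed by power iteration over a graph of source and destination IPs. It is not the study's implementation: the abstract does not say whether the graph is directed or weighted, so an undirected, unweighted graph is assumed here, and the function name and the sample flows (documentation-range addresses) are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def eigenvector_centrality(edges, iterations=100, tol=1e-9):
    """Power iteration on an undirected, unweighted IP graph (assumed form)."""
    # Build an adjacency list; each flow links a source IP and a destination IP.
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    nodes = sorted(neighbors)
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Iterating with (A + I) instead of A avoids oscillation on bipartite
        # graphs and leaves the dominant eigenvector unchanged.
        new = {n: score[n] + sum(score[m] for m in neighbors[n]) for n in nodes}
        norm = sqrt(sum(v * v for v in new.values()))
        new = {n: v / norm for n, v in new.items()}
        converged = max(abs(new[n] - score[n]) for n in nodes) < tol
        score = new
        if converged:
            break
    return score


# Hypothetical flows, not taken from CIRA-CIC-DoHBrw-2020.
flows = [
    ("192.0.2.10", "198.51.100.1"),
    ("192.0.2.11", "198.51.100.1"),
    ("192.0.2.12", "198.51.100.1"),
    ("192.0.2.12", "203.0.113.5"),
]
centrality = eigenvector_centrality(flows)
most_central = max(centrality, key=centrality.get)
```

In this toy graph the address contacted by the most hosts receives the highest score. In a visualization, such scores would typically drive node size or color so that heavily connected IPs stand out.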
Keywords
Data visualization, Real-time systems, Protocols, Principal component analysis, Benchmark testing, IP networks, Intrusion detection, Machine learning, Computer security, Cyberattack, Intrusion detection system (IDS), IDS dataset review, imbalanced dataset, data visualization, machine learning in cybersecurity
Source
IEEE Access
WoS Q Value
Q2
Scopus Q Value
Q1
Volume
10