Many existing security systems rely on a kind of dictionary of known attacks. When criminals change their methods, these systems can no longer recognize the threat. Machine learning, in which algorithms learn to identify patterns in data, offers a more flexible way to respond to new and previously unseen variants.
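The limitation of a dictionary of known attacks can be illustrated with a minimal sketch. The signature strings and the `matches_known_signature` helper below are hypothetical and chosen only for illustration; real signature-based products are considerably more elaborate.

```python
# Hypothetical "dictionary" of known attack strings (illustrative only).
KNOWN_ATTACK_SIGNATURES = {
    "' OR '1'='1",
    "<script>alert(1)</script>",
}


def matches_known_signature(payload):
    """Return True if the payload contains any known attack string."""
    return any(signature in payload for signature in KNOWN_ATTACK_SIGNATURES)


known_attack = "name=' OR '1'='1"
changed_variant = "name=' OR '2'='2"  # same trick, slightly different text

print(matches_known_signature(known_attack))     # found in the dictionary
print(matches_known_signature(changed_variant))  # slips past the dictionary
```

A small change to the attack text is enough to evade an exact lookup, which is the gap a pattern-learning approach tries to close.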

Smart Algorithms as Digital Watchdogs
Etienne van de Bijl investigated whether machine learning could help detect denial-of-service (DoS) and distributed denial-of-service (DDoS) attacks at an early stage. He also investigated web-based attacks such as SQL injection (where a malicious actor attempts to access the database via input fields or forms) and cross-site scripting (where harmful code is inserted into a website and then unknowingly executed by a visitor).
Another widely used technique that Van de Bijl studied is the brute-force attack: automatically trying countless password combinations until one succeeds.
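To make the SQL injection described above concrete, here is a small sketch using Python's built-in `sqlite3` module. The table, the user names, and the `lookup_unsafe` and `lookup_safe` helpers are invented for this example and do not come from Van de Bijl's research.

```python
import sqlite3

# In-memory database with made-up example data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def lookup_unsafe(name):
    """Vulnerable: the form input is pasted directly into the SQL text."""
    query = f"SELECT secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def lookup_safe(name):
    """Safer: the input is passed as a parameter, never as SQL text."""
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()


malicious_input = "nobody' OR '1'='1"

print(len(lookup_unsafe(malicious_input)))  # every row leaks
print(len(lookup_safe(malicious_input)))    # no user has this literal name
```

In the unsafe version the input changes the meaning of the query itself, which is exactly the kind of request a detection system would want to flag.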
Recognizing new variants
His research shows that AI systems can sometimes recognize new types of attacks without having been explicitly trained on them. For example, models designed to detect brute-force attacks can also pick up specific variants of DDoS attacks. The reverse does not necessarily hold: a model that works well for one type of attack is not automatically suitable for another.
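One way to examine this kind of one-directional generalization is a cross-evaluation: train on one attack type and test on every other. The sketch below is a deliberately contrived toy, not a reproduction of the study. The single feature (connections per second), the hand-made numbers, and the simple threshold "model" are all assumptions made for illustration.

```python
from statistics import mean

# Toy, hand-made samples: (connections_per_second, label), 1 = attack.
# The numbers are invented for illustration only.
DATASETS = {
    "brute_force": [(1, 0), (2, 0), (3, 0), (20, 1), (25, 1), (30, 1)],
    "ddos": [(1, 0), (2, 0), (3, 0), (200, 1), (350, 1), (500, 1)],
}


def train_threshold(samples):
    """Place a threshold halfway between the mean normal and mean attack value."""
    normal = [x for x, y in samples if y == 0]
    attack = [x for x, y in samples if y == 1]
    return (mean(normal) + mean(attack)) / 2


def recall(threshold, samples):
    """Fraction of attack samples that exceed the threshold."""
    attacks = [x for x, y in samples if y == 1]
    return sum(x > threshold for x in attacks) / len(attacks)


def cross_evaluate(datasets):
    """Train on each attack type and measure recall on every attack type."""
    results = {}
    for train_name, train_samples in datasets.items():
        threshold = train_threshold(train_samples)
        for test_name, test_samples in datasets.items():
            results[(train_name, test_name)] = recall(threshold, test_samples)
    return results


RESULTS = cross_evaluate(DATASETS)
for (train_name, test_name), score in RESULTS.items():
    print(f"train={train_name:<11} test={test_name:<11} recall={score:.2f}")
```

In this toy setup the threshold learned from the lower-volume attack also catches the higher-volume one, while the threshold learned from the higher-volume attack sits too high to catch the other. Real traffic has many more features, so the asymmetry there will have subtler causes.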
The findings also revealed that more training data does not automatically mean better performance. In some circumstances, small, carefully curated datasets can provide better results than large, indiscriminate data collections. Therefore, the quality of the dataset is important for effective cyber defense, Van de Bijl concludes.
Learning with limited data
A common practical challenge is the lack of labeled data: it is often unclear which network activity is normal and which is suspicious. To address this, Van de Bijl developed Ultra, a method that combines two techniques: active learning and transfer learning. Active learning lets the system select the most informative examples for experts to label as either “attack” or “normal.” Transfer learning, on the other hand, leverages knowledge from another domain or a previous dataset to accelerate learning in a new system. Together, these techniques allow detection systems to produce meaningful results at an early stage, even with limited data.
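The general idea of combining the two techniques can be sketched in a few lines. This is not the Ultra method itself, whose details are not given here. It is a generic illustration that uses uncertainty sampling for the active learning step and a warm start from previously learned parameters for the transfer learning step. The starting weights, the features, and the `expert_label` function are all hypothetical.

```python
import math


def predict_proba(weights, bias, x):
    """Logistic model: estimated probability that flow x is an attack."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def update(weights, bias, x, y, lr=0.5):
    """One gradient step on the log loss for a single labeled example."""
    error = predict_proba(weights, bias, x) - y
    new_weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    return new_weights, bias - lr * error


def most_uncertain(weights, bias, pool):
    """Active learning step: pick the flow whose prediction is closest to 0.5."""
    return min(pool, key=lambda x: abs(predict_proba(weights, bias, x) - 0.5))


def expert_label(x):
    """Hypothetical stand-in for the human expert (1 = attack, 0 = normal)."""
    return 1 if x[0] >= 0.4 else 0


# Transfer learning step: start from parameters learned elsewhere.
# These values are made up; in practice they would come from a source dataset.
weights, bias = [2.0, 0.5], -1.0

# Unlabeled flows from the new network, as hypothetical scaled features
# (failed-login rate, request rate).
pool = [(0.1, 0.1), (0.9, 0.2), (0.5, 0.1), (0.2, 0.9), (0.45, 0.4)]

queried = []
for _ in range(3):  # labeling budget: three questions to the expert
    x = most_uncertain(weights, bias, pool)
    pool.remove(x)
    y = expert_label(x)
    weights, bias = update(weights, bias, x, y)
    queried.append((x, y))

print(queried)
```

The expert is only asked about the flows the current model is least sure of, so each label is spent where it is likely to change the model most, and the warm start means the model does not begin from scratch.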
A valuable instrument
In his thesis, Van de Bijl concludes that machine learning can be a valuable tool for detecting cyberattacks. At the same time, challenges remain. Better algorithms are needed: ones that are rigorously tested and whose workings are transparent and explainable. Only then can AI become a reliable part of our digital defense.
About the thesis
Title: From baselines to breakthroughs: the fundamentals and applications of machine learning in cybersecurity
PhD supervisors: Rob van der Mei (CWI/VU) and Sandjai Bhulai (VU)
