Host based machine learning approach to detect data exfiltration on computer networks

Samanwickrama, WD; Wickramasinghe, Indika; Perera, LDRD

WUSL Repo Home
→
1. WUSL Publications
→
a). Proceedings
→
Wayamba University Research Congress (WURC)
→
View Item

dc.contributor.author	Samanwickrama, WD
dc.contributor.author	Wickramasinghe, Indika
dc.contributor.author	Perera, LDRD
dc.date.accessioned	2026-05-22T08:43:12Z
dc.date.available	2026-05-22T08:43:12Z
dc.date.issued	2018
dc.identifier.uri	http://repository.wyb.ac.lk/handle/1/3608
dc.description.abstract	Data exfiltration, unauthorized transfer of sensitive information, is a most common threat on computer networks. lt uses legitimate communication channels such as hypertext transfer protocol (HTTP) and hypertext transfer protocol secure (HTTPS). Differentiating the data exfiltration from normal network traffic is a challenge due to the regular exchange of data back-and-forth between networks, using traditional security solutions (e.g. firewalls). Due to its ability to adapt quickly to new and unknown complex situations, Machine learning (ML) offers a promise in this context that can be employed for resolving such challenges. ln this study, we adopted an anomaly detection method to address the data exfiltration and examined the effectiveness of this approach under different network environments. An anomaly is an observation that deviates so much from other observations as to arouse suspicion that it was generated by a different mechanism is the underlying hypothesis here. Fufther, in the security domain, evaluation of novel algorithms against production network data is difficult. This is mainly due to the legal, security and privacy issues. Researchers often use simulation/emulation methods or some benchmark datasets to validate their novel algorithms. Thus, in this study, simulated number of different attacks on a network testbed and captured data for validation purpose. The advance persistent threat (APT) attacks are highly targeted in nature and often discovered years after the information has been stolen. As a result it's difficult to derive signatures to detect these attacks using signature based intrusion detection systems (SBlDSs) or to obtain labeled data to train supervised machine learning algorithms. Therefore our work employs unsupervised anomaly detection technique with Local Outlier Factor (LOF). Experimental resulg are encouraging that the proposed method successfully isolates (FiS. 1) exfiltrated data flows (e.g. Remote Administration Tool traffic) from the rest of legitimate web traffic even when traffic is encrypted and uses the same channel (HTTPS). However, in order to generalize these findings, extensive validation mechanism is needed in future work.	en_US
dc.publisher	Wayamba University Research Congress	en_US
dc.subject	Anomaly detection	en_US
dc.subject	Data exfiltration	en_US
dc.subject	Machine learning	en_US
dc.subject	Network security	en_US
dc.subject	Remote administration	en_US
dc.title	Host based machine learning approach to detect data exfiltration on computer networks	en_US
dc.type	Article	en_US