| dc.description.abstract |
Data exfiltration, the unauthorized transfer of sensitive information, is one of the most common
threats on computer networks. It uses legitimate communication channels such as
hypertext transfer protocol (HTTP) and hypertext transfer protocol secure (HTTPS).
Differentiating data exfiltration from normal network traffic with traditional security
solutions (e.g. firewalls) is a challenge because data is regularly exchanged
back and forth between networks. Because it can adapt quickly to new and unknown
complex situations, machine learning (ML) shows promise for resolving such
challenges. In this study, we adopted an anomaly
detection method to address data exfiltration and examined the effectiveness of this
approach under different network environments. The underlying hypothesis is that an
anomaly is an observation that deviates so much from other observations as to arouse
suspicion that it was generated by a different mechanism.
Further, in the security domain, evaluating novel algorithms against production
network data is difficult, mainly due to legal, security and privacy issues.
Researchers often use simulation/emulation methods or benchmark datasets to
validate their novel algorithms. Thus, in this study, we simulated a number of different
attacks on a network testbed and captured data for validation purposes. Advanced
persistent threat (APT) attacks are highly targeted in nature and are often discovered years
after the information has been stolen. As a result, it is difficult to derive signatures to
detect these attacks using signature-based intrusion detection systems (SBIDSs) or to
obtain labeled data to train supervised machine learning algorithms. Therefore, our
work employs an unsupervised anomaly detection technique based on the Local Outlier Factor
(LOF).
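
As an illustration of the LOF idea only, and not the implementation used in this study, the following minimal sketch scores each point by comparing its local density with that of its k nearest neighbours. The function name `local_outlier_factor`, the choice of k, and the two-dimensional flow features are all hypothetical assumptions; the sketch also simplifies the standard definition by using exactly k neighbours even when distances tie.

```python
import math


def local_outlier_factor(points, k=3):
    """Return one LOF score per point; scores well above 1 suggest outliers.

    Simplified sketch: uses exactly k neighbours per point (ties are cut
    arbitrarily) and Euclidean distance on raw, unscaled features.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")

    # Pairwise Euclidean distances.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # k nearest neighbours of each point and the distance to the k-th one.
    neighbours, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach_dist(i, j) = max(k_distance[j], dist[i][j]).
    lrd = []
    for i in range(n):
        mean_reach = sum(max(k_distance[j], dist[i][j]) for j in neighbours[i]) / k
        lrd.append(1.0 / max(mean_reach, 1e-12))  # guard against duplicate points

    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]


if __name__ == "__main__":
    # Hypothetical per-flow features (illustrative values, not data from the study).
    flows = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.2, 1.0),
             (1.0, 1.2), (0.8, 0.9), (8.0, 9.0)]
    for flow, score in zip(flows, local_outlier_factor(flows, k=3)):
        print(flow, round(score, 2))
```

Because no labels are required, a score like this can in principle be computed on captured flows directly; how features are extracted and which threshold separates suspicious flows are design choices that this sketch does not settle.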
Experimental results are encouraging: the proposed method successfully isolates
(Fig. 1) exfiltrated data flows (e.g. Remote Administration Tool traffic) from the rest of
the legitimate web traffic, even when the traffic is encrypted and uses the same channel
(HTTPS). However, generalizing these findings will require more extensive validation
in future work. |
en_US |