Article

HinCTI: A Cyber Threat Intelligence Modeling and Identification System Based on Heterogeneous Information Network

Journal

IEEE Transactions on Knowledge and Data Engineering

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2020.2987019

Keywords

Semantics; IP networks; Data mining; Malware; Electronic mail; Computer security; Cyber threat intelligence; threat type identification; heterogeneous information network; graph convolutional network; threat infrastructure nodes

Funding

  1. NSFC-General Technology Fundamental Research Joint Fund [U1836215]
  2. BUPT Excellent Ph.D. Students Foundation [CX2018216]
  3. US National Science Foundation [III-1763325, III-1909323, CNS-1930941]
  4. National Key R&D Program of China [2018YFC0830804]

As cyber attacks become more complex, organizations are increasingly leveraging the exchange of cyber threat intelligence (CTI) to protect themselves. However, modeling CTI is challenging due to the relationships among CTI and the heterogeneity of cyber-threat infrastructure nodes. This study introduces HinCTI, a system that models CTI on a heterogeneous information network and utilizes graph convolutional networks for threat type identification of infrastructure nodes. The experiments demonstrate the effectiveness of the proposed approach in improving threat type identification compared to existing methods.
Cyber attacks have become increasingly complicated, persistent, organized, and weaponized. Faced with this situation, a rising number of organizations across the world are showing a growing willingness to leverage the open exchange of cyber threat intelligence (CTI) to obtain a full picture of the fast-evolving cyber threat situation and to protect themselves against cyber-attacks. However, modeling CTI is challenging because of the explicit and implicit relationships among CTI and the heterogeneity of the cyber-threat infrastructure nodes involved in CTI. Because few of these infrastructure nodes are labeled, automatically identifying their threat type for early warning is also challenging. To tackle these challenges, a practical system called HinCTI is developed for modeling cyber threat intelligence and identifying threat types. We first design a threat intelligence meta-schema to depict the semantic relatedness of infrastructure nodes. We then model cyber threat intelligence on a heterogeneous information network (HIN), which can integrate various types of infrastructure nodes and the rich relations among them. Next, we define a meta-path and meta-graph instances-based threat infrastructure similarity (MIIS) measure between threat infrastructure nodes and present a MIIS measure-based heterogeneous graph convolutional network (GCN) approach to identify the threat types of infrastructure nodes involved in CTI. Moreover, through a hierarchical regularization strategy, our model can alleviate overfitting and achieve good results in the threat type identification of infrastructure nodes. To the best of our knowledge, this work is the first to model CTI on a HIN for threat identification and to propose a heterogeneous GCN-based approach for threat type identification of infrastructure nodes. With HinCTI, comprehensive experiments are conducted on real-world datasets, and the experimental results demonstrate that our proposed approach can significantly improve the performance of threat type identification compared to existing state-of-the-art baseline methods. Our work can greatly relieve security analysts of heavy analysis work and help protect organizations efficiently against cyber-attacks.
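
The idea of a meta-path-based similarity feeding a graph-convolution-style aggregation can be illustrated with a minimal sketch. It is not the paper's MIIS measure or its heterogeneous GCN: the abstract does not give those definitions, so the sketch substitutes a PathSim-style normalized count over a single hypothetical Domain-IP-Domain meta-path, and a single similarity-weighted averaging step with no learned weights. The domain names, IP addresses, and feature vectors are invented for illustration.

```python
# Toy heterogeneous network (hypothetical data): each domain resolves to a set of IPs.
domain_ip = {
    "evil.example": {"10.0.0.1", "10.0.0.2"},
    "bad.example": {"10.0.0.1"},
    "ok.example": {"10.0.0.3"},
}


def path_count(a, b):
    """Count Domain-IP-Domain meta-path instances linking domains a and b."""
    return len(domain_ip[a] & domain_ip[b])


def similarity(a, b):
    """PathSim-style normalized similarity (a stand-in, not the paper's MIIS)."""
    denom = path_count(a, a) + path_count(b, b)
    return 2 * path_count(a, b) / denom if denom else 0.0


def propagate(features):
    """One similarity-weighted averaging step over all domains.

    This mimics the neighborhood aggregation of a GCN layer, but without
    learned weight matrices or a non-linearity (an assumption of this sketch).
    """
    nodes = list(features)
    out = {}
    for a in nodes:
        weights = {b: similarity(a, b) for b in nodes}
        total = sum(weights.values())
        if total == 0:
            # Isolated node: keep its features unchanged.
            out[a] = list(features[a])
            continue
        dim = len(features[a])
        out[a] = [
            sum(weights[b] * features[b][k] for b in nodes) / total
            for k in range(dim)
        ]
    return out


# Hypothetical two-dimensional features: [malicious score, benign score].
# "bad.example" starts unlabeled (all zeros).
features = {
    "evil.example": [1.0, 0.0],
    "bad.example": [0.0, 0.0],
    "ok.example": [0.0, 1.0],
}

smoothed = propagate(features)
for name, vector in smoothed.items():
    print(name, [round(v, 3) for v in vector])
```

In this toy setting the unlabeled domain shares an IP with the domain marked malicious, so the averaging step moves some malicious score onto it, while the unrelated domain is unaffected. A real system would combine several meta-paths and meta-graphs and learn the layer weights from labeled nodes.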
