Nguyen, Nhu T., Pham, Thuy T., Dang, Tien X., Dao, Minh-Son, Dang-Nguyen, Duc-Tien ORCID: 0000-0002-2761-2213, Gurrin, Cathal ORCID: 0000-0003-2903-3968 and Nguyen, Binh T. (2020) Malware detection using system logs. In: 2020 Intelligent Cross-Data Analysis and Retrieval Workshop (ICDAR'20), 26 Oct 2020, Dublin, Ireland. ISBN 978-1-4503-7087-5
Abstract
Malware detection is one of the most critical features in many real-world applications, especially on mobile platforms and in Internet of Things (IoT) technology. With the proliferation of mobile devices and their associated app stores, the volume of new applications is growing extremely fast, which calls for a better way to analyze all possible malicious behaviors. In this paper, we investigate the malware prediction problem using system log files that contain a number of sequences of system calls recorded from IoT devices. We construct a suitable multi-class classification model using a combination of hand-crafted features, including Bag-of-Ngrams, TF-IDF, and statistical metrics computed from the consecutive repeated system calls in each log file. We also consider different machine learning models, including Random Forest, Support Vector Machines, and Extreme Gradient Boosting, and measure the performance of each method in terms of precision, recall, and F1-score. The experimental results show that combining different features and using the Extreme Gradient Boosting technique achieves promising performance on the dataset provided by the organizers of the CMDC 2019 competition.
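As an illustration only, two of the hand-crafted feature families named in the abstract can be sketched in a few lines of Python. The function names, the sample log, and the particular run-length statistics chosen here are assumptions for the sake of the example; the abstract does not specify which statistics the authors compute or how they tokenize the log files.

```python
from collections import Counter
from itertools import groupby
from statistics import mean


def ngram_counts(calls, n=2):
    """Count contiguous n-grams of system calls (a bag-of-n-grams as a Counter)."""
    return Counter(tuple(calls[i:i + n]) for i in range(len(calls) - n + 1))


def repeat_run_stats(calls):
    """Summarise runs of consecutive identical system calls.

    The three statistics below are illustrative choices, not the paper's list.
    """
    runs = [len(list(group)) for _, group in groupby(calls)]
    if not runs:
        return {"num_runs": 0, "max_run": 0, "mean_run": 0.0}
    return {"num_runs": len(runs), "max_run": max(runs), "mean_run": mean(runs)}


# Hypothetical system-call sequence standing in for one log file.
log = ["open", "read", "read", "read", "write", "close"]
bigrams = ngram_counts(log, n=2)
stats = repeat_run_stats(log)
```

In a full pipeline, vectors like these would typically be concatenated with TF-IDF weights and passed to a classifier such as Random Forest, an SVM, or gradient boosting; that step is omitted here because the abstract gives no configuration details.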
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Workshop |
Refereed: | Yes |
Uncontrolled Keywords: | malware detection; SVMs; IoT; XGBoost; random forest |
Subjects: | Computer Science > Computer security; Computer Science > Software engineering |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT |
Published in: | Proceedings of the 2020 Intelligent Cross-Data Analysis and Retrieval Workshop (ICDAR ’20). Association for Computing Machinery (ACM). ISBN 978-1-4503-7087-5 |
Publisher: | Association for Computing Machinery (ACM) |
Official URL: | https://doi.org/10.1145/3379174.3392318 |
Copyright Information: | © 2020 ACM |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
Funders: | Science Foundation Ireland under grant number SFI/13/RC/2106, L. Meltzers Høyskolefonds, UiB 2019/2259-NILSO |
ID Code: | 24668 |
Deposited On: | 22 Jun 2020 15:47 by Cathal Gurrin. Last Modified 15 Dec 2021 15:40 |
Documents
Full text available as: PDF (2MB)