Deep Learning Based Malware Detection Tool Development for Android Operating System

Authors

  • Mahmut Tokmak Isparta University of Applied Sciences, Gelendost Vocational School, Isparta, Turkey
  • Ecir Uğur Küçüksille Suleyman Demirel University, Department of Computer Engineering, Isparta, Turkey
  • Utku Köse Suleyman Demirel University, Department of Computer Engineering, Isparta, Turkey

DOI:

https://doi.org/10.18662/brain/12.4/237

Keywords:

Android malware analysis, static analysis, dynamic analysis, hybrid analysis, deep learning

Abstract

In today's technology age, smartphones have become indispensable to users in many areas, including internet access, social media, banking, e-mail, and communication. Android is the most popular operating system, with a usage rate of 85.4% on smartphones and tablets. Such a popular and widely used platform has become a target for malware. Malicious software can cause both material and moral damage to users.

In this study, malware targeting smartphones was detected using static, dynamic, and hybrid analysis methods. In the static analysis, features were extracted in 9 categories: requested permissions, intents, Android components, Android application calls, used permissions, unused permissions, suspicious Android application calls, system commands, and internet addresses. The extracted features were reduced in dimension with principal component analysis and used as input to a deep neural network model. On the test data set, this model achieved 99.38% accuracy, a 99.36% F1 score, 99.32% precision, and 99.39% sensitivity.
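As a reading aid, the four reported metrics can be related to the counts of a binary confusion matrix. The following is a minimal sketch, not the authors' evaluation code: the function name `binary_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, sensitivity (recall), and F1.

    tp/fp/tn/fn are confusion-matrix counts with "malware" as the
    positive class. Empty denominators yield 0.0 instead of an error.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not results from the paper).
metrics = binary_metrics(tp=90, fp=10, tn=85, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Sensitivity here is the same quantity that is often called recall: the share of actual malware samples that the model flags.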

In the dynamic analysis part of the study, applications were run on a virtual smartphone, and strategically important Android application calls were captured by hooking. Hybrid analysis was then performed by combining these dynamically obtained features with the static features of the same applications. On the test data set, the resulting model achieved 96.94% accuracy, a 96.78% F1 score, 96.99% precision, and 96.59% sensitivity.
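One plausible way to combine static and dynamic features for the same applications is to concatenate two binary feature vectors per application. The sketch below assumes that representation; the abstract does not specify the encoding, and the function name, application identifiers, and feature labels are all hypothetical.

```python
def build_hybrid_vectors(static_feats, dynamic_feats):
    """Concatenate binary static and dynamic feature vectors per app.

    Both arguments map an app identifier to a set of feature names.
    Only apps present in both mappings are kept, since hybrid analysis
    needs both views of the same application.
    """
    shared_apps = sorted(set(static_feats) & set(dynamic_feats))
    # Fixed, sorted vocabularies make the vector positions reproducible.
    static_vocab = sorted({f for a in shared_apps for f in static_feats[a]})
    dynamic_vocab = sorted({f for a in shared_apps for f in dynamic_feats[a]})
    vectors = {}
    for app in shared_apps:
        static_part = [1 if f in static_feats[app] else 0 for f in static_vocab]
        dynamic_part = [1 if f in dynamic_feats[app] else 0 for f in dynamic_vocab]
        vectors[app] = static_part + dynamic_part
    return static_vocab, dynamic_vocab, vectors


# Hypothetical feature sets for illustration only.
static_feats = {
    "app_a": {"perm:SEND_SMS", "intent:BOOT_COMPLETED"},
    "app_b": {"perm:INTERNET"},
    "app_c": {"perm:CAMERA"},  # no dynamic trace, so it is dropped
}
dynamic_feats = {
    "app_a": {"call:sendTextMessage"},
    "app_b": {"call:getDeviceId", "call:sendTextMessage"},
}

static_vocab, dynamic_vocab, hybrid = build_hybrid_vectors(static_feats, dynamic_feats)
print(static_vocab + dynamic_vocab)
print(hybrid)
```

The combined vectors could then go through the same dimension reduction and classification steps as the static features alone.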

Published

2021-12-20

How to Cite

Tokmak, M., Küçüksille, E. U., & Köse, U. (2021). Deep Learning Based Malware Detection Tool Development for Android Operating System. BRAIN. Broad Research in Artificial Intelligence and Neuroscience, 12(4), 28-56. https://doi.org/10.18662/brain/12.4/237
