Deep Learning Based Malware Detection Tool Development for Android Operating System
Keywords: Android malware analysis, static analysis, dynamic analysis, hybrid analysis, deep learning
In today's technology age, smartphones have become indispensable to users for internet access, social media, banking transactions, and e-mail, as well as communication. Android is the most widely used operating system on smartphones and tablets, with a share of 85.4%. Such a popular and widely used platform has become a target for malware. Malicious software can cause both material and non-material damage to users.
In this study, malware targeting smartphones was detected using static, dynamic, and hybrid analysis methods. In the static analysis, features were extracted in nine categories: requested permissions, intents, Android components, Android application calls, used permissions, unused permissions, suspicious Android application calls, system commands, and internet addresses. The extracted features were reduced in dimension with principal component analysis and used as input to a deep neural network model. On the test data set, this model achieved 99.38% accuracy, a 99.36% F1 score, 99.32% precision, and 99.39% sensitivity.
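For readers less familiar with the four reported measures, the following minimal Python sketch shows how accuracy, precision, sensitivity (recall), and the F1 score are derived from the confusion-matrix counts of a binary malware detector. The function name and the example counts are hypothetical and are not taken from the study.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp: malware correctly flagged      fp: benign apps wrongly flagged
    tn: benign apps correctly passed   fn: malware that was missed
    """
    def ratio(numerator, denominator):
        # Return 0.0 instead of raising when a denominator is empty.
        return numerator / denominator if denominator else 0.0

    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)  # also called recall
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not the study's data).
metrics = detection_metrics(tp=90, fp=10, tn=85, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because F1 is the harmonic mean of precision and sensitivity, it stays close to the lower of the two, which is why it is usually reported alongside accuracy.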
In the dynamic analysis part of the study, applications were run on a virtual smartphone, and strategically important Android application calls were captured by hooking. Hybrid analysis was then performed by combining these dynamically obtained features with the static features of the same applications. On the test data set, the resulting model achieved 96.94% accuracy, a 96.78% F1 score, 96.99% precision, and 96.59% sensitivity.
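One plausible reading of the feature combination described above is sketched below in Python. It assumes that each application is identified by a key, that its static and dynamic features are already encoded as numeric vectors, and that combining them means simple concatenation. The function name, the application identifiers, and the vectors are all hypothetical; the study may combine features differently.

```python
def build_hybrid_vectors(static_features, dynamic_features):
    """Concatenate static and dynamic vectors for apps that have both.

    Both arguments map an application identifier to a feature vector.
    Apps without a dynamic trace are skipped, since a hybrid vector
    needs features from both analyses of the same application.
    """
    hybrid = {}
    for app_id, static_vec in static_features.items():
        dynamic_vec = dynamic_features.get(app_id)
        if dynamic_vec is None:
            continue  # no dynamic features were collected for this app
        hybrid[app_id] = list(static_vec) + list(dynamic_vec)
    return hybrid


# Hypothetical binary feature vectors for illustration only.
static_features = {
    "app_a.apk": [1, 0, 1],
    "app_b.apk": [0, 1, 1],
    "app_c.apk": [1, 1, 0],
}
dynamic_features = {
    "app_a.apk": [0, 1],
    "app_c.apk": [1, 1],
}

hybrid = build_hybrid_vectors(static_features, dynamic_features)
for app_id, vector in hybrid.items():
    print(app_id, vector)
```

Under this assumption, only applications that were analysed both statically and dynamically contribute hybrid vectors.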
Copyright (c) 2021 The Authors & LUMEN Publishing House
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
BRAIN. Broad Research in Artificial Intelligence and Neuroscience Journal has an Attribution-NonCommercial-NoDerivs