The Role of Big Data and Machine Learning in COVID-19

Authors

  • Mustafa Abdel-Karim Ababneh Faculty of Computer Information System, Near East University, Mersin 10, Turkey
  • Aayat Amin Al-Jarrah Faculty of Computer Information System, Near East University, Mersin 10, Turkey
  • Damla Karagozlu Faculty of Computer Information System, Near East University, Mersin 10, Turkey

DOI:

https://doi.org/10.18662/brain/11.2Sup1/89

Keywords:

artificial intelligence, machine learning, COVID-19, big data

Abstract

The rapid growth of digital data has created many opportunities, especially for corporations, institutions and firms. It also makes it possible to collect data relating to any field or area, and countries have benefited greatly from the analysis of big data (BD) in the face of epidemics and diseases, especially COVID-19, since BD is now available everywhere, including in official reports and scientific studies related to virology and epidemiology. The general aim of this study is to clarify how the combination of BD and machine learning (ML) has changed data science and strongly influenced applications in many fields, chiefly those related to COVID-19. The method used in this study is the ‘relevance tree’, applied by identifying papers related to ML and BD, especially in the context of COVID-19. The results show that the use of reinforcement learning in analyzing BD is highly effective, although it faces many challenges and restrictions, which are explained in detail in this study. In addition, the results show that during the pandemic most countries turned into smart cities, fully dependent on smart applications based on the analysis of BD using ML; one of the most important applications adopted around the world is the global positioning system. The results also indicate that data privacy is one of the most important challenges facing data analysis. Consequently, the study recommends that future researchers focus on the challenges faced by ML in analyzing medical data in the COVID-19 era.

Author Biographies

Mustafa Abdel-Karim Ababneh, Faculty of Computer Information System, Near East University, Mersin 10, Turkey

Mustafa Ababneh received his B.Sc. in Computer Information System at Jordan University of Science and Technology and an M.Sc. in Computer Science at Amman Arab University in Amman, Jordan. He is now a Ph.D. student in Computer Information System at Near East University in Mersin 10, Turkey. His research interests focus on Big Data, Social Media, cloud computing, and Information Retrieval. Email: 20194017@std.neu.edu.tr

Aayat Amin Al-Jarrah, Faculty of Computer Information System, Near East University, Mersin 10, Turkey

Ayat Al-Jarrah received her B.Sc. in Biomedical Engineering at Yarmouk University and an M.Sc. in Computer Science at Amman Arab University in Amman, Jordan. She is now a Ph.D. student in Computer Information System at Near East University in Mersin 10, Turkey. Her research interests focus on Big Data, Social Media, cloud computing, and Information Retrieval. Email: 20194007@std.neu.edu.tr

Damla Karagozlu, Faculty of Computer Information System, Near East University, Mersin 10, Turkey

Assist. Professor Dr. Damla Karagozlu received her M.Sc. in Information Technology at Bournemouth University and her Ph.D. in Computer Education and Instructional Technology at Near East University. She is currently an assistant professor at the Department of Computer Information Systems at Near East University in Mersin 10, Turkey. Her research interests focus on augmented reality, cloud computing and cybersecurity.

Published

2020-09-04

How to Cite

Ababneh, M. A.-K., Al-Jarrah, A. A., & Karagozlu, D. (2020). The Role of Big Data and Machine Learning in COVID-19. BRAIN. Broad Research in Artificial Intelligence and Neuroscience, 11(2Sup1), 01-20. https://doi.org/10.18662/brain/11.2Sup1/89