Machine Learning Against Malwares: Analysis On Windows Systems Based in Pattern Recognition Algorithms

Proceedings of The 3rd International Conference on Research in Science, Engineering and Technology

Year: 2021

DOI: https://www.doi.org/10.33422/3rd.icrset.2021.03.70

Victor Picinin Veloso Senna, Gustavo Alves Fernandes, Antônio Ricardo Leocádio Gomes, Luiz Melk de Carvalho, Flávio Henrique Batista de Souza

 

ABSTRACT: 

In 2020, the COVID-19 pandemic drove a considerable increase in the use of digital resources, at a scale unprecedented for providers and developers of digital services. This growth is accompanied by the problem of securing systems and identifying malware, which spreads at a speed never seen before. The objective of this paper is therefore to present an experimental evaluation of four pattern recognition algorithms (Multilayer Perceptron, or MLP; Naive Bayes; K-Nearest Neighbor; and Decision Tree) for recognizing malware patterns in Portable Executable (PE) file structures. The experimental methodology comprised three steps. First, samples were collected: around 285 malware samples for Windows®, from 6 different families and dated 2012 to 2019, were taken from the VirusShare repository, together with PE files (initially 549, randomly selected from “System32” and “Program Files”), totaling 834 samples. Second, 2,500 features were defined to compose the analysis dataset. Third, the test methodology was defined for each pattern recognition algorithm (e.g., for MLP, the number of hidden layers and the number of neurons in those layers were varied). After 100 executions of each algorithm configuration, accuracies ranging from 64.5% (Naive Bayes) to 95% (MLP) were obtained.

Keywords: Malwares, Machine Learning, Pattern Recognition, Windows® Systems, Detection Methods.
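
The evaluation protocol summarized in the abstract (repeated executions of a classifier configuration, with accuracy reported across runs) can be illustrated with a minimal sketch. The abstract does not specify the train/test split, the feature extraction, or the hyperparameters, so everything below is an assumption made for illustration: the data are synthetic Gaussian points rather than real PE features, only the class sizes (285 and 549) come from the abstract, the 30% hold-out split and k = 5 are arbitrary choices, and the function names are hypothetical. K-Nearest Neighbor is used because it is one of the four algorithms named and is short to write with the standard library alone.

```python
import random
import statistics


def make_synthetic(n_malware=285, n_benign=549, n_features=8, seed=0):
    """Build a synthetic stand-in dataset. These are NOT real PE features;
    only the two class sizes are taken from the abstract."""
    rng = random.Random(seed)
    data = []
    for label, count, center in ((1, n_malware, 1.0), (0, n_benign, 0.0)):
        for _ in range(count):
            features = [rng.gauss(center, 1.0) for _ in range(n_features)]
            data.append((features, label))
    return data


def knn_predict(train, x, k=5):
    """Majority vote among the k nearest training samples
    (squared Euclidean distance); label 1 = malware, 0 = benign."""
    nearest = sorted(
        train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x))
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def repeated_holdout(data, runs=10, test_fraction=0.3, k=5, seed=1):
    """Repeat a random hold-out split `runs` times and collect accuracies.
    The split scheme is an assumption; the abstract only states that each
    configuration was executed 100 times."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(runs):
        shuffled = data[:]
        rng.shuffle(shuffled)
        n_test = int(len(shuffled) * test_fraction)
        test, train = shuffled[:n_test], shuffled[n_test:]
        correct = sum(knn_predict(train, x, k) == y for x, y in test)
        accuracies.append(correct / n_test)
    return accuracies


data = make_synthetic()
accuracies = repeated_holdout(data, runs=5)
print(
    f"samples={len(data)} "
    f"mean={statistics.mean(accuracies):.3f} "
    f"min={min(accuracies):.3f} max={max(accuracies):.3f}"
)
```

Setting `runs=100` would match the number of executions reported in the abstract, at the cost of a longer run time. The accuracies printed here reflect only the synthetic data and say nothing about the paper's results.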