
Frontiers of Information Technology & Electronic Engineering, 2017, Volume 18, Issue 9, doi: 10.1631/FITEE.1601325

Automatic malware classification and new malware detection using machine learning

College of Computer, National University of Defense Technology, Changsha 410073, China

Available online: 2018-01-18


Abstract

The explosive growth of malware variants poses a major threat to information security. Traditional signature-based anti-virus systems fail to classify unknown malware into their corresponding families and to detect new kinds of malware programs. Therefore, we propose a machine learning based malware analysis system, which is composed of three modules: data processing, decision making, and new malware detection. The data processing module deals with gray-scale images, Opcode n-gram, and import functions, which are employed to extract the features of the malware. The decision-making module uses the features to classify the malware and to identify suspicious malware. Finally, the detection module uses the shared nearest neighbor (SNN) clustering algorithm to discover new malware families. Our approach is evaluated on more than 20 000 malware instances, which were collected by Kingsoft, ESET NOD32, and Anubis. The results show that our system can effectively classify the unknown malware with a best accuracy of 98.9%, and can successfully detect 86.7% of the new malware.
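
To make the feature types named in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that an Opcode n-gram feature is a count of contiguous opcode subsequences, that a gray-scale image maps each byte of a binary to one pixel intensity, and that SNN similarity is the number of neighbors two samples share in their k-nearest-neighbor lists. The function names, the default values of `n` and `width`, and the handling of a trailing partial row are all hypothetical choices made for illustration.

```python
from collections import Counter


def opcode_ngrams(opcodes, n=2):
    """Count contiguous n-grams in a sequence of opcode mnemonics.

    Assumption: the abstract does not state n; n=2 is an arbitrary default.
    """
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


def bytes_to_grayscale(data, width=4):
    """Arrange raw bytes into rows of `width` pixels (intensity 0-255).

    Assumption: a trailing partial row is dropped; the paper may pad instead.
    """
    rows = len(data) // width
    return [list(data[r * width:(r + 1) * width]) for r in range(rows)]


def snn_similarity(neighbors_a, neighbors_b):
    """Shared nearest neighbor similarity: the number of sample ids that
    appear in both k-nearest-neighbor lists."""
    return len(set(neighbors_a) & set(neighbors_b))
```

For example, `opcode_ngrams(["push", "mov", "push", "mov", "call"])` counts the bigram `("push", "mov")` twice, and two samples whose neighbor lists are `[1, 2, 3]` and `[2, 3, 4]` have an SNN similarity of 2. A clustering step would then group samples whose SNN similarity exceeds some threshold, which is one plausible way samples from an unseen family could end up in a cluster of their own.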

Related Research