Frontiers of Information Technology & Electronic Engineering >> 2022, Volume 23, Issue 3 doi: 10.1631/FITEE.2000436

Generic, efficient, and effective deobfuscation and semantic-aware attack detection for PowerShell scripts

Affiliation(s): College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China; Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60208, USA; College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China; Magic Shield Co., Ltd., Hangzhou 310027, China; College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China

Received: 2020-08-28 Accepted: 2022-03-22 Available online: 2022-03-22

Abstract

In recent years, PowerShell has increasingly been reported in a variety of cyber attacks. However, because the PowerShell language is dynamic by design and can construct script fragments at different levels, state-of-the-art attack detection approaches based on static analysis are inherently vulnerable to obfuscation. In this paper, we design the first generic, effective, and lightweight deobfuscation approach for PowerShell scripts. To precisely identify the obfuscated script fragments, we define obfuscation based on the differences in its impact on the abstract syntax trees of PowerShell scripts and propose a novel emulation-based recovery technology. Furthermore, we design the first semantic-aware PowerShell attack detection system, which leverages the classic objective-oriented association mining algorithm and newly identifies 31 semantic signatures. Experimental results on 2342 benign samples and 4141 malicious samples show that our deobfuscation method takes less than 0.5 s on average and increases the similarity between the obfuscated and original scripts from 0.5% to 93.2%. With our deobfuscation method deployed, the attack detection rates of Windows Defender and VirusTotal increase substantially, from 0.33% and 2.65% to 78.9% and 94.0%, respectively. Moreover, our detection system outperforms both existing tools, with a 96.7% true positive rate and a 0% false positive rate on average.
