Frontiers of Information Technology & Electronic Engineering >> 2022, Volume 23, Issue 3 doi: 10.1631/FITEE.2000709

Automatic protocol reverse engineering for industrial control systems with dynamic taint analysis

Affiliation(s): State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China; Zhejiang University NGICS Platform, Hangzhou 310000, China

Received: 2020-12-19 Accepted: 2022-03-22 Available online: 2022-03-22

Abstract

Proprietary (or semi-proprietary) protocols are widely adopted in industrial control systems (ICSs). Inferring the protocol format by reverse engineering is important for many network security applications, e.g., program tests and intrusion detection. Conventional protocol reverse engineering methods are considered time-consuming, tedious, and error-prone. Automatic protocol reverse engineering methods have been proposed recently; however, those based on network traffic analysis are not effective in handling binary-based ICS protocols, and those based on protocol implementations are not accurate in extracting protocol fields. In this paper, we present a framework called the industrial control system protocol reverse engineering framework (ICSPRF) that aims to extract ICS protocol fields with high accuracy. ICSPRF is based on the key insight that an individual field in a message is typically handled in the same execution context, e.g., a basic block (BBL) group. As a result, by monitoring program execution, we can collect the tainted data information processed in every BBL group in the execution trace and cluster it to derive the protocol format. We evaluate our approach with six open-source ICS protocol implementations. The results show that ICSPRF can identify individual protocol fields with high accuracy (on average a 94.3% match ratio). ICSPRF also has low coarse-grained and overly fine-grained match ratios. On the same match-ratio metric, ICSPRF is more accurate than AutoFormat, which achieves 88.5% for all evaluated protocols and 80.0% for binary-based protocols.
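The key insight in the abstract (bytes of one field tend to be processed in the same execution context) can be illustrated with a small sketch. This is not the ICSPRF implementation: the trace format, the function name `infer_fields`, the context labels, and the merging rule (adjacent bytes touched by exactly the same set of contexts form one field) are all assumptions made for illustration, and the 12-byte example trace is invented.

```python
from collections import defaultdict


def infer_fields(trace, message_length):
    """Group message bytes into candidate fields.

    trace: iterable of (context_id, byte_offset) pairs, where context_id is a
           hypothetical label for an execution context (e.g. a BBL group) and
           byte_offset is the position of a tainted input byte it processed.
    Returns a list of (first_offset, last_offset) tuples, inclusive.
    """
    # Record, for every byte offset, the set of contexts that processed it.
    contexts = defaultdict(set)
    for context_id, offset in trace:
        contexts[offset].add(context_id)

    # Assumed clustering rule: merge adjacent offsets whose context sets match.
    fields = []
    start = 0
    for offset in range(1, message_length + 1):
        if offset == message_length or contexts[offset] != contexts[start]:
            fields.append((start, offset - 1))
            start = offset
    return fields


# Invented trace for a 12-byte message; "g1".."g7" are made-up context labels.
example_trace = [
    ("g1", 0), ("g1", 1),
    ("g2", 2), ("g2", 3),
    ("g3", 4), ("g3", 5),
    ("g4", 6),
    ("g5", 7),
    ("g6", 8), ("g6", 9),
    ("g7", 10), ("g7", 11),
]

print(infer_fields(example_trace, 12))
# [(0, 1), (2, 3), (4, 5), (6, 6), (7, 7), (8, 9), (10, 11)]
```

In a real setting the trace would come from a dynamic taint analysis tool instrumenting the protocol implementation, and the clustering would need to cope with bytes that are touched by several contexts or by none.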
