
Frontiers of Information Technology & Electronic Engineering, 2017, Volume 18, Issue 12, doi: 10.1631/FITEE.1700626

ONFS: a hierarchical hybrid file system based on memory, SSD, and HDD for high performance computers

School of Computer, National University of Defense Technology, Changsha 410073, China; Computer Science and Engineering, University of Nebraska-Lincoln, Lincoln 68588, USA; National Supercomputing Center, Tianjin 300457, China.

Available online: 2018-03-08


Abstract

With supercomputers developing towards exascale, the number of compute cores increases dramatically, making more complex and larger-scale applications possible. The input/output (I/O) requirements of large-scale applications, workflow applications, and their checkpointing include substantial bandwidth and extremely low latency, posing a serious challenge to high performance computing (HPC) storage systems. Current underlying storage systems based on hard disk drives (HDDs) are increasingly unable to meet the requirements of next-generation exascale supercomputers. To rise to the challenge, we propose a hierarchical hybrid storage system, the on-line and near-line file system (ONFS). It leverages dynamic random access memory (DRAM) and solid state drives (SSDs) in compute nodes, and HDDs in storage servers, to build a three-level storage system in a unified namespace. It supports portable operating system interface (POSIX) semantics, and provides high bandwidth, low latency, and huge storage capacity. In this paper, we present the technical details of distributed metadata management, the strategy for borrowing and returning memory, data consistency, parallel access control, and the mechanisms guiding downward and upward migration in ONFS. We implement an ONFS prototype on the TH-1A supercomputer, and conduct experiments to test its I/O performance and scalability. The results show that the single-thread and multi-thread ‘read’/‘write’ bandwidths are 6-fold and 5-fold better than those of HDD-based Lustre, respectively. The I/O bandwidth of data-intensive applications in ONFS can be 6.35 times that in Lustre.
