
Strategic Study of Chinese Academy of Engineering, 2022, Volume 24, Issue 4. doi: 10.15302/J-SSCAE-2022.04.014

Next-Generation Database Benchmark for Financial Scenarios

1. Institute of Financial Technology, Fudan University, Shanghai 200433, China;

2. School of Computer Science, Fudan University, Shanghai 200433, China

Received: 2022-06-19; Revised: 2022-07-14; Available online: 2022-08-04


Abstract

Banks, the major financial entities in China, place high performance and security requirements on databases and data service solutions. As data application services in banking advance, data types and business scenarios grow more diverse, and users find it difficult to choose well among the wide range of database products and data service solutions. Drawing on the data application demands of the financial industry, this study uses literature research and theoretical analysis to examine the current status of database applications in banking, particularly the status and challenges of database localization in recent years. We also systematically survey database benchmarks in China and other countries, and discuss the necessity and importance of constructing next-generation database benchmarks for financial scenarios. We find that current database benchmarks have many deficiencies and face various challenges in testing databases for financial scenarios, owing to complex business logic, diverse data patterns, and high security requirements. To build a next-generation database benchmark that meets the requirements of financial scenarios, we therefore propose several suggestions that address these challenges in terms of workloads, data schemes, metrics, and technical architecture.
