Knowledge Enhanced Industrial Question-Answering Using Large Language Models

Ronghui Liu, Hao Ren, Haojie Ren, Wu Rui, Wei Cui, Xiaojun Liang, Chunhua Yang, Weihua Gui

Engineering · Research Article · DOI: 10.1016/j.eng.2025.07.035

Abstract

Modern industrial systems have grown increasingly extensive, complex, and hierarchical, and their operation relies on numerous knowledge-based queries. Answering these queries demands considerable human effort while requiring high levels of accuracy, objectivity, and consistency, all of which critically influence operational efficiency. To address these challenges, this study proposes an industrial retrieval-augmented generation (RAG) method that enhances large language models (LLMs) with domain-specific knowledge, thereby improving the precision of question answering. A comprehensive industrial knowledge base was constructed from diverse sources, including journal articles, theses, books, and patents. A text classification model based on bidirectional encoder representations from transformers (BERT) was trained to accurately classify incoming queries. Furthermore, the general text embedding-dense passage retrieval (GTE-DPR) model was employed to perform word embedding and vector similarity retrieval, aligning query vectors with relevant entries in the knowledge base to obtain initial responses. These initial results were then refined by LLMs to produce accurate final answers. Experimental evaluations confirm the effectiveness of the proposed approach. In particular, when applied to ChatGLM2-6B, the RAG method increased the ROUGE-L score from 32.52% to 55.04% and improved accuracy from 50.52% to 73.92%. Comparable improvements were observed with LLaMA2-7B, underscoring the RAG framework’s ability to significantly enhance the accuracy and relevance of industrial question-answering (QA) systems.
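The abstract outlines a three-stage pipeline: a BERT classifier routes each incoming query, a GTE-DPR retriever matches the query vector against the industrial knowledge base, and an LLM refines the retrieved passages into the final answer. The sketch below shows how such a classify-retrieve-generate pipeline could be assembled from off-the-shelf components; the model checkpoints, corpus passages, and the `generator` callable are illustrative assumptions, not the authors' implementation.

```python
# Minimal, illustrative sketch of the classify -> retrieve -> generate pipeline
# described in the abstract. All checkpoints, the corpus, and the `generator`
# callable are placeholder assumptions, not the authors' released artifacts.
import faiss                                    # vector similarity search
from sentence_transformers import SentenceTransformer
from transformers import pipeline

# 1) Query classification: a BERT-style classifier routes the query to a domain
#    (the paper fine-tunes BERT on industrial queries; this checkpoint is a stand-in).
classifier = pipeline("text-classification", model="bert-base-chinese")

# 2) Dense retrieval: a GTE-family embedding model encodes the knowledge base once.
embedder = SentenceTransformer("thenlper/gte-base")        # stand-in for GTE-DPR
corpus = [
    "Placeholder passage 1 from the industrial knowledge base.",
    "Placeholder passage 2 from the industrial knowledge base.",
]
corpus_vecs = embedder.encode(corpus, normalize_embeddings=True).astype("float32")
index = faiss.IndexFlatIP(corpus_vecs.shape[1])             # inner product on unit vectors = cosine
index.add(corpus_vecs)

def answer(query: str, generator) -> str:
    """Retrieve top-k passages and let an LLM (e.g., ChatGLM2-6B or LLaMA2-7B) refine them."""
    domain = classifier(query)[0]["label"]                   # could select a domain-specific sub-index
    q_vec = embedder.encode([query], normalize_embeddings=True).astype("float32")
    _, ids = index.search(q_vec, 2)                          # top-2 most similar passages
    context = "\n".join(corpus[i] for i in ids[0])
    prompt = (
        f"Domain: {domain}\n"
        f"Reference knowledge:\n{context}\n\n"
        f"Answer the question using the reference knowledge.\nQuestion: {query}\nAnswer:"
    )
    return generator(prompt)                                  # any text-generation callable
```

Normalizing the embeddings and using an inner-product index makes the similarity score equivalent to cosine similarity, which is the usual choice for dense passage retrieval of this kind.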

Keywords

Retrieval augmented generation / Knowledge enhancement / Question answering / Large language models / Industrial knowledge automation

Cite this article

Ronghui Liu, Hao Ren, Haojie Ren, Wu Rui, Wei Cui, Xiaojun Liang, Chunhua Yang, Weihua Gui. Knowledge Enhanced Industrial Question-Answering Using Large Language Models. Engineering. DOI: 10.1016/j.eng.2025.07.035


