
Engineering, 2022, Volume 18, Issue 11. doi: 10.1016/j.eng.2021.03.023

Progress in Machine Translation

a Baidu Inc., Beijing 100193, China
b Baidu Research, Sunnyvale, CA 94089, USA

Received: 2020-11-15 Revised: 2021-01-30 Accepted: 2021-03-29 Published: 2021-07-14


Abstract

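After more than 70 years of development, machine translation (MT) has made great achievements. In recent years especially, the emergence of neural machine translation (NMT) has greatly improved translation quality. This paper first reviews the history of machine translation, from rule-based and example-based machine translation to statistical machine translation. It then describes in detail the progress of NMT, including its basic principles and the current mainstream model (Transformer), as well as multilingual translation. Next, it presents recent advances in machine simultaneous interpretation and discusses how to balance translation quality against latency. It then introduces the diverse product forms and applications of machine translation. Finally, it briefly discusses the challenges facing machine translation and future research directions.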


