Research AI for Precision Medicine
Information Science Should Take a Lead in Future Biomedical Research
The Institute of Medical Science, The University of Tokyo, Tokyo 108-8639, Japan
Received: 2019-03-25; Revised: 2019-06-29; Accepted: 2019-07-22; Available online: 2019-09-20
In this commentary, I give my perspective on the relationship between artificial intelligence (AI)/data science and biomedicine, taking a long-range retrospective view. The development of modern biomedicine has repeatedly been accelerated by the emergence of new technologies. Since all living systems are fundamentally governed by the information in their own DNA, information science has special importance for the study of biomedicine. Unlike in physics, no (or very few) fundamental laws have been found in biology. Thus, in biology, the "data-to-knowledge" approach is important. AI has a long history of application to biomedicine, and the recent news that an AI-based approach achieved the best performance in an international protein structure prediction competition may be regarded as another landmark in the field. Similar approaches could contribute to solving problems in genome sequence interpretation, such as identifying cancer-driving mutations in patients' genomes. Recently, the explosive development of next-generation sequencing (NGS) has been producing massive amounts of data, and this trend will accelerate. NGS is used not only for "reading" DNA sequences, but also for obtaining various types of information at the single-cell level. These data can be regarded as analogous to the grid data points in a climate simulation. Both data science and AI will become essential for the integrative interpretation and simulation of these data, and will take a leading role in future precision medicine.