Table of Links
2. Community Challenges Overview and 2.1 CCKS
2.2 CHIP and 2.3 CCIR, CSMI, CCL and DCIC
3. Evaluation Tasks Overview and 3.1 Information Extraction
3.2 Text Classification and Text Similarity
3.3 Knowledge Graph and Question Answering
3.4 Text Generation and Knowledge Reasoning and 3.5 Large Language Model Evaluation
4. Translational Informatics in Biomedical Text Mining
5. Discussion and Perspective
5.1. Contributions of Community Challenges
5.2. Limitations of Current Community Challenges
5.3. Future Perspectives in the Era of Large Language Models, and References
2.2 CHIP
CHIP, an organization specializing in the processing of health information in Chinese, has released multiple community challenge evaluation tasks focused on biomedical text mining each year (Table 2). In 2018, two tasks were released: medical entity recognition and attribute extraction, and health consultation question pair matching. In 2019, three tasks were released: clinical terminology standardization, disease question pair similarity calculation, and clinical trial eligibility criteria text classification [25, 61]. In 2020, six tasks were released [62, 63]. In terms of task types, literature-based question generation and COVID-19 trend prediction were newly added. In terms of data types, medication instructions and traditional Chinese medicine literature were involved in addition to electronic medical records. In 2021, three tasks were released. Notably, in 2021 CHIP also introduced the CBLUE benchmark [64], which encompassed 8 biomedical NLP tasks as benchmark subtasks. In 2022, five tasks were released [46-49, 65]. For the first time, CHIP included tasks related to optical character recognition and information extraction from paper-based medical documents [65]. Additionally, CBLUE released its second version, which comprised 18 subtasks. In 2023, six tasks were released, two of which were related to the evaluation of medical large language models. Of the two evaluation datasets, one originated from CBLUE [50], while the other was created manually.
2.3 CCIR, CSMI, CCL and DCIC
Some other community challenges have occasionally released biomedical text mining evaluation tasks (Table 3). For example, in 2019, CCIR introduced a task that required systems to return results given a medical event knowledge graph and a series of natural language questions. In 2020, CSMI launched a task for classifying public health questions into 6 categories. In 2021, CCL released an intelligent medical dialogue diagnosis and treatment task comprising 3 subtasks: the extraction of medical entities and symptom information from medical dialogue texts, the generation of structured medical reports, and the simulation of the dialogue process to determine specific diseases.
This paper is available on arXiv under the CC BY-NC-ND 4.0 DEED license.
Authors:
(1) Hui Zong, Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610041, China (contributed equally);
(2) Rongrong Wu, Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610041, China (contributed equally);
(3) Jiaxue Cha, Shanghai Key Laboratory of Signaling and Disease Research, Laboratory of Receptor-Based Bio-Medicine, Collaborative Innovation Center for Brain Science, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China;
(4) Erman Wu, Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610041, China;
(5) Jiakun Li, Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610041, China and Department of Urology, West China Hospital, Sichuan University, Chengdu, 610041, China;
(6) Liang Tao, Faculty of Business Information, Shanghai Business School, Shanghai, 201400, China;
(7) Zuofeng Li, Takeda Co. Ltd., Shanghai, 200040, China;
(8) Buzhou Tang, Department of Computer Science, Harbin Institute of Technology, Shenzhen, 518055, China;
(9) Bairong Shen, Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610041, China (corresponding author).