A BERT-based ensemble learning approach for the BioCreative VII challenges: full-text chemical identification and multi-label classification in PubMed articles

Sheng Jie Lin, Wen Chao Yeh, Yu Wen Chiu, Yung Chun Chang, Min Huei Hsu, Yi Shin Chen, Wen Lian Hsu

Research output: Contribution to journal, Article, peer-reviewed

Abstract

In this research, we explored various state-of-the-art biomedical-specific pre-trained Bidirectional Encoder Representations from Transformers (BERT) models for the National Library of Medicine - Chemistry (NLM-CHEM) and LitCovid tracks in the BioCreative VII Challenge, and proposed a BERT-based ensemble learning approach that integrates the advantages of the various models to improve system performance. The experimental results on the NLM-CHEM track demonstrate that our method achieves strong performance, with F1-scores of 85% and 91.8% in strict and approximate evaluations, respectively. Moreover, the proposed Medical Subject Headings identifier (MeSH ID) normalization algorithm is effective in entity normalization, achieving an F1-score of about 80% in both strict and approximate evaluations. For the LitCovid track, the proposed method is also effective in detecting topics in the Coronavirus disease 2019 (COVID-19) literature, outperforming the compared methods and achieving state-of-the-art performance on the LitCovid corpus.

Original language: English
Article number: 056
Journal: Database
2022
Publication status: Published - 2022

ASJC Scopus subject areas

  • Information Systems
  • Biochemistry, Genetics and Molecular Biology (all)
  • Agricultural and Biological Sciences (all)
