
Dataset Information


A comparative study of pretrained language models for long clinical text.


ABSTRACT:

Objective

Clinical knowledge-enriched transformer models (eg, ClinicalBERT) have achieved state-of-the-art results on clinical natural language processing (NLP) tasks. A core limitation of these transformer models is the substantial memory consumption of their full self-attention mechanism, which leads to performance degradation on long clinical texts. To overcome this, we propose to leverage long-sequence transformer models (eg, Longformer and BigBird), which extend the maximum input sequence length from 512 to 4096 tokens, to enhance the ability to model long-term dependencies in long clinical texts.
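
As a rough illustration of the sequence-length gap described above (not taken from the paper), the sketch below tokenizes the same long note with a standard BERT-style tokenizer, which caps input at 512 tokens, and with a Longformer tokenizer configured for 4096 tokens. The note text is a placeholder; the bert-base-uncased and allenai/longformer-base-4096 checkpoints are public Hugging Face models used only for illustration.

```python
# Sketch: contrast the 512-token limit of short-sequence transformers
# with the 4096-token window of a long-sequence transformer.
from transformers import AutoTokenizer

long_note = "Patient admitted with progressive dyspnea and fever. " * 400  # placeholder long clinical note

bert_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
long_tok = AutoTokenizer.from_pretrained("allenai/longformer-base-4096")

bert_ids = bert_tok(long_note, truncation=True, max_length=512)["input_ids"]
long_ids = long_tok(long_note, truncation=True, max_length=4096)["input_ids"]

print(len(bert_ids))  # capped at 512 tokens; the rest of the note is discarded
print(len(long_ids))  # up to 4096 tokens of the same note are retained
```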

Materials and methods

Inspired by the success of long-sequence transformer models and the fact that clinical notes are mostly long, we introduce 2 domain-enriched language models, Clinical-Longformer and Clinical-BigBird, which are pretrained on a large-scale clinical corpus. We evaluate both language models on 10 baseline tasks, including named entity recognition, question answering, natural language inference, and document classification; a fine-tuning sketch for the classification setting follows below.
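
The following is a minimal sketch of fine-tuning a long-sequence clinical checkpoint on a document classification task with the Hugging Face Trainer API. The toy two-example dataset, label count, and hyperparameters are illustrative assumptions, not the paper's experimental configuration; only the yikuan8/Clinical-Longformer checkpoint name comes from the released models.

```python
# Sketch: fine-tune Clinical-Longformer for document classification.
# Dataset and training settings are placeholders, not the paper's setup.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "yikuan8/Clinical-Longformer"  # checkpoint released by the authors
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Toy stand-in for a clinical document classification corpus.
toy = Dataset.from_dict({
    "text": ["Discharge summary: patient stable on room air ...",
             "Progress note: worsening dyspnea despite diuresis ..."],
    "label": [0, 1],
})

def tokenize(batch):
    # Long-sequence models keep up to 4096 tokens per note instead of 512.
    return tokenizer(batch["text"], truncation=True, max_length=4096)

toy = toy.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="clf_demo",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=toy,
    tokenizer=tokenizer,  # enables dynamic padding of variable-length notes
)
trainer.train()
```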

Results

The results demonstrate that Clinical-Longformer and Clinical-BigBird consistently and significantly outperform ClinicalBERT and other short-sequence transformers on all 10 downstream tasks, achieving new state-of-the-art results.

Discussion

Our pretrained language models provide the bedrock for clinical NLP on long texts. We have made our source code available at https://github.com/luoyuanlab/Clinical-Longformer, and the pretrained models available for public download at https://huggingface.co/yikuan8/Clinical-Longformer.
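
For reference, loading the released checkpoint from the Hugging Face Hub follows the standard transformers pattern shown below; the example sentence is a placeholder, and the embedding step is only one possible way to use the model.

```python
# Sketch: load the publicly released Clinical-Longformer checkpoint and
# embed a (placeholder) clinical sentence.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("yikuan8/Clinical-Longformer")
model = AutoModel.from_pretrained("yikuan8/Clinical-Longformer")

inputs = tokenizer("The patient was started on broad-spectrum antibiotics.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # (1, sequence_length, hidden_size)
```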

Conclusion

This study demonstrates that clinical knowledge-enriched long-sequence transformers are able to learn long-term dependencies in long clinical text. Our methods can also inspire the development of other domain-enriched long-sequence transformers.

SUBMITTER: Li Y 

PROVIDER: S-EPMC9846675 | biostudies-literature | 2023 Jan

REPOSITORIES: biostudies-literature


Publications

A comparative study of pretrained language models for long clinical text.

Li Yikuan, Wehbe Ramsey M, Ahmad Faraz S, Wang Hanyin, Luo Yuan

Journal of the American Medical Informatics Association : JAMIA, 2023 Jan, issue 2



Similar Datasets

| S-EPMC10976360 | biostudies-literature
| S-EPMC9280463 | biostudies-literature
| S-EPMC11339500 | biostudies-literature
| S-EPMC11623115 | biostudies-literature
| S-EPMC10600487 | biostudies-literature
| S-EPMC11336492 | biostudies-literature
| S-EPMC10644179 | biostudies-literature
| S-EPMC11722495 | biostudies-literature
| S-EPMC10876664 | biostudies-literature
| S-EPMC10853853 | biostudies-literature