ABSTRACT:
SUBMITTER: Pan H
PROVIDER: S-EPMC10280405 | biostudies-literature | 2023
REPOSITORIES: biostudies-literature

Pan Hangyu, Xi Yaoyi, Wang Ling, Nan Yu, Su Zhizhong, Cao Rong
PeerJ Computer Science, 2023-03-28
Existing cross-lingual summarization (CLS) datasets suffer from inconsistent sample quality and small scale. To address these problems, we propose a method that jointly supervises quality and scale to build CLS datasets. For quality supervision, the method adopts a multi-strategy filtering algorithm that removes low-quality monolingual summarization (MS) samples from the perspectives of character and semantics, thereby improving the quality of the MS dataset. In terms of scale supervision, ...