With the rapid growth of unstructured text data, grey literature has become an important source of information to support research and innovation activities. In this paper, we propose a novel semi-automated grey literature screening approach that combines a Hierarchical BERT Model (HBM) with active learning to reduce the human workload in grey literature screening. Evaluations on three real-world grey literature datasets demonstrate that the proposed approach can save up to 64.88% of the human screening workload while maintaining high screening accuracy. We also demonstrate how the HBM allows salient sentences within grey literature documents to be selected and highlighted to support workers in screening tasks.
Jinghui Lu (University College Dublin)
Brian Mac Namee (University College Dublin)