DSpace Repository

Automatic data extraction to support meta-analysis statistical analysis: a case study on breast cancer

dc.contributor.author Mutinda, Faith Wavinya en
dc.contributor.author Liew, Kongmeng en
dc.contributor.author Yada, Shuntaro en
dc.contributor.author Wakamiya, Shoko en
dc.contributor.author Aramaki, Eiji en
dc.date.accessioned 2022-09-27T07:10:54Z en
dc.date.available 2022-09-27T07:10:54Z en
dc.date.issued 2022-06-18 en
dc.identifier.uri http://hdl.handle.net/10061/14782 en
dc.description.abstract Background: Meta-analyses aggregate results of different clinical studies to assess the effectiveness of a treatment. Despite their importance, meta-analyses are time-consuming and labor-intensive as they involve reading hundreds of research articles and extracting data. The number of research articles is increasing rapidly and most meta-analyses are outdated shortly after publication as new evidence has not been included. Automatic extraction of data from research articles can expedite the meta-analysis process and allow for automatic updates when new results become available. In this study, we propose a system for automatically extracting data from research abstracts and performing statistical analysis. Materials and methods: Our corpus consists of 1011 PubMed abstracts of breast cancer randomized controlled trials annotated with the core elements of clinical trials: Participants, Intervention, Control, and Outcomes (PICO). We proposed a BERT-based named entity recognition (NER) model to identify PICO information from research abstracts. After extracting the PICO information, we parse numeric outcomes to identify the number of patients having certain outcomes for statistical analysis. Results: The NER model extracted PICO elements with relatively high accuracy, achieving F1-scores greater than 0.80 for most entities. We assessed the performance of the proposed system by reproducing the results of an existing meta-analysis. The data extraction step achieved high accuracy; however, the statistical analysis step achieved low performance because abstracts sometimes lack all the required information. Conclusion: We proposed a system for automatically extracting data from research abstracts and performing statistical analysis. We evaluated the performance of the system by reproducing an existing meta-analysis and the system achieved a relatively good performance, though more substantiation is required. en
dc.language.iso en en
dc.publisher BMC en
dc.relation.isreplacedby https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-022-01897-4 en
dc.rights © The Author(s) 2022. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. ja
dc.subject Automatic meta-analysis en
dc.subject Natural language processing (NLP) en
dc.subject Automatic data extraction en
dc.subject Named entity recognition (NER) en
dc.subject Evidence-based medicine en
dc.title Automatic data extraction to support meta-analysis statistical analysis: a case study on breast cancer en
dc.type.nii Journal Article en
dc.contributor.transcription ヤダ, シュンタロウ ja
dc.contributor.transcription ワカミヤ, ショウコ ja
dc.contributor.transcription アラマキ, エイジ ja
dc.contributor.alternative 矢田, 竣太郎 ja
dc.contributor.alternative 若宮, 翔子 ja
dc.contributor.alternative 荒牧, 英治 ja
dc.textversion none en
dc.identifier.eissn 1472-6947 en
dc.identifier.jtitle BMC Medical Informatics and Decision Making en
dc.identifier.volume 22 en
dc.identifier.issue 1 en
dc.relation.doi 10.1186/s12911-022-01897-4 en
dc.identifier.artnum 158 (2022) en
dc.identifier.NAIST-ID 86639341 en
dc.identifier.NAIST-ID 74655929 en
dc.identifier.NAIST-ID 74652314 en
dc.identifier.NAIST-ID 74652181 en
dc.relation.pmid 35717167 en

