If you use or extend this work, please cite:
```bibtex
@misc{liu2024factual,
      title={Factual Serialization Enhancement: A Key Innovation for Chest X-ray Report Generation},
      author={Kang Liu and Zhuoqi Ma and Mengmeng Liu and Zhicheng Jiao and Xiaolu Kang and Qiguang Miao and Kun Xie},
      year={2024},
      eprint={2405.09586},
      archivePrefix={arXiv},
      primaryClass={eess.IV}
}
```
- Python 3.9
- `torch==2.1.2+cu118`
- `transformers==4.23.1`
- `torchvision==0.16.2+cu118`
- `radgraph==0.09`
⚠️ Because RadGraph requires its own pinned dependencies, we recommend two separate virtual environments (a setup sketch follows this list):

- RadGraph environment: for structural entity extraction (`knowledge_encoder/radgraph_requirements.txt`)
- Main FSE environment: for running the rest of the framework (`requirements.txt`)
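A minimal setup sketch, assuming conda is available; the environment names are illustrative, and the RadGraph environment is completed by the dygiepp steps further below:

```bash
# Main FSE environment (name "fse" is illustrative)
conda create -n fse python=3.9
conda activate fse
pip install -r requirements.txt

# Separate environment for RadGraph-based structural entity extraction
# (name "radgraph_env" is illustrative; the dygiepp instructions below pin Python 3.7)
conda create -n radgraph_env python=3.7
conda activate radgraph_env
pip install -r knowledge_encoder/radgraph_requirements.txt
```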
- IU X-Ray 📥 Images & Reports: Google Drive
- MIMIC-CXR 📥 Images: PhysioNet (license required); 📥 Reports: Google Drive
- MIMIC-CXR: Baidu Netdisk (code: `MK13`)
- IU X-Ray: Baidu Netdisk (code: `MK13`)
```bash
git clone https://github.com/dwadden/dygiepp.git
conda create -n dygiepp python=3.7
conda activate dygiepp
cd dygiepp
pip install -r requirements.txt
conda develop .
```
Refer to `knowledge_encoder/radgraph_requirements.yml` for additional dependencies.
- RadGraph model: PhysioNet RadGraph
- Annotation JSON: Google Drive (requires PhysioNet license)
Set local paths for:

- `radgraph_model_path`
- `ann_path` (annotation.json)
Run:
```bash
python knowledge_encoder/factual_serialization.py
```
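If the script reads these paths from command-line arguments (an assumption; they may instead be set inside the script), the call might look like the following sketch, where the argument names and paths are placeholders:

```bash
# Hypothetical invocation; --radgraph_model_path and --ann_path are assumed
# argument names, and the paths are placeholders for your local downloads.
python knowledge_encoder/factual_serialization.py \
    --radgraph_model_path /path/to/radgraph/model \
    --ann_path /path/to/annotation.json
```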
To pretrain on MIMIC-CXR, run:

```bash
bash pretrain_mimic_cxr.sh
```
Configure the `--load` argument in `pretrain_inference_mimic_cxr.sh`, then run:

```bash
bash pretrain_inference_mimic_cxr.sh
```
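For example, the relevant line inside `pretrain_inference_mimic_cxr.sh` might be edited as follows (the checkpoint path is illustrative; point `--load` at the checkpoint written by the pretraining stage):

```bash
# Example only: replace the path with your pretrained checkpoint
--load results/mimic_cxr/pretrain/model_best.pth
```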
Configure the `--load` argument in `finetune_mimic_cxr.sh` in the same way, then run:

```bash
bash finetune_mimic_cxr.sh
```
Download the images, reports (`mimic_cxr_annotation_sen_best_reports_keywords_20.json`), and checkpoints (`finetune_model_best.pth`).

Configure the `--load` and `--mimic_cxr_ann_path` arguments in `test_mimic_cxr.sh`, then run:

```bash
bash test_mimic_cxr.sh
```
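For example, inside `test_mimic_cxr.sh` (paths are illustrative; point them at the downloaded checkpoint and annotation file):

```bash
# Example only: adjust both paths to your local copies
--load /path/to/finetune_model_best.pth \
--mimic_cxr_ann_path /path/to/mimic_cxr_annotation_sen_best_reports_keywords_20.json
```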
- MIMIC-CXR (FSE-5, $M_{gt}=100$)
- IU X-Ray (FSE-20, $M_{gt}=60$)
- R2Gen: some of the code is adapted from R2Gen [1].
- R2GenCMN: some of the code is adapted from R2GenCMN [2].
- MGCA: some of the code is adapted from MGCA [3].
[1] Chen, Z., Song, Y., Chang, T.H., Wan, X., 2020. Generating radiology reports via memory-driven transformer, in: EMNLP, pp. 1439–1449.
[2] Chen, Z., Shen, Y., Song, Y., Wan, X., 2021. Cross-modal memory networks for radiology report generation, in: ACL, pp. 5904–5914.
[3] Wang, F., Zhou, Y., Wang, S., Vardhanabhuti, V., Yu, L., 2022. Multigranularity cross-modal alignment for generalized medical visual representation learning, in: NeurIPS, pp. 33536–33549.