GitHub - SVT-Yang/MedST: Official code of Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training [ICML 2024]

Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training 【ICML 2024】

This is the official code for Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training [ICML 2024].

Installation:

Clone this repository and install Python dependencies:

git clone https://github.com/SVT-Yang/MedST.git
cd MedST
pip install -r requirements.txt

Datasets Preparation:

We used the following datasets:

MIMIC-CXR: We used MIMIC-CXR-JPG, a medical multimodal dataset, for pre-training.
MS-CXR-T benchmark: We used the MS-CXR-T benchmark for temporal downstream tasks.
RSNA: We used stage 2 of the RSNA dataset from Kaggle.
COVIDx: We used version 6 of the COVIDx dataset from Kaggle, which has 3 classes: no pneumonia, non-COVID-19 pneumonia, and COVID-19 pneumonia.

After downloading the datasets, check that the paths in constants.py are correct.
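
A quick way to catch a wrong dataset path before preprocessing is to test each one for existence. The sketch below is only illustrative: the helper find_missing_paths and the names and locations in the example dictionary are hypothetical and are not taken from constants.py, so substitute the variables that file actually defines.

```python
from pathlib import Path


def find_missing_paths(named_paths):
    """Return the names whose configured path does not exist on disk."""
    return [name for name, p in named_paths.items() if not Path(p).exists()]


if __name__ == "__main__":
    # Hypothetical entries: replace with the variables defined in constants.py.
    candidates = {
        "MIMIC_CXR_DIR": "/data/mimic-cxr-jpg",
        "RSNA_DIR": "/data/rsna",
    }
    for name in find_missing_paths(candidates):
        print(f"missing: {name} -> {candidates[name]}")
```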

Data preprocessing:

Run mimic_cxr.py to get multi-view image-text pairs and temporal information.
Run rsna.py and covidx.py to get the train/val/test sets.

Pre-training:

First, download the pre-trained weights we used:

Text encoder (BioClinicalBERT): download pytorch_model.bin from Bio_ClinicalBERT into the /medst/emilyalsentzer/Bio_ClinicalBERT folder.
MGCA pre-trained weights: download them from MGCA.

Before pre-training, make sure all paths are correct.

Then, run this command to pre-train:

cd medst/models/medst
CUDA_VISIBLE_DEVICES=0,1 python medst_module.py --gpus 2 --strategy ddp --batch_size 10  --num_workers 8
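
When adapting the command to a different number of GPUs, it helps to keep track of the effective batch size. The sketch below assumes that --batch_size is the per-GPU batch size, which is the usual convention for distributed data-parallel training but is not confirmed here from medst_module.py. The function name and the accumulate_grad_batches parameter are illustrative only.

```python
def effective_batch_size(per_gpu_batch_size, num_gpus, accumulate_grad_batches=1):
    """Number of samples contributing to one optimizer step under data parallelism."""
    if min(per_gpu_batch_size, num_gpus, accumulate_grad_batches) < 1:
        raise ValueError("all arguments must be positive integers")
    return per_gpu_batch_size * num_gpus * accumulate_grad_batches


# Values from the pre-training command: --gpus 2 --batch_size 10
print(effective_batch_size(per_gpu_batch_size=10, num_gpus=2))  # 20
```

Under that assumption, training on a single GPU with the same flags would halve the effective batch size, which may change the behaviour of contrastive objectives that depend on the number of in-batch negatives.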

Our pre-trained MedST can be found here.

Downstream tasks:

First, set the path (or ckpt_path) argument to the path of our pre-trained MedST model.

1. Temporal tasks (MS-CXR-T benchmark):

Make sure the paths of the two CSV files (temporal image classification and temporal sentence similarity classification) are correct.
Run temporal_test.py to get the results.

2. Zero-shot classification on RSNA:

Run zeroshot_RSNA.py to get the results.
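
For readers unfamiliar with the idea, zero-shot classification with an image-text model is commonly done by embedding one text prompt per class and picking the class whose prompt embedding is most similar to the image embedding. The following is a generic sketch of that idea, not the code in zeroshot_RSNA.py: the function names, the prompts, and the toy 3-dimensional vectors are all made up for illustration and stand in for real encoder outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_predict(image_embedding, prompt_embeddings):
    """Return the best-matching label and the per-label similarity scores."""
    scores = {
        label: cosine_similarity(image_embedding, emb)
        for label, emb in prompt_embeddings.items()
    }
    return max(scores, key=scores.get), scores


# Toy vectors standing in for text-encoder outputs of two class prompts.
prompts = {
    "pneumonia": [0.9, 0.1, 0.0],
    "no pneumonia": [0.1, 0.9, 0.0],
}
# Toy vector standing in for the image-encoder output of one chest X-ray.
label, scores = zero_shot_predict([0.8, 0.3, 0.1], prompts)
print(label)  # pneumonia
```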

3. Image classification on COVIDx:

We use --data_pct to specify the fraction of training data used for fine-tuning. To run all experiments for the COVIDx classification task, use this command:

./run_cls_covidx.sh
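
To illustrate what training on a fraction of the data means, here is a minimal sketch of reproducible subsampling. It is an assumption about how an option like --data_pct is typically applied, not a description of this repository's data loader. The subsample helper, the seed, and the example fractions are hypothetical and may differ from what run_cls_covidx.sh uses.

```python
import random


def subsample(samples, data_pct, seed=42):
    """Keep a reproducible random fraction data_pct (0 < data_pct <= 1) of samples."""
    if not 0 < data_pct <= 1:
        raise ValueError("data_pct must be in (0, 1]")
    # Keep at least one sample, but never more than are available.
    keep = min(len(samples), max(1, int(len(samples) * data_pct)))
    return random.Random(seed).sample(samples, keep)


train_ids = list(range(1000))  # stand-in for the training examples
for pct in (0.01, 0.1, 1.0):   # example fractions only
    print(pct, len(subsample(train_ids, pct)))
```

Fixing the seed means every run with the same fraction sees the same subset, which keeps comparisons across models fair.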

Acknowledgement

This work builds upon MGCA and TCC.

Citation

@article{yang2024unlocking,
      title={Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training}, 
      author={Jinxia Yang and Bing Su and Wayne Xin Zhao and Ji-Rong Wen},
      journal={arXiv preprint arXiv:2405.19654},
      year={2024}
}

