- In the medical domain, and in radiography in particular, large-scale datasets are generally limited to English reports and to specific body areas. Data annotation is expensive, and annotated data remain too scarce to train supervised deep learning models.
- To explore vision-language pretraining using bone X-rays paired with French reports sourced from a single university hospital department.
- We leveraged bone radiographs and their associated French reports from a single hospital to pretrain a versatile vision-language model, to be used as a backbone for a variety of downstream tasks trained with limited supervision (a minimal sketch of such pretraining follows the results below).
- The resulting multi-modal representation is shown to yield downstream task performance that is competitive with models trained with a significantly larger amount of human supervision.
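The project description does not specify the pretraining objective; a common choice for paired images and reports is a CLIP-style symmetric contrastive loss. The PyTorch sketch below illustrates that approach under this assumption: `ContrastivePretrainer`, the stand-in encoders, and all hyperparameters are hypothetical illustrations, not the project's actual implementation.

```python
# Hypothetical sketch of CLIP-style contrastive vision-language pretraining
# on (radiograph, report) pairs. The actual objective used by the project
# is not stated in this summary; this is an illustrative assumption.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContrastivePretrainer(nn.Module):  # hypothetical name, not from the project
    def __init__(self, image_encoder, text_encoder, embed_dim=512, temperature=0.07):
        super().__init__()
        self.image_encoder = image_encoder  # e.g. a CNN/ViT returning feature vectors
        self.text_encoder = text_encoder    # e.g. a French text encoder returning pooled features
        self.image_proj = nn.LazyLinear(embed_dim)
        self.text_proj = nn.LazyLinear(embed_dim)
        # Learnable log-temperature scaling the similarity logits (as in CLIP).
        self.log_temp = nn.Parameter(torch.log(torch.tensor(1.0 / temperature)))

    def forward(self, images, reports):
        # Project both modalities into a shared space and L2-normalize.
        img = F.normalize(self.image_proj(self.image_encoder(images)), dim=-1)
        txt = F.normalize(self.text_proj(self.text_encoder(reports)), dim=-1)
        logits = img @ txt.t() * self.log_temp.exp()
        targets = torch.arange(len(images), device=images.device)
        # Symmetric cross-entropy: each radiograph must match its own report,
        # and each report its own radiograph, against all other pairs in the batch.
        return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))


if __name__ == "__main__":
    # Tiny stand-in encoders so the sketch runs end to end on random data.
    image_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 256))
    text_encoder = nn.EmbeddingBag(num_embeddings=30000, embedding_dim=256)
    model = ContrastivePretrainer(image_encoder, text_encoder)
    images = torch.randn(8, 3, 224, 224)                 # batch of 8 fake radiographs
    reports = torch.randint(0, 30000, (8, 32))           # token IDs for 8 fake reports
    loss = model(images, reports)
    loss.backward()
    print(f"contrastive loss: {loss.item():.3f}")
```

After such pretraining, the image encoder can serve as the backbone described above, fine-tuned on downstream tasks with limited supervision.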
• Project type: research
• Team status: complete
• Initiator: Alexandre Englebert, MD
• Co-investigators: Anne-Sophie Collin, Olivier Cornu, MD & PhD, Christophe De Vleeschouwer, PhD
• Funding: FNRS/FRIA grant to Alexandre Englebert for his PhD
• Project status: done