Subject
Breast cancer is the most common cancer among women worldwide and remains a leading cause of cancer-related mortality. One of its earliest indicators is the presence of microcalcifications (MCs), small calcium deposits that can signify early-stage malignancy. In medical imaging, Digital Breast Tomosynthesis (DBT) has emerged as a powerful 3D imaging modality that offers enhanced visualization of breast tissue compared to traditional 2D mammography. Its clinical adoption is still evolving, however, as there is no clear consensus on whether it offers a meaningful overall advantage over standard mammography across different diagnostic scenarios. For small structures such as MCs, DBT may nevertheless allow for more detailed analysis. Alongside imaging data, clinical and demographic information (e.g., age, family history) can provide additional predictive value in cancer diagnosis. Effectively combining 3D imaging data with tabular clinical data, however, remains a challenge in AI-based medical diagnostics. HyperFusion-Net, a novel hypernetwork-based architecture for multimodal data fusion, was recently proposed to address this challenge by enabling the joint learning of features from medical images and tabular data [1]. The method has shown promising results across different tasks, but it has not yet been applied to DBT images of MCs. In this thesis, the student will implement HyperFusion-Net for the classification of breast MCs using only the DBT images from the EMBED dataset in combination with the available clinical data. This approach will enable comprehensive fusion of imaging and non-imaging information and allow its potential to improve classification performance to be explored. Optionally, radiomic features extracted from individual MCs may also be incorporated alongside the clinical data to enrich the data representation.
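To make the hypernetwork idea concrete, the following is a minimal NumPy sketch, not the implementation from [1]. It assumes that a small network maps the tabular vector of each patient to the weights and bias of one linear layer, which is then applied to that patient's image features. All names and sizes (init_hypernetwork, hyper_forward, 3 tabular values, 8 image features, 2 classes) are hypothetical, the parameters are random and untrained, and the image features merely stand in for the pooled output of a 3D CNN over a DBT volume.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_hypernetwork(n_tab, n_hidden, n_in, n_out):
    """Random parameters for a two-layer hypernetwork (illustrative sizes only)."""
    n_generated = n_in * n_out + n_out  # weights + bias of the conditioned layer
    return {
        "W1": rng.normal(0, 0.1, (n_tab, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0, 0.1, (n_hidden, n_generated)),
        "b2": np.zeros(n_generated),
    }

def hyper_forward(params, tabular, image_features, n_in, n_out):
    """Generate one linear layer per sample from tabular data and apply it to image features."""
    hidden = np.maximum(0.0, tabular @ params["W1"] + params["b1"])  # ReLU
    generated = hidden @ params["W2"] + params["b2"]  # (batch, n_in*n_out + n_out)
    weights = generated[:, : n_in * n_out].reshape(-1, n_in, n_out)
    bias = generated[:, n_in * n_out:]
    # Every sample has its own weight matrix, hence the batched product.
    return np.einsum("bi,bio->bo", image_features, weights) + bias

# Hypothetical toy inputs: 4 patients, 3 tabular values, 8 image features each.
params = init_hypernetwork(n_tab=3, n_hidden=16, n_in=8, n_out=2)
tabular = rng.normal(size=(4, 3))
image_features = rng.normal(size=(4, 8))
logits = hyper_forward(params, tabular, image_features, n_in=8, n_out=2)
print(logits.shape)
```

In this sketch the tabular data never enters the classifier as an extra input. It only determines the parameters that process the image features, which is what distinguishes a hypernetwork from simple feature concatenation.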
Kind of work
Objective: To implement and evaluate HyperFusion-Net for the classification of breast MCs using 3D DBT images and tabular clinical data (optionally extended with radiomic features) from the EMBED dataset [2], and to assess the effectiveness of multimodal data fusion in improving classification accuracy.
Description of work:
- Literature Review (ETOC: 2 months): Review HyperFusion-Net and related approaches for multimodal learning in medical imaging. Study DBT-based classification techniques and prior applications of imaging and clinical data fusion in the context of MC analysis.
- Dataset Familiarization (ETOC: 1 month): Understand the EMBED dataset's 3D DBT images and associated clinical data. Perform the necessary preprocessing, such as image normalization, segmentation of MCs if required, and formatting of the tabular data.
- Implementation (ETOC: 6 months): Implement or adapt the HyperFusion-Net architecture to process 3D image volumes and tabular data. Train and validate the model for the classification task. Assess model performance using classification metrics (e.g., accuracy, AUC, precision/recall). Compare the fusion-based results with models that use only image data or only tabular data to evaluate the added value of multimodal fusion.
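The metrics named in the implementation step can be sketched in plain Python as follows. This is an illustrative sketch rather than a prescribed evaluation protocol. The function names (roc_auc, precision_recall), the 0.5 decision threshold, and the toy labels and scores are assumptions made up for the example and are not results from EMBED.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def precision_recall(labels, scores, threshold=0.5):
    """Precision and recall after thresholding the scores (threshold is an assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy values only: 1 = malignant, 0 = benign, scores = predicted probabilities.
toy_labels = [1, 0, 1, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.6, 0.7, 0.8, 0.2]
print(roc_auc(toy_labels, toy_scores))           # 8 of 9 positive/negative pairs ranked correctly
print(precision_recall(toy_labels, toy_scores))  # (0.75, 1.0)
```

For the comparison described above, the same functions would be applied to the predictions of the image-only, tabular-only, and fusion models on an identical held-out test split, so that any difference in the metrics reflects the model rather than the data.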
Framework of the Thesis
Related work:
[1] Duenias, D., Nichyporuk, B., Arbel, T. and Raviv, T.R., 2025. Hyperfusion: A hypernetwork approach to multimodal integration of tabular and medical imaging data for predictive modeling. Medical Image Analysis, p.103503.
[2] Jeong, J.J., Vey, B.L., Bhimireddy, A., Kim, T., Santos, T., Correa, R., Dutt, R., Mosunjac, M., Oprea-Ilies, G., Smith, G. and Woo, M., 2023. The EMory BrEast imaging Dataset (EMBED): A racially diverse, granular dataset of 3.4 million screening and diagnostic mammographic images. Radiology: Artificial Intelligence, 5(1), p.e220047.
Expected Student Profile
Following an MSc in a field related to one or more of the following: Computer Science, Biomedical Engineering, Applied Computer Science - Digital Health. Strong programming skills (Python). Ability to write scientific reports and communicate research results at conferences in English.