| Issue | BIO Web Conf., Volume 200, 2025: Biology, Health & Artificial Intelligence Conference (BHAI 2025) |
|---|---|
| Article Number | 01022 |
| Number of page(s) | 4 |
| DOI | https://doi.org/10.1051/bioconf/202520001022 |
| Published online | 05 December 2025 |
Towards a robust healthcare prediction model using an adaptive multimodal fusion based on hierarchical transformers
Computer Science Research Laboratory (LRI), Faculty of Sciences, Ibn Tofail University, Morocco
The healthcare field has adopted multimodal learning, which combines diverse data types to improve the precision of predictions in clinical settings. The limitations of such multimodal approaches stem largely from heterogeneous data structures, varying modality relevance, noise, and limited data size. In this paper, we present an adaptive multimodal fusion model that assigns distinct importance weights to modalities via attention-based pooling within a hierarchical transformer architecture. The proposed model extracts features for each modality independently and then aggregates cross-modal features using a hierarchical attention mechanism. We evaluate our architecture in a controlled setting across multiple simulations and demonstrate that it performs effectively across four data types for health prediction. Moreover, our experiments reveal clear improvements in validation AUC and F1 scores, especially when data is limited or noisy, which supports the robustness and reliability of our hierarchical transformer-based fusion approach.
Key words: Multimodal Fusion / Hierarchical Transformers / Healthcare Prediction / Generalization and Robustness / Adaptive Attention Mechanisms
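The core idea described in the abstract, assigning importance weights to modality embeddings via attention-based pooling before fusion, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the single learnable query vector, and the embedding dimensions are illustrative assumptions; a full model would learn these parameters inside a hierarchical transformer rather than use fixed vectors.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(modality_feats, query):
    """Fuse per-modality embeddings by attention-weighted pooling.

    modality_feats: array of shape (M, d), one embedding per modality
                    (e.g. produced by independent per-modality encoders).
    query:          array of shape (d,), a (hypothetical) learnable
                    attention query vector.
    Returns the fused (d,) embedding and the (M,) modality weights.
    """
    scores = modality_feats @ query      # relevance score per modality
    weights = softmax(scores)            # normalized importance weights
    fused = weights @ modality_feats     # weighted sum over modalities
    return fused, weights

# Toy usage: four modalities (as in the paper's evaluation), d = 8.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8))
q = rng.normal(size=8)
fused, w = attention_pool(feats, q)
```

The weights sum to one, so noisy or uninformative modalities can be down-weighted without being discarded, which is consistent with the robustness the abstract reports under limited or noisy data.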
© The Authors, published by EDP Sciences, 2025
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

