| Issue | BIO Web Conf., Volume 200, 2025, Biology, Health & Artificial Intelligence Conference (BHAI 2025) |
|---|---|
| Article Number | 01020 |
| Number of page(s) | 10 |
| DOI | https://doi.org/10.1051/bioconf/202520001020 |
| Published online | 05 December 2025 |
Few-Shot Learning for Predicting Genetic Biomarkers in Cerebral Autosomal Dominant Arteriopathy with Subcortical Infarcts and Leukoencephalopathy (CADASIL)
Laboratory of Integrative Biology, Faculty of Science Ain Chock, Casablanca, University Hassan II, Morocco
Cerebral Autosomal Dominant Arteriopathy with Subcortical Infarcts and Leukoencephalopathy (CADASIL) is a rare hereditary cerebral small-vessel disorder with an estimated prevalence of 4.6 per 100,000 adults, primarily caused by NOTCH3 mutations. Its rarity has limited the availability of genetic data, which are essential for understanding the disease. Traditional prediction methods require large datasets and perform poorly when data are scarce. To address this, our study aims to enrich the genetic data on CADASIL using a Few-Shot Learning (FSL) strategy. A total of 4 previously validated CADASIL single nucleotide polymorphisms (SNPs) and 938,544 negative SNPs were extracted from the GWAS Catalog, together with their genetic annotations. Based on the assumption of genetic proximity, we generated a genomic context string for each SNP. These strings were embedded into dense vector representations using paraphrase-MiniLM-L6-v2. Candidate SNPs were then ranked by similarity score, and the top 100 were retained as putative novel biomarkers. This in silico framework predicted 100 SNPs and 24 genes. It provides potential biomarkers for early diagnosis, insights into disease mechanisms, and candidate therapeutic targets. The study also supports the suitability of FSL for rare diseases, paving the way for other applications.
Key words: CADASIL / Genetic Biomarkers / Few-Shot Learning / Genomic Embeddings / Genome-Wide Association Studies
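The ranking step described in the abstract (context string, embedding, similarity score, top-k selection) can be illustrated with a minimal sketch. It is not the authors' implementation: the record fields, the placeholder SNP IDs, the helper names, and the choice to compare each candidate against the centroid of the seed embeddings are all assumptions made for illustration. A bag-of-words counter stands in for the sentence encoder so that the example runs without downloading a model; a real run would substitute embeddings from paraphrase-MiniLM-L6-v2.

```python
import math
from collections import Counter

def build_context(snp):
    """Flatten a SNP's annotations into one context string (field names are hypothetical)."""
    return f"gene {snp['gene']} chromosome {snp['chrom']} trait {snp['trait']}"

def toy_embed(text):
    """Stand-in encoder: sparse bag-of-words counts.
    A real run would call a sentence encoder such as paraphrase-MiniLM-L6-v2 here."""
    return Counter(text.lower().split())

def centroid(vectors):
    """Average a list of sparse vectors into one prototype vector."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return {token: count / len(vectors) for token, count in total.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(token, 0.0) for token, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_candidates(seeds, candidates, top_k=100):
    """Score each candidate against the seed prototype and keep the top_k."""
    prototype = centroid([toy_embed(build_context(s)) for s in seeds])
    scored = [(c["id"], cosine(toy_embed(build_context(c)), prototype))
              for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# Toy records with placeholder IDs; the traits are illustrative, not study data.
seeds = [
    {"id": "seed_1", "gene": "NOTCH3", "chrom": "19", "trait": "stroke"},
    {"id": "seed_2", "gene": "NOTCH3", "chrom": "19", "trait": "dementia"},
]
candidates = [
    {"id": "cand_a", "gene": "NOTCH3", "chrom": "19", "trait": "migraine"},
    {"id": "cand_b", "gene": "BRCA1", "chrom": "17", "trait": "breast_cancer"},
]

ranked = rank_candidates(seeds, candidates, top_k=2)
for snp_id, score in ranked:
    print(f"{snp_id}\t{score:.3f}")
```

In this toy setup the candidate that shares gene and chromosome annotations with the seeds scores higher than the unrelated one. Comparing against a centroid is only one option; scoring each candidate by its maximum similarity to any single seed would be an equally plausible reading of the abstract.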
© The Authors, published by EDP Sciences, 2025
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

