In a recent study published in NEJM AI, researchers developed the artificial intelligence (AI)-based Model organism Aggregated Resources for Rare Variant ExpLoration (AI-MARRVEL) model to identify causal genes and variants underlying Mendelian disorders from clinical features and genetic sequences.
Study: AI-MARRVEL — A Knowledge-Driven AI System for Diagnosing Mendelian Disorders.
Background
Millions of individuals worldwide are born with genetic disorders, many of them Mendelian disorders caused by mutations in a single gene. Identifying these mutations is labor-intensive and requires significant expertise.
Comprehensive, systematic, and efficient procedures could increase diagnostic speed and accuracy. AI has shown potential but has so far achieved only modest success in primary diagnosis.
Bioinformatics-based reanalysis is less expensive but has limited accuracy, struggles to prioritize non-coding variants, and often relies on simulated data.
About the study
In the present study, researchers introduced AI-MARRVEL (AIM), a knowledge-driven AI model for diagnosing Mendelian disorders.
AIM is a machine-learning classifier trained on over 3.5 million variants from thousands of diagnosed cases together with expert-engineered features to enhance molecular diagnosis. The team evaluated AIM on patients from three cohorts and developed a confidence score to identify diagnosable cases among unresolved ones.
They trained AIM on high-quality samples and expert-engineered features. They tested the model on three patient datasets for various applications, including dominant, recessive, and trio diagnosis, novel disease gene identification, and large-scale re-evaluation.
Researchers collected Human Phenotype Ontology (HPO) terms and exome sequences from three patient groups: DiagLab, the Undiagnosed Diseases Network (UDN), and the Deciphering Developmental Disorders (DDD) Project. They divided the DiagLab data into training and testing datasets and tested DDD and UDN separately.
AIM was guided by knowledge-driven feature engineering, in which clinical expertise and genetic principles informed the selection of 56 raw features, including minor allele frequency, disease database annotations, evolutionary conservation, variant impact, phenotype matching, inheritance pattern, variant pathogenicity prediction scores, gene constraint, sequencing quality, and splicing prediction.
The team built six modules that mimic genetic diagnostic decision-making, yielding 47 additional features. They used random forest classifiers as the primary AI algorithm and drew comparison tools from benchmarking publications and top performers.
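To make this setup concrete, the minimal sketch below trains a random-forest classifier on a hypothetical table of expert-engineered variant features and ranks a new patient's candidate variants. The feature names, file paths, and hyperparameters are illustrative assumptions, not AIM's actual code or published settings.

```python
# Illustrative sketch only: a random-forest variant ranker trained on a
# hypothetical feature table. All names and files are assumed for illustration.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical per-variant feature matrix; one row per candidate variant,
# with a binary label marking the confirmed diagnostic variant for each case.
features = pd.read_csv("variant_features.csv")  # assumed file
feature_cols = [
    "minor_allele_frequency",  # population frequency
    "conservation_score",      # evolutionary conservation
    "variant_impact",          # coding consequence severity
    "phenotype_match",         # HPO-based phenotype similarity
    "pathogenicity_score",     # in-silico pathogenicity estimate
    "gene_constraint",         # gene-level intolerance to variation
    "splice_score",            # splicing disruption prediction
    "sequencing_quality",      # call quality / read depth
]
X = features[feature_cols]
y = features["is_diagnostic"]  # 1 = confirmed causal variant, 0 = otherwise

# Random forest is the primary classifier reported for AIM; these
# hyperparameters are placeholders, not the published settings.
model = RandomForestClassifier(n_estimators=500, random_state=0)
model.fit(X, y)

# Rank a new patient's candidate variants by predicted causal probability.
candidates = pd.read_csv("new_patient_variants.csv")  # assumed file
candidates["score"] = model.predict_proba(candidates[feature_cols])[:, 1]
print(candidates.sort_values("score", ascending=False).head(5))
```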
They used features such as SpliceAI scores to prioritize splicing variants. They also developed an AIM-without-VarDB model, which withholds variant database features, and separately examined the impact of erroneous phenotypic data.
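As an illustration of how a splicing feature might be derived, the sketch below extracts the maximum SpliceAI delta score from a SpliceAI-annotated VCF INFO string. The INFO layout follows SpliceAI's documented output format, but reducing it to a single maximum delta score is an assumption made for illustration, not the study's exact recipe.

```python
# Illustrative sketch: pull the largest SpliceAI delta score (0-1) out of a
# SpliceAI-annotated INFO field to use as a splicing feature.
def max_spliceai_delta(info: str) -> float:
    """Return the largest SpliceAI delta score found in a VCF INFO string."""
    for entry in info.split(";"):
        if entry.startswith("SpliceAI="):
            best = 0.0
            for record in entry[len("SpliceAI="):].split(","):
                fields = record.split("|")
                # Fields 2-5 are DS_AG, DS_AL, DS_DG, DS_DL (delta scores).
                deltas = [float(x) for x in fields[2:6] if x not in ("", ".")]
                if deltas:
                    best = max(best, max(deltas))
            return best
    return 0.0

# Example: a variant predicted to create a strong donor-gain effect.
print(max_spliceai_delta("AC=1;SpliceAI=T|GENE1|0.01|0.00|0.91|0.08|-3|25|6|-27"))  # 0.91
```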
They used a “feature climbing” approach to assess each feature's contribution and grouped all features according to their biological significance.
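The exact procedure is not detailed in this summary, but a feature climbing analysis can be pictured as retraining the model on progressively larger feature sets and recording the gain from each addition, as in the hypothetical sketch below; the grouping, ordering, and metric are assumptions for illustration only.

```python
# Illustrative sketch of a "feature climbing" style analysis: feature groups
# are added one at a time and the model is retrained to measure each gain.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def feature_climb(X, y, feature_groups):
    """Retrain with progressively larger feature sets and report accuracy."""
    selected, results = [], []
    for name, cols in feature_groups:  # e.g., ordered by biological theme
        selected.extend(cols)
        model = RandomForestClassifier(n_estimators=300, random_state=0)
        acc = cross_val_score(model, X[selected], y, cv=5).mean()
        results.append((name, acc))
    return results

# Hypothetical grouping of features by biological significance.
groups = [
    ("allele frequency", ["minor_allele_frequency"]),
    ("conservation", ["conservation_score"]),
    ("phenotype match", ["phenotype_match"]),
    ("pathogenicity", ["pathogenicity_score"]),
]
# for name, acc in feature_climb(X, y, groups):
#     print(f"{name}: cross-validated accuracy {acc:.2f}")
```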
The researchers developed a cross-sample score to estimate the likelihood that AIM would successfully identify the diagnostic variant in a given patient.
They divided patients into two groups based on this confidence level: high-confidence cases were referred for manual review, while low-confidence cases underwent reanalysis.
They constructed four confidence tiers, applied them to UDN and DDD samples, and evaluated them by how well they distinguished positive patients from negative patients and from unaffected relatives of patients with de novo variants.
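A minimal sketch of such a triage step is shown below: the top-ranked variant's model score is mapped to a confidence tier that routes the case to manual review or reanalysis. The tier names and thresholds are placeholders, not the published cutoffs.

```python
# Illustrative sketch: map a case's top variant score to a confidence tier
# and a recommended action. Cutoffs and tier names are assumed placeholders.
from typing import List, Tuple

CONFIDENCE_TIERS: List[Tuple[str, float]] = [
    ("very high", 0.90),
    ("high", 0.70),
    ("medium", 0.40),
    ("low", 0.0),
]

def triage(top_variant_score: float) -> Tuple[str, str]:
    """Return (confidence tier, recommended action) for a case."""
    for tier, cutoff in CONFIDENCE_TIERS:
        if top_variant_score >= cutoff:
            action = "manual review" if tier in ("very high", "high") else "reanalysis"
            return tier, action
    return "low", "reanalysis"

print(triage(0.95))  # ('very high', 'manual review')
print(triage(0.25))  # ('low', 'reanalysis')
```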
Results
AIM dramatically increased genetic diagnostic accuracy, tripling the number of solved cases relative to benchmarked approaches across three real-world cohorts. It attained 98% accuracy and detected 57% of diagnosable cases out of 871.
It also showed promise in novel disease gene discovery, accurately predicting two recently reported disease genes from the Undiagnosed Diseases Network. AIM outperformed existing methods on all three datasets, including Genomiser in the UDN and DiagLab cohorts.
The AIM method successfully distinguished diagnostic from non-diagnostic pathogenic variants in ClinVar. AIM-without-VarDB showed a small performance drop but still outperformed the other benchmarked techniques.
Expert feature engineering increased the AIM model’s accuracy while delaying training saturation. Using only 20% of the training data, AIM maintained a top-1 diagnostic accuracy of 54%. With more training samples, the model trained on engineered features reached 66% accuracy, whereas the model without engineered features reached 58%.
With erroneous phenotypic information, the researchers observed an 11% drop in top-1 diagnostic accuracy, showing that precise phenotypic annotation is critical. Even with uninformative phenotypic input, AIM achieved 78% top-5 diagnostic accuracy, highlighting the weight of the molecular evidence.
Increasing the OMIM-based phenotypic similarity score from zero to 0.25 improved prediction performance from 60.0% to 90.0%. However, further increases above 0.3 produced only slight gains, indicating that an exact match to OMIM phenotypes is not required.
The trio classifier (AIM-Trio) outperformed the Exomiser and Genomiser trio models and marginally outperformed the proband-only model (AIM). The AIM-NDG model, used for novel disease gene discovery, removed features linked to known disease databases.
Conclusions
Based on the study findings, AIM is a machine-learning genetic diagnostic tool that is highly accurate, can analyze thousands of samples within days, and is useful for initial diagnosis, reanalysis of unresolved cases, and novel disease gene identification.
AIM draws on approximately 3.5 million variant data points from thousands of diagnosed cases and provides a web interface through which users can submit cases and review findings.
However, limitations include the lack of assessment of structural or copy-number variants and a focus on cases with coding mutations. Large language models such as PhenoBCBERT and PhenoGPT have demonstrated improved performance in extracting phenotypic information and could complement the approach.