
Graphical abstract. Credit: Annals of the Rheumatic Diseases (2025). DOI: 10.1016/j.ard.2025.05.009
Systemic sclerosis (SSc) is a severe autoimmune disease with complex genetic causes. While some genetic contributors have been identified, others remain unknown, impeding the development of targeted therapies. In a new study published in Annals of the Rheumatic Diseases, researchers at Baylor College of Medicine and collaborating institutions used a complementary approach that integrates exome sequencing with evolutionary action machine learning to identify protein changes associated with SSc and the mechanisms behind them.
Previous genome-wide association studies (GWAS), which analyze the frequency of common genetic variants, have shown that the strongest genetic contributors lie in the human leukocyte antigen (HLA) region of chromosome 6. In this study, researchers led by first author Dr. Shamika Ketkar analyzed 2,559 SSc patient cases and 893 healthy control cases from the University of Texas Health Science Center at Houston. They aimed to find new genes and rare variants that contribute to SSc risk.
“What really surprised and excited us was the discovery and replication of MICB, a gene located within the HLA region but acting independently of the classic HLA genes. MICB had not previously been implicated in systemic sclerosis.”
Spanish collaborators replicated the findings in previously published European GWAS data covering nearly 10,000 cases, further strengthening their importance. At Baylor, Dr. Olivier Lichtarge's lab used the Evolutionary Action Machine Learning (EAML) framework to analyze the exome sequencing data and prioritize high-impact variants predictive of SSc.
The results pointed back to MICB, as well as to other genes on chromosome 6, such as NOTCH4, and to genes enriched for interferon signaling (an important pathway of the immune system), including IFI44L and IFIT5.
“The machine learning framework does not just determine whether variants occur frequently; it also uses evolutionary data from across species to weigh the likelihood that a variant is functionally damaging to the protein and, ultimately, to patients.
“We previously used this method in diseases with much larger genomic datasets, such as Alzheimer's disease and heart disease. This study shows that it is also effective for complex diseases with smaller patient datasets.”
To understand the functional effects of the genetic variation identified in the study, the researchers integrated published single-cell RNA sequencing data from SSc skin biopsies to resolve cell type-specific expression patterns of the risk genes. They also performed expression quantitative trait locus (eQTL) analysis using whole-blood datasets to establish a regulatory link between disease-related variants and transcriptome changes.
MICB and NOTCH4 were found to be expressed in two cell types, fibroblasts and endothelial cells, which play a central role in the key clinical features of SSc: fibrosis and vascular disease. These complementary analyses confirmed the functional regulatory effects of the identified risk genes.
“To solve complex diseases like SSc, various approaches and machine learning must be combined with analyses of large-scale DNA, RNA and protein datasets to discover other hidden targets for treatment.”
More information:
Shamika Ketkar et al, Integrated exome sequencing and machine learning identifies MICB and interferon pathway genes as contributors to SSc risk, Annals of the Rheumatic Diseases (2025). DOI: 10.1016/j.ard.2025.05.009
Provided by Baylor College of Medicine
Citation: Integrated exome sequencing and machine learning identify new genes that contribute to systemic sclerosis risk (2025, June 16), retrieved June 16, 2025 from https://medicalxpress.com/news/2025-06-exome-sakesing-machine-genes-contributing.htmll
