Integrative exome sequencing and machine learning identify MICB and interferon pathway genes as contributors to SSc risk

Authors: Shamika Ketkar, Hongzheng Dai, Lindsay Burrage, David Murdock, Brian Dawson, Marialbert Acosta-Herrera, Martin Kerick, Javier Martin, Kevin Wilhelm, Jennifer Kay Asmussen, Olivier Lichtarge, Regeneron Genetics Center, Shervin Assassi, Maureen D Mayes, Brendan H Lee
PMID: 40514331
Journal: Ann Rheum Dis
Published: 2025-08
DOI: 10.1016/j.ard.2025.05.009

Abstract

Objectives: Systemic sclerosis (SSc) is a complex autoimmune disease with both known and unidentified genetic contributors. While genome-wide association studies (GWAS) have implicated multiple loci, many reside in noncoding regions. We aimed to identify novel protein-coding variants and pathogenic pathways using exome sequencing (ES) integrated with an Evolutionary Action-Machine Learning (EAML) framework, single-cell RNA sequencing (scRNA-seq), and expression quantitative trait locus (eQTL) analysis. Methods: GWAS was conducted in 2,559 SSc cases and 893 controls of Caucasian ancestry, with replication in 9,846 cases and 18,333 controls of European ancestry. EAML prioritized genes with high-impact missense variants predictive of disease. Public scRNA-seq data from SSc and control skin biopsies were analyzed to localize gene expression across cell types. Whole blood eQTL data were used to identify regulatory effects of risk variants. Results: A novel SSc risk locus at MICB (rs2516497, P = 3.66 × 10⁻¹³) was identified and replicated. EAML highlighted 284 genes enriched in interferon signaling. scRNA-seq localized MICB and NOTCH4 to fibroblasts and endothelial cells, while HLA class II genes were enriched in macrophages and fibroblasts. eQTL analysis confirmed regulatory effects at MICB, NOTCH4, and other prioritized genes, linking SSc-associated variants to transcriptional dysregulation. Conclusions: This integrative genomic study identifies novel risk loci and mechanistic pathways in SSc, highlighting MICB, NOTCH4, and interferon-related genes. The findings provide insight into the cellular and regulatory architecture of SSc and support the utility of combining ES, machine learning, scRNA-seq, and eQTL data in complex disease genetics.

Experimental Methods

Product List

Name | Brand | Catalog No.
Integrated DNA Technologies xGen Exome Research Panel v1.0 | Integrated DNA Technologies (IDT) | xGen Exome Research Panel v1.0