Integrating clinical data and genetic susceptibility to elucidate the relationship between systemic lupus erythematosus and human cytomegalovirus infection
Virology Journal volume 21, Article number: 311 (2024)
Abstract
Background
Viral infections are known to induce the occurrence and pathogenesis of systemic lupus erythematosus (SLE). Previous studies have indicated a possible relationship between SLE and human cytomegalovirus (HCMV) infection and have linked HCMV to various autoantibodies; however, these studies were constrained by variations in sample size and potential selection bias. Therefore, in the present study, we aimed to elucidate the relationship between HCMV and autoantibodies in patients with SLE by integrating clinical data and genetic susceptibility.
Methods
Using various statistical methods, we conducted a retrospective analysis of the spectrum of SLE autoantibodies and HCMV infections among patients hospitalized at our center over the past 10 years. Machine learning modeling was used to predict active HCMV infections based on the antinuclear antibody (ANA) spectrum. Moreover, Mendelian randomization (MR) was used to investigate the causal relationship between SLE and HCMV infection.
Results
In the HCMV group, the levels of ANA, anti-dsDNA, anti-histone antibody (AHA), and anti-nucleosome antibody (ANuA) were significantly increased (P < 0.001) and were linked to the presence of CMV-pp65-antigen-positive polymorphonuclear leukocytes (P < 0.001). A weak correlation was observed between the titers of anti-CMV IgM and ANA (P < 0.001). The ANA spectrum demonstrated a strong predictive performance for active HCMV infection based on principal component analysis (Adonis and ANOSIM P < 0.001) as well as support vector machine and extreme gradient boosting modeling. MR analyses using the inverse-variance weighted, weighted median, MR-Egger, and weighted mode methods revealed that patients with SLE were at a higher risk of developing HCMV infection (P < 0.05). However, HCMV infection did not have a causal effect on SLE (P > 0.05).
Conclusion
The ANA spectrum in patients with SLE can be used to predict HCMV infection status. Because patients with SLE are inherently susceptible to HCMV infection, we propose for the first time that clinicians should be alert to HCMV infection when a patient with SLE exhibits high serum titers of ANA, anti-dsDNA, ANuA, and AHA. This can contribute to the clinical assessment of SLE and improve patient prognosis.
Background
Systemic lupus erythematosus (SLE) is a chronic autoimmune disease characterized by highly heterogeneous clinical manifestations and multifactorial involvement. SLE can affect the skin, joints, kidneys, brain, and hematological system. Notably, SLE is characterized by a wide spectrum of serum autoantibodies that target self-antigens, mainly of nuclear origin, including DNA chromatin and its component proteins, spliceosome components, and Ro/La complexes [1]. The cause of SLE remains unknown; however, genetic and environmental factors might contribute to SLE development [2]. In some individuals with SLE, autoantibodies are not detectable in the early years; these individuals exhibit clinical symptoms when their immune systems become more active over time, indicating that environmental factors play a crucial role in SLE pathogenesis [3]. Viral infections, such as those caused by human cytomegalovirus (HCMV), Epstein-Barr virus, and Parvovirus B19, are important factors that contribute to the occurrence and development of SLE [4,5,6,7].
HCMV, a member of the Herpesviridae family, contains a double-stranded (ds) DNA genome with the potential to encode more than 200 open reading frames (ORFs), and approximately 751 translated ORFs have been identified [8]. HCMV can establish lifelong latent infection [9, 10]. Globally, among adults, HCMV seroprevalence ranges from 45 to 100% [10]. Additionally, in patients with SLE, HCMV seroprevalence is as high as 90% [11]. HCMV is the most common opportunistic infection in hospitalized patients with SLE, accounting for approximately 61.1% of SLE-associated infections [12]. HCMV infection is one of the significant complications and causes of death in severely ill patients receiving pulse methylprednisolone therapy [13]. Therefore, the association between SLE and HCMV infection is substantial and should not be disregarded in clinical treatment and prognosis.
For decades, researchers have proposed that viral infections induce the occurrence and/or pathogenesis of SLE [14, 15]. The underlying mechanisms include molecular mimicry, epitope spreading, superantigen production, bystander activation, persistent viral infection, apoptotic changes, clearance defects, and epigenetic changes [16, 17]. Patients with SLE exhibit a high level of anti-CMV UL44 IgG, which can bind to nucleolin, dsDNA, and Ku70 in the cell nucleus [18]. The HCMV pp150 protein shares similar antigenic epitopes with CIP2A proteins, which are expressed on the surface of and intracellularly in CD56+ NK cells; the combination of these proteins mediates the death of CD56+ NK cells [19]. Although clinical retrospective data have provided some insights into the relationship between autoantibodies and HCMV infection in patients with SLE, the findings vary due to sample size and potential selection bias [20, 21]. Furthermore, the direction of the association between SLE and HCMV remains uncertain. While some believe that HCMV infection can potentially trigger the development of SLE, others suspect that the weakened immune responses of individuals with SLE make them more susceptible to secondary HCMV infections.
In the present study, we conducted a retrospective analysis to elucidate the relationship between HCMV infection and autoantibodies in patients with SLE hospitalized at our center over the past 10 years. Moreover, we developed artificial intelligence prediction models using machine learning and investigated the genetic causality between SLE and HCMV infection using bidirectional two-sample Mendelian randomization (TSMR) analysis. This study can provide new insights for basic research and predictive antibody markers for physicians.
Methods
The study design and workflow are shown in Fig. 1.
Patients
This study included patients with SLE who received inpatient treatment at Peking Union Medical College Hospital (PUMCH) between January 2012 and December 2021. All patients fulfilled the 1997 American College of Rheumatology (ACR) criteria [22] or the 2019 European Alliance of Associations for Rheumatology (EULAR)/ACR classification criteria [23], as confirmed by two or more experts.
Exclusion criteria: Patients were excluded if their medical records showed a diagnosis of bacterial, parasitic, spirochetal, chlamydial, rickettsial, or viral infection, or of mixed infection. In addition, patients with missing peripheral blood test information for CMV-DNA, CMV phosphoprotein 65 (PP65) antigen, anti-CMV-IgM antibodies, or anti-CMV-IgG antibodies were excluded.
Definition and grouping
All cases that met the inclusion criteria (“eligible cases”) were classified into the HCMV group, the non-HCMV group, or other cases, based on whether they were accompanied by active HCMV infection. Active HCMV infection refers to a positive CMV DNA test, a positive CMV pp65 antigen test, or the presence of CMV inclusion bodies [24, 25]. Few patients in this study underwent anti-CMV IgM testing twice within a short period, so it was difficult to determine active HCMV infection based on an increase in anti-CMV IgM titers. Thus, cases with elevated anti-CMV IgM titers but negative CMV-PP65 and CMV-DNA results were classified as other cases.
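The grouping rule above can be sketched as a small function. This is an illustrative sketch only, not the authors' code: the function name and boolean inputs are hypothetical, and it assumes that the non-HCMV group consists of cases with no positive marker and no elevated anti-CMV IgM, which the text implies but does not state outright.

```python
def classify_case(cmv_dna_positive: bool,
                  pp65_positive: bool,
                  inclusion_bodies: bool,
                  igm_elevated: bool) -> str:
    """Assign one eligible case to 'HCMV', 'other', or 'non-HCMV'.

    Hypothetical helper illustrating the grouping rule in the text.
    """
    # Active infection: any one of the three markers is sufficient.
    if cmv_dna_positive or pp65_positive or inclusion_bodies:
        return "HCMV"
    # Elevated IgM alone cannot establish active infection here,
    # so these cases are kept apart as "other".
    if igm_elevated:
        return "other"
    # Assumption: everything else belongs to the non-HCMV group.
    return "non-HCMV"
```

For example, a case with a positive pp65 antigen test and elevated IgM is still assigned to the HCMV group, because the active-infection markers take precedence.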
Data collection
Upon admission, we collected the following information: demographic characteristics such as sex and age; clinical characteristics including organ involvement; and laboratory findings including CMV DNA, CMV pp65, anti-CMV IgM antibodies, anti-CMV IgG antibodies, anti-nuclear antibody spectrum (anti-nuclear antibody (ANA), anti-double-stranded DNA (dsDNA) antibody, anti-SSA antibody, anti-SSB antibody, anti-ribosomal RNP (rRNP) antibody, anti-histone antibody (AHA), anti-nucleosome antibody (ANuA), anti-PM-Scl antibody, anti-Ro52 antibody, anti-Smith (Sm) antibody, anti-Jo-1 antibody, anti-centromere protein B (CENP) antibody, anti-mitochondrial M2 antibody (AMA-M2), anti-ribonucleoprotein (RNP) antibody, anti-Scl-70 (topoisomerase I) antibody, and anti-proliferating cell nuclear antigen (PCNA) antibody).
Laboratory assays
Real-time fluorescence quantitative polymerase chain reaction (RT-qPCR) was used to quantify CMV DNA. A positive threshold of ≥ 500 copies/ml HCMV DNA was used.
The CMV pp65 assay uses a monoclonal antibody indirect immunofluorescence (IF) method to detect the CMV pp65 antigen. CMV-pp65-antigen results were considered positive at > 1/(2 × 10⁵) polymorphonuclear leukocytes (PMNs).
An enzyme-linked immunosorbent assay was used to detect anti-CMV IgM/IgG antibodies in peripheral blood. The result was determined by comparing the value of the sample (S) to the cutoff (CO); a ratio (S/CO) greater than 0.9 indicated a positive result.
To detect antinuclear antibodies (ANAs) and anti-dsDNA antibodies, immunofluorescence assays were performed using HEp-2 cells. ANAs in the samples were initially screened at a dilution of 1:80. If a sample showed a positive result, it was tested at specific dilutions (1:80, 1:160, 1:320, 1:640, and 1:1280) to determine the titer. For anti-dsDNA antibodies, initial screening was performed at a dilution of 1:10. If a sample tested positive, the titer was determined at dilutions of 1:10, 1:20, 1:40, 1:80, 1:160, and 1:320. The highest dilution factor that yielded a positive result was reported as the positive titer.
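The titer rule (report the highest dilution that is still positive) can be written as a short sketch. The function name and the dictionary input format are hypothetical choices made for illustration; only the dilution series comes from the text.

```python
from typing import Dict, Optional

# Dilution series from the text, stored as denominators (80 means 1:80).
ANA_DILUTIONS = (80, 160, 320, 640, 1280)
DSDNA_DILUTIONS = (10, 20, 40, 80, 160, 320)


def positive_titer(results: Dict[int, bool]) -> Optional[int]:
    """Return the denominator of the highest positive dilution.

    `results` maps a dilution denominator to True (positive) or
    False (negative). Returns None when no dilution is positive,
    i.e. the sample is negative at screening.
    """
    positives = [dilution for dilution, is_positive in results.items() if is_positive]
    return max(positives) if positives else None
```

A sample that is positive at 1:80 and 1:160 but negative at 1:320 would be reported with a titer of 1:160.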
A linear immunoassay (LIA) was used to measure the levels of IgG antibodies, including anti-SSA, anti-SSB, anti-rRNP, AHA, ANuA, anti-PM-Scl, anti-Ro52, anti-Sm, anti-Jo-1, anti-CENP, AMA-M2, anti-RNP, anti-Scl-70, and anti-PCNA antibodies. LIA scores were classified as follows: 0–15, negative (-); 16–36, weakly positive (+); 37–71, positive (++); and > 72, strongly positive (+++).
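The conversion of a continuous LIA score into the ordinal grade used in later analyses can be sketched as follows. The function name is hypothetical, scores are assumed to be whole numbers, and a score of exactly 72 (which the stated ranges leave unassigned) is treated here as strongly positive; that last choice is an assumption, not something the text specifies.

```python
def lia_grade(score: int) -> str:
    """Map an LIA score to the ordinal grade described in the text."""
    if score < 0:
        raise ValueError("LIA score cannot be negative")
    if score <= 15:
        return "-"    # negative
    if score <= 36:
        return "+"    # weakly positive
    if score <= 71:
        return "++"   # positive
    # Assumption: 72 is grouped with the '> 72' band.
    return "+++"      # strongly positive
```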
Statistical analysis
Categorical variables were presented as absolute and relative frequencies or prevalence. Associations between categorical variables were tested using the χ² test and the Cochran–Armitage test (CAT) for trend analysis, whereas rank-biserial correlation analysis was used to examine the strength of the relationship. Principal component analysis (PCA) was conducted to reduce the dimensionality of the ordinal variables in the ANA spectrum, considering that ordinal variables are frequently utilized in clinical practice. Moreover, the adonis and anosim functions in R were applied to the ANA ordinal variables to determine whether the ANA spectra of the two groups differed significantly. The quantitative data were analyzed using the Wilcoxon rank-sum test or t-test, depending on the data distribution. Spearman’s correlation analysis was used to analyze the correlation between continuous and ordinal variables. An |r| between 0.2 and 0.4 indicated a weak correlation, an |r| between 0.4 and 0.7 indicated a moderate correlation, and an |r| > 0.7 indicated a strong correlation.
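To make the trend analysis concrete, the following is a minimal sketch of the Cochran–Armitage test for a 2 × k table, written in plain Python rather than with the R packages used in the study. The function name, the default equally spaced scores, and the example counts in the usage note are all illustrative assumptions; the variance uses the common large-sample form.

```python
import math
from typing import Optional, Sequence, Tuple


def cochran_armitage(cases: Sequence[int],
                     totals: Sequence[int],
                     scores: Optional[Sequence[float]] = None) -> Tuple[float, float]:
    """Cochran-Armitage trend test across ordered categories.

    cases[i]  : number of cases (e.g. HCMV group) in titer level i
    totals[i] : number of all patients in titer level i
    scores[i] : numeric score of level i (defaults to 0, 1, 2, ...)
    Returns (z, two-sided p-value) from the normal approximation.
    """
    if scores is None:
        scores = list(range(len(cases)))
    n_total = sum(totals)
    p_bar = sum(cases) / n_total  # overall proportion of cases

    # Score-weighted deviation of observed from expected case counts.
    t_stat = sum(s * (x - n * p_bar) for s, x, n in zip(scores, cases, totals))

    # Variance of t_stat under the null hypothesis of no trend.
    sum_ns = sum(n * s for s, n in zip(scores, totals))
    sum_ns2 = sum(n * s * s for s, n in zip(scores, totals))
    variance = p_bar * (1.0 - p_bar) * (sum_ns2 - sum_ns ** 2 / n_total)

    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided
    return z, p_value
```

With made-up counts such as `cochran_armitage([10, 20, 30], [40, 40, 40])`, the proportion of cases rises with the titer level, so z is positive and the p-value is small; identical proportions in every level give z = 0.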
Machine learning
The data were randomly split, with 80% allocated for training and 20% for validation. Two machine learning algorithms, a Support Vector Machine (SVM) and a tree-based ensemble method called Extreme Gradient Boosting (XGBoost), were employed for model construction. The train function in the ‘caret’ package was used to evaluate the effect of model tuning parameters on performance, and tenfold cross-validation was applied for resampling to obtain the best-tuned parameters. The predictive models were assessed based on their discrimination, which was measured quantitatively using the area under the receiver operating characteristic (ROC) curve (AUC).
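As an illustration of the discrimination metric only (not of the authors' caret pipeline), the AUC can be computed directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The function name and the 0/1 label coding are assumptions made for this sketch.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a positive case (e.g. active HCMV infection), 0 otherwise
    scores: model output, higher meaning 'more likely positive'
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of cases, which is acceptable for a test set of a few dozen patients.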
Bidirectional TSMR analysis
TSMR is a novel approach that employs genetic information to investigate causal relationships [26, 27]. A TSMR analysis is based on three assumptions [28]. First, the instrumental variables (IVs) selected from the datasets must be directly associated with the exposure. Second, there should be no relationship between the IVs and any other factors that could potentially confound the relationship between the exposure and the outcome. Finally, the IVs should be connected to the outcome only through their association with the exposure, and not through any other pathway (Fig. 1). We conducted a bidirectional TSMR analysis to investigate the causal direction of the relationship between HCMV and SLE. First, we considered SLE as the exposure and HCMV infection as the outcome. Then, we considered HCMV infection as the exposure and SLE as the outcome.
Data sources
Summary statistics for SLE were taken from a genome-wide meta-analysis comprising 5,201 patients with SLE and 9,066 controls of European descent [29]. The SLE dataset (GWAS ID: ebi-a-GCST003156) is available from the IEU Open GWAS project (https://gwas.mrcieu.ac.uk/datasets). The GWAS summary data for HCMV infection were obtained from the FinnGen GWAS summary database (https://r9.finngen.fi/pheno/CMV_NOS), with 451 cases and 376,730 controls.
Instrumental variable selection and quality control
Candidate IVs were selected based on single nucleotide polymorphisms (SNPs) that were highly correlated with exposure. For SLE, SNPs with a P-value < 5 × 10⁻⁸ were taken into consideration. For HCMV infection, we selected SNPs with a less strict P-value threshold of < 5 × 10⁻⁶. This approach increased the number of SNPs that could be further analyzed [30]. Candidate IVs were analyzed for linkage disequilibrium (LD) using the 1000 Genomes Project European sample data as a reference panel. The LD correlation coefficient threshold was set to r² < 0.001, and the clumping window size was set to 10,000 kb. We used PhenoScanner V2 [31, 32] to identify potential confounders and avoid the issue of horizontal pleiotropy. Any IV correlated with risk factors related to the outcome was excluded. To detect weak-instrument bias, we calculated the F-statistic using a prescribed formula (Eq. A and Eq. B) [34, 35] and removed IVs with a value below 10 [33, 34]. Finally, the exposure and outcome data for the SNPs were harmonized to ensure that the effect alleles of the SNPs on exposure aligned with the effect alleles on the outcome. Any unmatched SNPs were excluded from the analysis. The harmonization process also involved identifying and eliminating palindromic and ambiguous SNPs.
\(R^{2}\) represents the exposure variance for each IV, \(\text{EAF}\) is the effect allele frequency, \(\beta_{\text{exposure}}\) and \(\text{SE}_{\text{exposure}}^{2}\) stand for the estimated effect and standard error of SNPs on the exposure, and \(N\) is the sample size.
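For orientation, one widely used pair of formulas that is consistent with the symbols defined above is given below. It is offered as an assumption about the general form of such calculations, not as a transcription of the authors' Eq. A and Eq. B.

\[
R^{2} = \frac{2 \times \text{EAF} \times (1-\text{EAF}) \times \beta_{\text{exposure}}^{2}}{2 \times \text{EAF} \times (1-\text{EAF}) \times \beta_{\text{exposure}}^{2} + 2 \times \text{EAF} \times (1-\text{EAF}) \times N \times \text{SE}_{\text{exposure}}^{2}}
\]

\[
F = \frac{R^{2} \times (N-2)}{1-R^{2}}
\]

Under this form, an IV with \(F < 10\) would be treated as a weak instrument and removed, matching the threshold stated in the text.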
Pleiotropy effect analysis
To detect potential horizontal pleiotropy effects, the MR-PRESSO (NbDistribution = 10000) and MR-Egger regression tests were utilized. The MR-PRESSO outlier test assessed the significance of individual SNPs in terms of pleiotropy, while the MR-PRESSO global test provided an overall p-value for horizontal pleiotropy. The SNPs were arranged in ascending order based on their MR-PRESSO outlier test p-values and eliminated one by one. After removing each SNP, the MR-PRESSO global test was conducted on the remaining SNPs. This iterative process continued until the global test p-value was no longer statistically significant (P > 0.05). Prior to the Mendelian randomization (MR) analysis, all SNPs exhibiting pleiotropy were eliminated.
MR analysis
Various methods commonly used for causal inference include inverse-variance weighted (IVW), weighted median (WM), MR-Egger regression, simple mode, and weighted mode methods. The IVW method is particularly important for causal detection in TSMR analysis without horizontal pleiotropy [36]. Other methods, such as weighted median [37], MR-Egger regression [28], simple mode [38], and weighted mode [39], can be used in conjunction with IVW to expand the range of confidence intervals [40]. If both IVW and WM methods consistently show the same direction and estimate of the causal effect, and the p-value of the IVW method is less than 0.05, a causal conclusion can be drawn. To assess the robustness of the findings, sensitivity analysis was performed, including a leave-one-out analysis to identify any influential single SNP on the results [41].
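The IVW estimator at the center of this analysis can be illustrated with a short sketch. This is a fixed-effect IVW with first-order weights written in plain Python, not the TwoSampleMR implementation used in the study; the function name and argument names are hypothetical, and the inputs are assumed to be harmonized per-SNP summary statistics.

```python
import math
from typing import Sequence, Tuple


def ivw_estimate(beta_exposure: Sequence[float],
                 beta_outcome: Sequence[float],
                 se_outcome: Sequence[float]) -> Tuple[float, float, float]:
    """Fixed-effect inverse-variance weighted causal estimate.

    Each position describes one SNP: its effect on the exposure,
    its effect on the outcome, and the standard error of the latter.
    Returns (estimate, standard error, two-sided p-value).
    """
    # Weighted regression of outcome effects on exposure effects,
    # through the origin, with weights 1 / se_outcome^2.
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))

    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    p_value = math.erfc(abs(estimate / std_error) / math.sqrt(2.0))
    return estimate, std_error, p_value
```

If every SNP's outcome effect is exactly half its exposure effect, the function returns an estimate of 0.5, which is the slope that the IVW regression line would show in a scatter plot of SNP effects.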
Heterogeneity
Cochran’s IVW Q statistics were used to assess the variability of IVs. Heterogeneity is indicated when the Q statistic p-value < 0.05 [42, 43].
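A minimal sketch of the Q statistic follows, assuming first-order IVW weights and harmonized summary statistics; the function and argument names are hypothetical. Q is the weighted sum of squared deviations of each SNP's Wald ratio from the pooled IVW estimate and is compared against a chi-square distribution with k − 1 degrees of freedom, where k is the number of SNPs.

```python
from typing import Sequence, Tuple


def cochran_q(beta_exposure: Sequence[float],
              beta_outcome: Sequence[float],
              se_outcome: Sequence[float]) -> Tuple[float, int]:
    """Cochran's Q for heterogeneity among per-SNP causal estimates.

    Returns (Q, degrees of freedom). The p-value is left to a
    statistics library's chi-square survival function.
    """
    # Per-SNP Wald ratio and its first-order inverse-variance weight.
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]

    # The weighted mean of the ratios equals the IVW estimate.
    pooled = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)

    q_stat = sum(w * (r - pooled) ** 2 for w, r in zip(weights, ratios))
    return q_stat, len(ratios) - 1
```

When all SNPs imply the same causal effect, Q is close to zero; a Q that is large relative to its degrees of freedom indicates heterogeneity among the IVs.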
All statistical analyses, modeling processes, and MR analyses were performed using R software [44] (v 4.1.3) and R packages including ‘DescTools’ (v 0.99.50), ‘rcompanion’ (v 2.4.34), ‘vegan’ (v 2.6-4), ‘caret’ (v 6.0–94), ‘xgboost’ [45] (v 1.7.5.1), ‘e1071’ (v 1.7–13), ‘TwoSampleMR’ [38, 41] (version 0.5.6), and ‘MR-PRESSO’ [46] (version 1.0). Statistical significance was set at P < 0.05.
Results
Participant characteristics
A total of 3619 patients with SLE were hospitalized at PUMCH between January 2012 and December 2021. Of these, 3152 were excluded and 467 were included in the study. Of the 467 eligible patients, 248 were in the HCMV group, 154 in the non-HCMV group, and 65 in the other cases group (Fig. 1). The average age was 29 years and 88.4% (413/467) of the participants were women. There were no statistically significant differences in age, sex, or other organ involvement between the two groups (P > 0.05, Table 1; Fig. 2).
The relationship between the ANA spectrum and HCMV infection
We found a trend of proportional changes in titration levels (ordinal variables) for ANA, anti-dsDNA, AHA, and ANuA, indicating the presence of active HCMV infection (P < 0.001; Table 2). Moreover, a higher proportion of patients in the HCMV group were associated with a higher titration level (r of the rank-biserial correlation test > 0; Fig. 3A). However, no such trend was observed for the other antibodies in the ANA spectrum (P > 0.05, Supplementary Table 1). We performed PCA on the ordinal variables (n = 232) of the ANA spectrum and found that the ANA spectrum differed significantly between the two groups (adonis: R2 = 0.0710, P < 0.001; ANOSIM: R = 0.0798, P < 0.001). Using difference analysis on continuous variables, we found that the antibody titers of AHA and ANuA in the HCMV group were significantly higher than those in the non-HCMV group (P < 0.001; Fig. 3C and Supplementary Table 2), which was consistent with the CAT results.
Interestingly, the number of CMV-pp65-antigen-positive PMNs was associated with ANA (r = 0.386), anti-dsDNA (r = 0.517), AHA (r = 0.405), and ANuA (r = 0.525) titers. In addition, anti-CMV IgM and ANA titers showed a weak correlation (r = 0.201, P < 0.001; Fig. 3D, Supplementary Table 3).
Fig. 3 (A) Rank-biserial correlation analysis. The x-axis represents the correlation coefficient (r), and the y-axis represents anti-nuclear antibodies. The light red bar represents the correlation of antibody titers with the HCMV group, and the light green bar represents the correlation of antibody titers with the non-HCMV group. The larger the |r| value, the stronger the correlation. (B) Principal component analysis. (C) Boxplot of the continuous variables of the ANA spectrum (Wilcoxon rank-sum test). (D) Heatmap of HCMV-related laboratory tests and the ANA spectrum
The ANA spectrum predicts HCMV infection status
The machine learning model incorporated ordinal variables of the ANA spectrum. The data were divided into two sets: 80% (186 cases) for training and 20% (46 cases) for testing. SVM and XGBoost were used to train the model, and both algorithms demonstrated strong predictive performance when evaluated using the test data. The AUC values for SVM and XGBoost were 0.806 (Fig. 4A) and 0.803 (Fig. 4B), respectively. The four indicators related to active HCMV infection mentioned previously (ANA, anti-dsDNA, AHA, and ANuA) were also the top four variables in the variable importance ranking of the XGBoost model.
Fig. 4 (A, B) ROC curve of training models. (C) The top nine features chosen using XGBoost and their respective variable importance scores. The x-axis represents the importance score, which is a measure of variability. The y-axis represents the top nine variables, ranked by their weighted scores
The genetic predisposition of SLE and HCMV infection
Following IV selection and quality control, 45 and 3 SNPs remained as candidate IVs for SLE and HCMV infection, respectively (Supplementary Table 5), with no weak IVs and no IVs associated with risk factors for either outcome. A total of 42 IVs for SLE and 3 for HCMV infection remained after harmonizing the exposure and outcome data (Supplementary Table 6).
The IVW, WM, MR-Egger, and weighted mode methods revealed that SLE was associated with a higher HCMV infection risk (P < 0.05, Fig. 5A and B). In addition, the simple mode method yielded causal estimates of similar magnitude and direction. Conversely, HCMV infection had no causal effect on SLE (P > 0.05; Fig. 5A).
No horizontal pleiotropy effects or outliers were found among the IVs (P > 0.05; Supplementary Table 7). Additionally, Cochran’s Q statistics indicated no noticeable heterogeneity (P > 0.05) (Supplementary Table 8). Leave-one-out analysis demonstrated that no individual SNP significantly affected the correlation between SLE and HCMV infection (Fig. 5C).
Fig. 5 (A) Forest plot of MR analysis. (B) Single nucleotide polymorphisms (SNPs) were used to assess the impact of SLE on HCMV infection, using five MR methods. The dots represent the effect size (β) of each SNP on SLE (x-axis) and HCMV infection (y-axis), and the grey crosses represent the standard errors. Regression slopes show the estimated causal effect of SLE on HCMV infection. The light blue, dark blue, light green, dark green, and red regression lines represent the inverse variance weighted method, MR-Egger regression, simple mode, weighted median method, and weighted mode, respectively. (C) Leave-one-out analysis. The sensitivity of the causal effect of different SNPs of SLE on HCMV infection was analyzed using leave-one-out analysis. The error bar depicts the 95% confidence interval using the inverse variance weighted method
Discussion
To the best of our knowledge, this SLE study included the longest time span and largest number of cases. This was a cross-sectional retrospective study that aimed to elucidate the relationship between SLE autoantibodies and cytomegalovirus infection. Of the 467 patients with SLE included in the study, 248 had an active HCMV infection, accounting for 53.1% (248/467) of the participants. A further 65 patients showed only anti-CMV IgM positivity, which suggests a recent HCMV infection; they accounted for 13.9% (65/467) of the participants. Our study revealed a high rate of active/current HCMV infection (approximately 67%) among hospitalized patients with SLE, which is consistent with previous findings [12, 47, 48]. There was no statistically significant difference in organ involvement; however, there was an increasing trend of renal involvement in the cases with active HCMV infection (P = 0.051). Anti-CMV antibodies cross-react with various nuclear components, leading to early manifestations of lupus nephritis in BALB/c mice [17, 49]. The pathogenic properties of anti-dsDNA antibodies are attributed to glomerular basement membrane binding [50].
Interestingly, using different statistical methods, we found that ANA, anti-dsDNA, AHA, and ANuA levels were significantly elevated in patients with SLE and active HCMV infection. This finding is consistent with results observed in patients with new-onset SLE [21]. The molecular mimicry theory is commonly used to explain the relationship between HCMV and SLE. In 1964, Damian introduced this theory to describe the sequence similarities between host organism proteins and infectious pathogens. This similarity can potentially disrupt the body’s self-tolerance and trigger autoimmune responses involving B and/or T cells [51]. Autoantibodies can spread to other self-antigens through B cell epitope spreading. This occurs when the structural characteristics of the two antigens are sufficiently similar to imitate each other in the immune system [52]. Many studies have reported that CMV antigens induce the production of various autoantibodies in patients with SLE or in lupus mice [6, 18, 53,54,55,56]. Patients with SLE who have higher ANA and anti-dsDNA titers have higher disease activity [57]. Our results are consistent with those of previous studies showing higher disease activity in patients with SLE and concurrent HCMV infection [58, 59]. HCMV phosphoprotein 65 (PP65) acts as a framework for the virus and is present in large amounts in viral particles outside cells. PP65 modulates viral kinase activity and suppresses immune responses against the virus [60]. The anti-CMV PP65(422–439) antibodies produced by patients with SLE may be related to increased levels of anti-dsDNA [17]. AHA and ANuA are highly associated with SLE; their diagnostic sensitivity for SLE is low, but their specificity is high [61,62,63]. A positive correlation was found between ANuA titers and the systemic lupus erythematosus disease activity index [64, 65]. Flares and antibody production in SLE are closely associated with HCMV infection.
We further studied the correlation between ANA spectrum titers and various indicators of HCMV infection (CMV pp65 assay, anti-CMV IgM, and anti-CMV IgG). Because the proportion of CMV DNA-positive cases was low, CMV DNA was not included in the correlation analysis. A positive CMV pp65 assay represents active HCMV infection, and our study showed that the number of CMV-pp65-positive PMNs correlated positively with ANA, anti-dsDNA, AHA, and ANuA titers. Anti-CMV IgM antibodies reflect recent HCMV infections and can be detected 2 weeks to 4–6 months after symptom onset [66]. This study found a weak correlation between anti-CMV IgM antibody titers and ANA, but no reliable correlation between anti-CMV IgG antibody titers and the ANA spectrum. In conclusion, our study suggests that HCMV infection affects the humoral immune response of patients with SLE, thereby exacerbating lupus activity. The stronger the HCMV infection, the higher the SLE activity.
In this study, we focused on the close relationship between HCMV infection and the ANA spectrum. To predict the presence of active HCMV infection in patients with SLE, we first used PCA and machine-learning algorithms. Our findings indicated that the ANA spectrum was able to effectively differentiate the infection status; ANA, anti-dsDNA, AHA, and ANuA were the four most important predictive antibodies.
However, the association between SLE and HCMV remains unclear. This study aimed to provide an answer from a genetic perspective. We used genotypes instead of exposure factors to infer causality by employing an MR analysis, which is similar to a naturally randomized controlled trial [26, 27]. Surprisingly, we discovered that HCMV infection was not the cause of the increased SLE risk, but rather that patients with SLE themselves were susceptible to HCMV infection. This alerts clinicians to be vigilant for HCMV infection in patients with SLE, especially in those with significantly elevated levels of ANA, anti-dsDNA, AHA, and ANuA. It is also recommended that clinicians consider administering an HCMV vaccine to such patients when their conditions are stable.
Conclusions
This study underscores the complex relationship between SLE and HCMV, highlights the association between clinical ANA profiles and active HCMV infection, and suggests a genetic susceptibility to HCMV infection in the presence of SLE. In clinical practice, the ANA spectrum of patients with SLE can help predict their HCMV infection status, which can aid in the evaluation of SLE and lead to better patient outcomes.
Data availability
Data are provided within the manuscript or the supplementary information.
Abbreviations
- ACR: American College of Rheumatology
- AHA: Anti-histone antibody
- AMA-M2: Anti-mitochondrial M2 antibody
- ANA: Antinuclear antibody
- ANuA: Anti-nucleosome antibody
- AUC: Area under the curve
- CAT: Cochran–Armitage test
- CENP: Centromere protein
- HCMV: Human cytomegalovirus
- IF: Immunofluorescence
- IV: Instrumental variable
- IVW: Inverse-variance weighted
- LD: Linkage disequilibrium
- LIA: Linear immunoassay
- MR: Mendelian randomization
- ORF: Open reading frame
- PCA: Principal component analysis
- PCNA: Proliferating cell nuclear antigen
- PMNs: Polymorphonuclear leukocytes
- PP65: Phosphoprotein 65
- PUMCH: Peking Union Medical College Hospital
- RNP: Ribonucleoprotein
- ROC: Receiver operating characteristic
- RT-qPCR: Real-time quantitative polymerase chain reaction
- SLE: Systemic lupus erythematosus
- SNP: Single nucleotide polymorphism
- SVM: Support vector machine
- TSMR: Two-sample Mendelian randomization
- WM: Weighted median
- XGBoost: Extreme gradient boosting
References
Vina ER, Utset TO, Hannon MJ, Masi CM, Roberts N, Kwoh CK. Racial differences in treatment preferences among lupus patients: a two-site study. Clin Exp Rheumatol. 2014;32(5):680–8.
Silman AJ, MacGregor AJ, Thomson W, Holligan S, Carthy D, Farhan A, Ollier WE. Twin concordance rates for rheumatoid arthritis: results from a nationwide study. Br J Rheumatol. 1993;32(10):903–7.
Arbuckle MR, McClain MT, Rubertone MV, Scofield RH, Dennis GJ, James JA, Harley JB. Development of autoantibodies before the clinical onset of systemic lupus erythematosus. N Engl J Med. 2003;349(16):1526–33.
Akagi S, Ichikawa H, Suzuki J, Makino H. Systemic lupus erythematosus associated with cytomegalovirus infection. Scand J Rheumatol. 2004;33(1):58–9.
Díaz F, Urkijo JC, Mendoza F, De la Viuda JM, Blanco M, Flores M, Berdonces P. Systemic lupus erythematosus associated with acute cytomegalovirus infection. J Clin Rheumatol. 2006;12(5):263–4.
Rasmussen NS, Draborg AH, Nielsen CT, Jacobsen S, Houen G. Antibodies to early EBV, CMV, and HHV6 antigens in systemic lupus erythematosus patients. Scand J Rheumatol. 2015;44(2):143–9.
Quaglia M, Merlotti G, De Andrea M, Borgogna C, Cantaluppi V. Viral infections and systemic Lupus Erythematosus: New players in an Old Story. Viruses 2021, 13(2).
Stern-Ginossar N, Weisburd B, Michalski A, Le VT, Hein MY, Huang SX, Ma M, Shen B, Qian SB, Hengel H, et al. Decoding human cytomegalovirus. Sci (New York NY). 2012;338(6110):1088–93.
Sinnott JT, Cancio MR. Cytomegalovirus. Infect Control Hosp Epidemiol. 1987;8(2):79–82.
Cannon MJ, Schmid DS, Hyde TB. Review of cytomegalovirus seroprevalence and demographic characteristics associated with infection. Rev Med Virol. 2010;20(4):202–13.
Barber C, Gold WL, Fortin PR. Infections in the lupus patient: perspectives on prevention. Curr Opin Rheumatol. 2011;23(4):358–65.
Qin L, Qiu Z, Hsieh E, Geng T, Zhao J, Zeng X, Wan L, Xie J, Ramendra R, Routy JP, et al. Association between lymphocyte subsets and cytomegalovirus infection status among patients with systemic lupus erythematosus: a pilot study. Med (Baltim). 2019;98(39):e16997.
Hung M, Huang DF, Chen WS, Lai CC, Chen MH, Liao HT, Tsai CY. The clinical features and mortality risk factors of cytomegalovirus infection in patients with systemic lupus erythematosus. J Microbiol Immunol Infect. 2019;52(1):114–21.
Yamazaki S, Endo A, Iso T, Abe S, Aoyagi Y, Suzuki M, Fujii T, Haruna H, Ohtsuka Y, Shimizu T. Cytomegalovirus as a potential trigger for systemic lupus erythematosus: a case report. BMC Res Notes. 2015;8:487.
Ramos-Casals M. Viruses and lupus: the viral hypothesis. Lupus. 2008;17(3):163–5.
Pan Q, Liu Z, Liao S, Ye L, Lu X, Chen X, Li Z, Li X, Xu YZ, Liu H. Current mechanistic insights into the role of infection in systemic lupus erythematosus. Biomed Pharmacother. 2019;117:109122.
HoHsieh A, Wang CM, Wu YJ, Chen A, Chang MI, Chen JY. B cell epitope of human cytomegalovirus phosphoprotein 65 (HCMV pp65) induced anti-dsDNA antibody in BALB/c mice. Arthritis Res Therapy. 2017;19(1):65.
Neo JYJ, Wee SYK, Bonne I, Tay SH, Raida M, Jovanovic V, Fairhurst A-M, Lu J, Hanson BJ, MacAry PA. Characterisation of a human antibody that potentially links cytomegalovirus infection with systemic lupus erythematosus. Sci Rep. 2019;9(1):1–9.
Liu Y, Mu R, Gao Y-P, Dong J, Zhu L, Ma Y, Li Y-H, Zhang H-Q, Han D, Zhang Y. A cytomegalovirus peptide-specific antibody alters natural killer cell homeostasis and is shared in several autoimmune diseases. Cell Host Microbe. 2016;19(3):400–8.
Newkirk MM, van Venrooij WJ, Marshall GS. Autoimmune response to U1 small nuclear ribonucleoprotein (U1 snRNP) associated with cytomegalovirus infection. Arthritis Res. 2001;3(4):253–8.
Xin L. Clinical characteristics of new-onset systemic lupus erythematosus patients with human cytomegalovirus infection. Master’s thesis. Unpublished: Chinese Academy of Medical Sciences & Peking Union Medical College; 2021.
Hochberg MC. Updating the American College of Rheumatology revised criteria for the classification of systemic lupus erythematosus. Arthritis Rheum. 1997;40(9):1725.
Aringer M, Costenbader K, Daikh D, Brinks R, Mosca M, Ramsey-Goldman R, Smolen JS, Wofsy D, Boumpas DT, Kamen DL, et al. 2019 European League Against Rheumatism/American College of Rheumatology Classification Criteria for systemic Lupus Erythematosus. Arthritis Rheumatol. 2019;71(9):1400–12.
Tan YT, Shi XC, Liu XQ, Zeng XF, Zhou BT. [Clinical features and risk factors of systemic Lupus Erythematosus complicated with cytomegalovirus infection]. Zhongguo Yi Xue Ke Xue Yuan Xue Bao. 2020;42(6):749–54.
Ljungman P, Boeckh M, Hirsch HH, Josephson F, Lundgren J, Nichols G, Pikis A, Razonable RR, Miller V, Griffiths PD. Definitions of cytomegalovirus infection and disease in transplant patients for use in clinical trials. Clin Infect Dis. 2017;64(1):87–91.
Greenland S. An introduction to instrumental variables for epidemiologists. Int J Epidemiol. 2000;29(4):722–9.
Burgess S, Thompson SG. Mendelian randomization: methods for causal inference using genetic variants. CRC; 2021.
Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol. 2015;44(2):512–25.
Bentham J, Morris DL, Graham DSC, Pinder CL, Tombleson P, Behrens TW, Martín J, Fairfax BP, Knight JC, Chen L, et al. Genetic association analyses implicate aberrant regulation of innate and adaptive immunity genes in the pathogenesis of systemic lupus erythematosus. Nat Genet. 2015;47(12):1457–64.
Zhang M, Ming Y, Du Y, Xin Z. Two-sample mendelian randomization study does not reveal a significant relationship between cytomegalovirus (CMV) infection and autism spectrum disorder. BMC Psychiatry. 2023;23(1):559.
Kamat MA, Blackshaw JA, Young R, Surendran P, Burgess S, Danesh J, Butterworth AS, Staley JR. PhenoScanner V2: an expanded tool for searching human genotype–phenotype associations. Bioinformatics. 2019;35(22):4851–3.
Staley JR, Blackshaw J, Kamat MA, Ellis S, Surendran P, Sun BB, Paul DS, Freitag D, Burgess S, Danesh J. PhenoScanner: a database of human genotype–phenotype associations. Bioinformatics. 2016;32(20):3207–9.
Lawlor DA, Harbord RM, Sterne JA, Timpson N, Davey Smith G. Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat Med. 2008;27(8):1133–63.
Burgess S, Thompson SG. Avoiding bias from weak instruments in mendelian randomization studies. Int J Epidemiol. 2011;40(3):755–64.
Shim H, Chasman DI, Smith JD, Mora S, Ridker PM, Nickerson DA, Krauss RM, Stephens M. A multivariate genome-wide association analysis of 10 LDL subfractions, and their response to statin treatment, in 1868 caucasians. PLoS ONE. 2015;10(4):e0120758.
Burgess S, Butterworth A, Thompson SG. Mendelian randomization analysis with multiple genetic variants using summarized data. Genet Epidemiol. 2013;37(7):658–65.
Bowden J, Davey Smith G, Haycock PC, Burgess S. Consistent estimation in mendelian randomization with some Invalid instruments using a weighted median estimator. Genet Epidemiol. 2016;40(4):304–14.
Hemani G, Zheng J, Elsworth B, Wade KH, Haberland V, Baird D, Laurin C, Burgess S, Bowden J, Langdon R, et al. The MR-Base platform supports systematic causal inference across the human phenome. Elife. 2018;7.
Hartwig FP, Davey Smith G, Bowden J. Robust inference in summary data mendelian randomization via the zero modal pleiotropy assumption. Int J Epidemiol. 2017;46(6):1985–98.
Slob EAW, Burgess S. A comparison of robust mendelian randomization methods using summary data. Genet Epidemiol. 2020;44(4):313–29.
Hemani G, Tilling K, Davey Smith G. Orienting the causal relationship between imprecisely measured traits using GWAS summary data. PLoS Genet. 2017;13(11):e1007081.
Greco MF, Minelli C, Sheehan NA, Thompson JR. Detecting pleiotropy in mendelian randomisation studies with summary data and a continuous outcome. Stat Med. 2015;34(21):2926–40.
Bowden J, Del Greco MF, Minelli C, Zhao Q, Lawlor DA, Sheehan NA, Thompson J, Davey Smith G. Improving the accuracy of two-sample summary-data mendelian randomization: moving beyond the NOME assumption. Int J Epidemiol. 2019;48(3):728–42.
R: A Language and Environment for Statistical Computing [https://www.R-project.org/].
Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016. p. 785–94.
Verbanck M, Chen C-Y, Neale B, Do R. Detection of widespread horizontal pleiotropy in causal relationships inferred from mendelian randomization between complex traits and diseases. Nat Genet. 2018;50(5):693–8.
Varani S, Landini MP. Cytomegalovirus-induced immunopathology and its clinical consequences. Herpesviridae. 2011;2(1):6.
Rozenblyum EV, Levy DM, Allen U, Harvey E, Hebert D, Silverman ED. Cytomegalovirus in pediatric systemic lupus erythematosus: prevalence and clinical manifestations. Lupus. 2015;24(7):730–5.
Hsieh A-H, Kuo C-F, Chou IJ, Tseng W-Y, Chen Y-F, Yu K-H, Luo S-F. Human cytomegalovirus pp65 peptide-induced autoantibodies cross-reacts with TAF9 protein and induces lupus-like autoimmunity in BALB/c mice. Sci Rep. 2020;10(1):9662.
Rekvig OP. The dsDNA, Anti-dsDNA antibody, and Lupus Nephritis: what we agree on, what must be done, and what the best strategy Forward could be. Front Immunol. 2019;10:1104.
Agmon-Levin N, Blank M, Paz Z, Shoenfeld Y. Molecular mimicry in systemic lupus erythematosus. Lupus. 2009;18(13):1181–5.
Poole BD, Scofield RH, Harley JB, James JA. Epstein-Barr virus and molecular mimicry in systemic lupus erythematosus. Autoimmunity. 2006;39(1):63–70.
Hsieh A-H, Jhou Y-J, Liang C-T, Chang M, Wang S-L. Fragment of tegument protein pp65 of human cytomegalovirus induces autoantibodies in BALB/c mice. Arthritis Res Therapy. 2011;13(5):1–15.
Chen J, Zhang H, Chen P, Lin Q, Zhu X, Zhang L, Xue X. Correlation between systemic lupus erythematosus and cytomegalovirus infection detected by different methods. Clin Rheumatol. 2015;34(4):691–8.
Curtis HA, Singh T, Newkirk MM. Recombinant cytomegalovirus glycoprotein gB (UL55) induces an autoantibody response to the U1-70 kDa small nuclear ribonucleoprotein. Eur J Immunol. 1999;29(11):3643–53.
HoHsieh A, Wang CM, Wu Y-JJ, Chen A, Chang M-I, Chen J-Y. B cell epitope of human cytomegalovirus phosphoprotein 65 (HCMV pp65) induced anti-dsDNA antibody in BALB/c mice. Arthritis Res Therapy. 2017;19(1):1–14.
Gladman DD, Ibañez D, Urowitz MB. Systemic lupus erythematosus disease activity index 2000. J Rheumatol. 2002;29(2):288–91.
Lino K, Trizzotti N, Carvalho FR, Cosendey RI, Souza CF, Klumb EM, Silva AA, Almeida JR. Pp65 antigenemia and cytomegalovirus diagnosis in patients with lupus nephritis: report of a series. J Bras Nefrol. 2018;40(1):44–52.
Zeng P, Zhang F. Cytomegalovirus infection in systemic lupus erythematosus: an analysis of 121 cases. Chin J Rheumatol. 2011;15(4):249–51.
Odeberg J, Plachter B, Brandén L, Söderberg-Nauclér C. Human cytomegalovirus protein pp65 mediates accumulation of HLA-DR in lysosomes and destruction of the HLA-DR alpha-chain. Blood. 2003;101(12):4870–7.
Li H, Zheng Y, Chen L, Lin S. High titers of antinuclear antibody and the presence of multiple autoantibodies are highly suggestive of systemic lupus erythematosus. Sci Rep. 2022;12(1):1687.
Bizzaro N, Villalta D, Giavarina D, Tozzoli R. Are anti-nucleosome antibodies a better diagnostic marker than anti-dsDNA antibodies for systemic lupus erythematosus? A systematic review and a study of metanalysis. Autoimmun Rev. 2012;12(2):97–106.
Cohen MG, Pollard KM, Webb J. Antibodies to histones in systemic lupus erythematosus: prevalence, specificity, and relationship to clinical and laboratory features. Ann Rheum Dis. 1992;51(1):61–6.
Zeng Y, Xiao Y, Zeng F, Jiang L, Yan S, Wang X, Lin Q, Yu L, Lu X, Zhang Y, et al. Assessment of anti-nucleosome antibody (ANuA) isotypes for the diagnosis and prediction of systemic lupus erythematosus and lupus nephritis activity. Clin Exp Med. 2023;23(5):1677–89.
Oliveira RC, Oliveira IS, Santiago MB, Sousa Atta ML, Atta AM. High Avidity dsDNA Autoantibodies in Brazilian Women with Systemic Lupus Erythematosus: Correlation with Active Disease and Renal Dysfunction. J Immunol Res. 2015;2015:814748.
Chou S. Newer methods for diagnosis of cytomegalovirus infection. Rev Infect Dis. 1990;12(Suppl 7):727–36.
Acknowledgements
We would like to extend our appreciation to both FinnGen and IEU Open GWAS for providing invaluable datasets that were instrumental in our research. Their efforts in making these data resources available are greatly acknowledged.
Funding
This work was supported by the Beijing Natural Science Foundation (L222085). The funders had no role in study design, data collection or manuscript preparation.
Author information
Contributions
LX, QLL and YX contributed to the study design. LX and QLL collected the data and wrote the manuscript. LX completed the statistical analysis and computational aspects. LX and QLL contributed equally to this study. LQT and RHT guided the machine learning modeling. LY and MJQ organized the data. All authors contributed to the article, reviewed the manuscript, and approved the submitted version.
Ethics declarations
Ethics approval and consent to participate
This study received ethics approval from the Peking Union Medical College Hospital Ethics Committee (approval number I-22PJ565). As a retrospective study analyzing pre-existing, de-identified data, it did not require informed consent from participants. For the public databases used, the current study strictly adheres to the ethics committee’s original permissions and guidelines.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Luo, X., Quan, L., Lin, Q. et al. Integrating clinical data and genetic susceptibility to elucidate the relationship between systemic lupus erythematosus and human cytomegalovirus infection. Virol J 21, 311 (2024). https://doi.org/10.1186/s12985-024-02578-6
DOI: https://doi.org/10.1186/s12985-024-02578-6