
25OHD and BMD GWAS summary statistics
Details of the contributing GWAS consortiums are listed in Table 2. To obtain a more comprehensive and reliable conclusion of the causal link between 25OHD and BMD, we selected studies investigating traits related to 25OHD or BMD, having the largest sample sizes and consisting of the most similar populations while minimizing dataset overlap.
We retrieved summary data for the association between SNPs and the serum 25OHD concentration from the UK Biobank (UKB) with phenotype, genotype, and clinical information on 417,580 individuals of European ancestry (age ranging from 40 to 69 years old)25. Serum 25OHD levels were measured quantitatively in blood samples collected from 2006 to 2013 by a chemiluminescent immunoassay (CLIA) in nmol/L. Participants with 25OHD concentrations below or above the validated range for the assay (10–375 nmol/L) were excluded. Individuals were identified by projecting the UKB sample to the first two principal components of the 1,000 Genome Project using Hap Map 3 SNPs with MAF > 0.01 in both data sets. Genotype data were quality-controlled and imputed to the Haplotype Reference Consortium and UK10K reference panels by the UKB group. Genetic variants with a minor allele count > 5 and imputation score > 0.3 for all individuals were extracted and converted genotype probabilities to hard-call genotypes using PLINK2. In total, 8,806,780 variants, including 260,713 SNPs on the X chromosome, were available for analysis. A linear mixed model GWAS was performed to identify the associations between genetic variants and the 25OHD concentration. Information regarding the quality control and statistical analyses has been reported previously25.
The femoral neck, lumbar spine, and forearm are the three common skeletal sites of postmenopausal women and men 50 years or older for measurement of BMD based on DXA. Total body BMD (TB-BMD) GWAS summary data is used to estimate the general effect of 25OHD on whole-body BMD. TB-BMD measurement is the most appropriate method for an unbiased assessment of BMD variation in the same skeletal site from childhood to old age46. GWAS summary statistics for BMD (unit, g/cm2) were downloaded from the GEFOS consortium (http://www.gefos.org/). We could also download GWAS summary statistics of BMD from the publicly available GWAS catalog website (https://www.ebi.ac.uk/gwas/downloads/summary-statistics) or the IEU GWAS database (https://gwas.mrcieu.ac.uk/datasets/). Three separate GWAS summary statistics of European participants’ femoral neck BMD (FN-BMD, n = 49,988), lumbar spine BMD (LS-BMD, n = 44,731), and forearm BMD (FA-BMD, n = 10,805) were downloaded from GEFOS; it was the largest GWAS on DXA-measured BMD to date45. A meta-analysis comprising 56,284 individuals of European ancestry was performed to investigate the genetic determinants of TB-BMD. The meta-analyzed effect size estimates were used in this study. Moreover, summary statistics for TB-BMD across the lifespan include 0–15 years (n = 11,807), 15–30 years (n = 4180), 30–45 years (n = 10,062), 45–60 years (18,805), and 60 or more years (n = 22,504), in which participants were mainly of the European ancestry (86%)46.
Instrumental variables
From the GWAS summary data of 25OHD, we conducted a series of quality control steps to select eligible instrumental SNPs. Firstly, we extracted all SNPs that strongly and independently (R2 < 0.001) predicted exposure at genome-wide significance (p < 5E−08). Secondly, SNPs with a minor allele frequency (MAF) of < 0.01 were excluded to avoid the potential statistical bias from the original GWAS since they usually carry low confidence. Thirdly, extracting data for the above-selected SNPs from the outcome trait (BMD) GWAS summary. By default, if a particular requested SNP were not present in the outcome GWAS, then an SNP (proxy) that was in linkage disequilibrium (LD) (R2 > 0.8) from LDlink (https://ldlink.nci.nih.gov/) with the requested SNP (target) would be searched for instead. LD proxies were defined using 1000 genomes of European sample data. The effect of the proxy SNP on the outcome was returned, along with the proxy SNP, the effect allele of the proxy SNP, and the corresponding allele (in-phase) for the target SNP54,55. Fourthly, the effect of ambiguous SNPs with nonconcordant alleles (e.g., A/G vs. A/C) and palindromic SNPs with an ambiguous strand (i.e., A/T or G/C) was corrected, or the ambiguous and palindromic SNPs were directly excluded from the above-selected instrument SNPs in harmonizing process to ensure that the effect of an SNP on the exposure, and the effect of that same SNP on the outcome, corresponded to the same allele. These stringently selected SNPs were used as the instrumental variables for subsequent two-sample MR analysis.
According to the assumptions of MR analysis, the selected instrumental SNPs should be strongly associated with exposure. We calculated the F statistic to test whether there was a weak instrumental variable bias, namely, genetic variants selected as instrumental variables had a weak association with exposure. If the F statistic is much greater than 10 for the instrument-exposure association, the possibility of weak instrumental variable bias is small56.
Mendelian randomization estimates
MR analysis uses genetic variants as instrumental variables to estimate the causative effect of exposure variables on an outcome. The study combined the summary statistics (β coefficients and standard errors) to estimate the causal associations between 25OHD and BMDs (including FN-BMD, LS-BMD, FA-BMD TB-BMD, and TB-BMD with different age groups) using different methods. Several robust methods have been proposed since it is unlikely that all genetic variants would be valid instrumental variables. The methods which included inverse variance weighted (IVW) with fixed effects (IVW-fixed) and random effects (IVW-random), MR-Egger regression, weighted median (WM), robust adjusted profile score (MR.RAPS) method, and MR-Pleiotropy RESidual Sum and Outlier (MR-PRESSO) method were based on different assumptions.
The IVW method uses a meta-analysis approach to combine Wald estimates for each SNP (i.e., the β coefficient of the SNP for BMD divides by the β coefficient of the SNP for 25OHD) to get the overall estimates of the effect of 25OHD on BMD57. If there is no violation of the IV2 assumption (no horizontal pleiotropy), or the horizontal pleiotropy is balanced, an unbiased causal estimate can be obtained by IVW linear regression58,59. Fixed and random effects IVW approaches are available. If significant heterogeneity (P < 0.05) is observed, a random-effect IVW model is applied.
If there is a particular direction of the horizontal pleiotropic effect, then constraining the slope to go through zero will introduce bias. Egger regression which allows the intercept to pass through a value other than zero, will relax the constraint. Based on the assumption of InSIDE, the MR-Egger regression performs a weighted linear regression of the outcome coefficients on the exposure coefficients60. Under the Instrument Strength Independent of Direct Effect (InSIDE) condition that instrument-exposure and pleiotropic effects are uncorrelated, it gives a valid test of the null causal hypothesis and a consistent causal effect estimate even when all the genetic variants are invalid IVs61,62.
The weighted median approach will provide an unbiased estimate of the causal effect in unbalanced horizontal pleiotropy even when up to 50% of SNPs are invalid IVs (e.g., due to pleiotropy)63.
MR-PRESSO is a method for detecting and correcting outliers in IVW linear regression. MR-PRESSO has three components, including (a) detection of horizontal pleiotropy (MR-PRESSO global test), (b) correction for horizontal pleiotropy via outlier removal (MR-PRESSO outlier test), and (c) testing of significant differences in the causal estimates before and after correction for outliers (MRPRESSO distortion test). The MR-PRESSO outlier test requires that at least 50% of the variants are valid instruments, have balanced pleiotropy, and rely on the InSIDE assumption64.
However, MR-Egger estimates may be inaccurate and strongly influenced by outlying genetic variants. The weighted median estimate, which does not require the InSIDE assumption, has been confirmed to have distinct superiorities over MR-Egger for its improved power of causal effect detection and lower type I error63. When the InSIDE assumption is valid, and the percentage of horizontal pleiotropic variants is small (≤ 10%), the causal estimate of the MR-PRESSO outlier adjustment is less biased and has better precision (smaller standard deviation) than MR-Egger. However, when the percentage of horizontal pleiotropic variants is high (≥ 50%), the opposite is found64. The weighted median has less bias and less precision in the causal estimate than the MR-PRESSO outlier test, mainly when the percentage of horizontal pleiotropic variants is < 50%64.
Since we might include weak instrumental variables in the analyses, we developed a recently proposed method called robust adjusted profile score (MR.RAPS) to make our results more reliable65. This method is robust to systematic and idiosyncratic pleiotropy and can give a robust inference for MR analysis with many weak instruments. It can correct for pleiotropy using robust adjusted profile scores and is recommended to routinely use the RAPS estimator in practice, especially if the exposure and the outcome are both complex traits.
Analyses were performed using R version 4.1.2 (R Foundation for Statistical Computing, Vienna, Austria) using the TwoSampleMR R package of MR-Base (https://github.com/MRCIEU/TwoSampleMR)58. The results reported the effect size as the effect of a one-standard-deviation (1-SD) change in rank-based inverse normal transformed (RINT) 25OHD level. Results were considered insignificant if P values ≥ 0.05 for all MR methods. If the estimates of different methods were inconclusive, the link between exposure and outcome phenotype with an adjusted P value < 0.05/5 = 0.01 (Bonferroni correction for multiple testing) was considered significant.
Sensitivity analyses
We conducted the IVW and MR-Egger66 methods to detect heterogeneity. The Cochran’s Q statistic quantified heterogeneity, and a P value of < 0.05 would be regarded as significant heterogeneity. To identify potentially influential SNPs, we performed a “leave-one-out” sensitivity analysis. We excluded one SNP at a time and performed an IVW-random method on the remaining SNPs to identify the potential influence of outlying variants on the estimates.
The intercept of the MR-Egger regression line was used to quantify the amount of horizontal pleiotropy present in the data averaged across the genetic instruments61,62. Under the InSIDE assumption, the MR-Egger intercept test identifies directional horizontal pleiotropy if the intercept from the MR-Egger analysis is not equal to zero62.
We further applied the MR-PRESSO global test64 to reduce heterogeneity in estimating the causal effect of removing SNPs that were horizontal pleiotropic outliers. We conducted this analysis using the MR-PRESSO R package (https://github.com/rondo lab/MR-PRESSO). The number of distributions was set to 1000, and the threshold was set to 0.05.
When selecting SNPs from a very large GWAS, it can be challenging to determine whether an SNP satisfies the second and third MR assumptions. To reduce the risk of pleiotropy, we repeated our analyses using SNPs in or near genes (DHCR7, GC, CYP2R1, and CYP24A1) encoding enzymes and carrier proteins involved in vitamin D synthesis or metabolism as instrumental variables.
Multivariable MR analysis
One advantage of multivariable MR (MVMR) analysis compared to univariable MR is that SNPs that are thought to affect multiple exposures potentially, or where it is not clear exactly which exposure they affect, can be included when estimating the effects of the exposures on the outcome. This makes MVMR particularly useful when the exposures are closely related, or one (or more) is thought to be a potential pleiotropic pathway from the SNPs to the outcome. Meanwhile, MVMR does not introduce collider bias into the results. Because of a solid phenotypic association between vitamin D and BMI, we included BMI in an MVMR analysis to explore the direct effect of vitamin D on BMD after adjusting for BMI. Summary-level data for BMI was drawn from a meta-analysis comprising 322,154 individuals of European ancestry from the Genetic Investigation of Anthropometric Traits (GIANT) consortium67, without sample overlap with our 25OHD GWAS.
Replication analysis
To increase the power of analysis in our study, the GWAS summary statistics for ultrasound-measured heel BMD were also drawn from the UKB, including 394,929 European participants68. While performing a two-sample MR analysis where both effects for exposure and outcome come from the same population might induce weak instrument bias or bias due to winners’ curse, we extracted effects of the above-selected SNPs from the SUNLIGHT consortium GWAS including 79,366 European participants and studied the association69.
Power assessment
We estimated the power of our study according to a method suggested by Brion et al.70. The equations use an approximate linear model, which requires the proportion of phenotypic variation explained by IV SNPs, the effect size of the exposure to the outcome at the epidemiological level, sample size, and the standard deviation (SD) of the exposure and outcome.
Procedures of MR analysis
Our study first performed MR analysis with all the above-selected SNPs as IVs. If the MR-PRESSO analysis detected a significant horizontal pleiotropy, we removed the outlier variants (with a P value less than the threshold in the MR-PRESSO outlier test) and performed MR analysis again. After the MR-PRESSO outlier removal step, if the heterogeneity was still significant, we would perform MR analysis under the condition of removing all the SNPs. The P value was less than 1 in the MR-PRESSO outlier test. At last, if potentially influential SNPs were identified in the “leave-one-out” sensitivity analysis, we should conclude with caution.
Ethics
Our analysis used published studies or publicly available GWAS summary data. No original data was collected for this manuscript, and thus, no ethical committee approval was required. Each study included was approved by their institutional ethics review committees, and written informed consent was provided by all participants or by their parents in the case of children.