Health / Tue, 23 Apr 2024 BMC Medicine

Very short sleep duration reveals a proteomic fingerprint that is selectively associated with incident diabetes mellitus but not with incident coronary heart disease: a cohort study - BMC Medicine

The Malmö Diet and Cancer (MDC) study is a population-based, prospective study in the city of Malmö, Sweden. Men and women aged 45–73 years were randomly selected between the years 1991 and 1996 and recruited for a baseline examination. Participants provided anthropometric data and blood samples and answered a detailed lifestyle questionnaire on heredity, socioeconomic variables, social network, occupation, physical activity, alcohol consumption, smoking, diseases, and medication. Details of the study have been described elsewhere [10]. Between 1991 and 1994, 6103 individuals from the MDC study were randomly selected to participate in the MDC Cardiovascular Cohort (MDC-CC). The purpose of the MDC-CC is to study the epidemiology of carotid artery disease [11]. MDC-CC participants underwent detailed examinations, including ultrasonography of the carotid artery, and provided plasma for the measure of novel proteomic markers.

For the purpose of the present study, MDC-CC participants were excluded if they had prevalent DM (n = 293) or a fasting whole blood glucose concentration ≥ 6.1 mmol/l (n = 261) or prevalent CHD (n = 100) at baseline (Fig. 1). Prevalent DM was defined as: having a measured fasting whole blood glucose ≥ 6.1 mmol/l (corresponding to fasting plasma glucose concentration ≥ 7 mmol/l) at the MDC baseline examination, a self-reported history of physician diagnosed DM, the use of DM medication according to the MDC baseline questionnaire, or being diagnosed and registered in any of the local or national diabetes registries as described elsewhere [12]. Participants were further excluded if they had provided incomplete information on sleep duration (n = 294) or if their sleep duration represented outlier values of more than 3 interquartile ranges below or above the first and fourth quartiles, respectively (n = 15), had missing data on kidney function, as measured by cystatin C (n = 677), or missing data for any of the covariates (n = 267). In order to minimize reverse causation bias, participants who were diagnosed with incident DM (n = 41) or incident CHD (n = 24) in the first 3 years of follow-up were also excluded. Finally, those with missing data for any of the proteomic markers (n = 552) or who had levels of proteomic markers that represented outlier values of more than 3 interquartile ranges below or above the first and fourth quartiles, respectively (n = 243), were also excluded. The final sample for analyses consisted of 3336 participants. All participants were followed from starting point until December 31, 2018, with person-years calculated from starting point to incident DM and incident CHD, date of death, or end of follow-up period, whichever came first. Participants who were diagnosed with both incident DM and incident CHD were included in the analyses of both outcomes unless the incident CHD event was fatal while preceding incident DM.
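The outlier rule used for sleep duration and proteomic markers can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the fences sit 3 interquartile ranges below the first quartile and above the third quartile (one common reading of the rule), uses the "inclusive" quantile method, and runs on made-up sleep values.

```python
import statistics

def iqr_fences(values, k=3.0):
    """Return (lower, upper) fences placed k interquartile ranges beyond the quartiles.

    Assumption: fences are Q1 - k*IQR and Q3 + k*IQR; the quantile method is also assumed.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_outlier(value, fences):
    """True if the value lies outside the (lower, upper) fences."""
    lower, upper = fences
    return value < lower or value > upper

# Made-up sleep durations in hours, for illustration only.
sleep_hours = [6.0, 6.5, 7.0, 7.0, 7.5, 7.5, 8.0, 8.0, 8.5, 14.0]
fences = iqr_fences(sleep_hours)
flagged = [h for h in sleep_hours if is_outlier(h, fences)]
print(fences, flagged)
```

The same function could be applied per proteomic marker before the final sample is assembled.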

Fig. 1 Flowchart of the inclusion and exclusion of study participants

The MDC study was approved by the ethics committee at Lund University, and all participants provided written informed consent.

Proteomic markers

The 92 proteomic markers investigated in this study were from the Olink Proseek Multiplex CVD 1 panel (Additional file 1: Table S1). The proteomic markers were measured in stored fasting plasma specimens from the MDC-CC baseline examination. The specimens were immediately frozen to − 80 °C following collection. Plasma concentrations of the proteomic markers were quantified using a validated high-specificity immunoassay, the Proximity Extension Assay (PEA), which has been described in detail elsewhere [13, 14].

Concentrations of proteomic markers were provided on a logarithmic (log2) scale. Any markers that were below the limit of detection were provided a missing value. For the purpose of this study, 14 protein markers with missing information for ≥ 10% (n = 413) of the study population were excluded from analyses (details in Additional file 1: Table S1). The concentrations of the 78 proteomic markers included in analyses were standardized, where the standard score of each marker represented the number of standard deviations above or below the mean.
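The standardization step can be illustrated with a short sketch. It assumes the sample standard deviation (n − 1 denominator) was used, which the text does not specify, and the input values are made up.

```python
import statistics

def standardize(values):
    """Convert values to standard scores: (value - mean) / SD.

    Assumption: the sample SD (n - 1 denominator) is used.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Made-up log2-scale concentrations for a single marker.
log2_levels = [2.0, 4.0, 6.0]
z_scores = standardize(log2_levels)
print(z_scores)
```

A standard score of 1.0 then means the participant's marker level is one standard deviation above the sample mean.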

Sleep duration

Habitual sleep duration was assessed through two open questions asking participants how long they slept on weekdays and weekends, respectively. The two questions were formulated as: (a) “How many hours do you usually sleep per night during a typical week (Monday-Friday)?” and (b) “How many hours do you usually sleep per night during a typical weekend (Saturday-Sunday)?” A weighted average sleep duration [((weekday × 5) + (weekend × 2))/7] was calculated for all participants, thereby allowing the subsequent construction of a categorical variable representing sleep duration quintiles (Quintile 1 [Q1]: 4.00–6.57 h; Q2: 6.64–7.14 h; Q3: 7.21–7.57 h; Q4: 7.64–8.00 h; Q5: 8.14–11.00 h). Q3 was chosen as the reference category for two main reasons: first, the sleep duration range in this group was most similar to that of other studies, which often use 7–8 h as the reference, thus facilitating comparisons; and second, selection of the mid-quintile allowed for the investigation of the J-shaped or U-shaped associations that are often reported between sleep duration and the respective outcomes (e.g., [2, 15]).
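The weighted average and the quintile assignment can be sketched as below. The formula and the quintile ranges are taken from the text; the function names, the rounding to two decimals, and the choice to return None for values outside the reported ranges are assumptions made for illustration.

```python
def weighted_sleep_hours(weekday_hours, weekend_hours):
    """Weighted average sleep duration: ((weekday x 5) + (weekend x 2)) / 7."""
    return (weekday_hours * 5 + weekend_hours * 2) / 7

# Observed quintile ranges in hours, as reported in the text.
QUINTILE_RANGES = [
    ("Q1", 4.00, 6.57),
    ("Q2", 6.64, 7.14),
    ("Q3", 7.21, 7.57),  # reference category
    ("Q4", 7.64, 8.00),
    ("Q5", 8.14, 11.00),
]

def sleep_quintile(hours):
    """Return the quintile label, or None if the value is outside the reported ranges."""
    rounded = round(hours, 2)  # assumption: ranges were reported to two decimals
    for label, low, high in QUINTILE_RANGES:
        if low <= rounded <= high:
            return label
    return None

average = weighted_sleep_hours(7, 8)
print(round(average, 2), sleep_quintile(average))
```

For example, 7 h on weekdays and 8 h on weekends gives 51/7 ≈ 7.29 h, which falls in the reference quintile Q3.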

Incident diabetes mellitus

Incident DM was defined as new onset DM in individuals without prevalent DM at the MDC baseline examination. All incident DM events were identified through linkage of a 10-digit national personal identification number with 6 local and national registers: the Malmö HbA1c register, the Regional Diabetes 2000 register of the Scania region [16], the Swedish National Diabetes Register [17], the Swedish National Inpatient Register [18], the Swedish Cause of Death Register [19], and the Swedish Prescribed Drug Register [20]. Participants with incident DM could also be identified by having a fasting plasma glucose concentration ≥ 7 mmol/l or a 120-min plasma glucose value of > 11.0 mmol/l in subpopulations of the MDC participating in an MDC re-examination [21] or the Malmö Preventive Project re-examination [22]. The detailed process of DM end-point retrieval has been described elsewhere [23].

Incident coronary heart disease

Incident CHD event was defined as a first fatal or non-fatal myocardial infarction, coronary artery bypass graft (CABG), or percutaneous coronary intervention (PCI). All incident CHD events were identified through linkage of a ten-digit national personal identification number with three registries validated for classification of outcomes as described elsewhere [24, 25]: the Swedish National Discharge Registry, the Swedish National Cause of Death Registry, and the Swedish Coronary Angiography and Angioplasty Registry. CABG and PCI were classified using the national classification of surgical procedures operation codes (KKÅ or Op6): 3065, 3066, 3068, 3080, 3092, 3105, 3127, 3158 for CABG, and FNG02 and FNG05 for PCI. Coronary event was defined according to the International Classification of Diseases, ninth (ICD-9) and tenth (ICD-10) revisions with fatal or non-fatal myocardial infarction (MI) or death due to CHD corresponding to codes 410, 412, and 414 (ICD-9), and I21-I23, and I25 (ICD-10).

Covariates

Covariates in the statistical models included age at baseline (continuous), sex, cystatin C (continuous in mg/l), education (elementary school or higher than elementary school), and physical activity, defined as quartiles of leisure time physical activity based on 18 items adapted from the Minnesota Leisure Time Physical Activity instrument, which has been described in detail elsewhere [26]. Smoking was defined as never, past, and current smoker (< 20 cig./day, or ≥ 20 cig./day); alcohol consumption was considered as quartiles with non-consumption as the reference value (none, 0.02–1.70, 1.70–6.87, 6.88–14.36, or > 14.36 g ethanol/day), and shift work was a binary variable (yes/no). Symptoms of insomnia were based on four items: difficulty initiating sleep, difficulty maintaining sleep, early morning awakening, and not feeling rested after sleep; the scores of the included items (0–3 points each) were summed to an overall discrete insomnia score (0–12 points). Body mass index (BMI) was considered in categories (< 18.5 kg/m², 18.5–24.9 kg/m², 25.0–29.9 kg/m², or ≥ 30.0 kg/m²). Waist circumference (in cm), low-density lipoprotein cholesterol (LDL-C, in mmol/l), high-density lipoprotein cholesterol (HDL-C, in mmol/l), triglycerides (TG, in mmol/l), and hemoglobin A1c (HbA1c, in %) were all treated as continuous variables. HbA1c was used as a stratifying variable given that its inclusion as a covariate violated the assumptions of proportional hazards in semi-parametric survival analyses with incident DM as the outcome.
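Two of the derived covariates, the insomnia score and the BMI category, can be illustrated with a brief sketch. The item range and BMI cut-offs come from the text; the function names and category labels are hypothetical.

```python
def insomnia_score(item_scores):
    """Sum four insomnia items, each scored 0-3, into an overall 0-12 score."""
    if len(item_scores) != 4:
        raise ValueError("expected exactly four item scores")
    if any(not 0 <= s <= 3 for s in item_scores):
        raise ValueError("each item must be scored 0-3")
    return sum(item_scores)

def bmi_category(bmi):
    """Map BMI in kg/m^2 to the four categories listed in the text."""
    if bmi < 18.5:
        return "<18.5"
    if bmi < 25.0:
        return "18.5-24.9"
    if bmi < 30.0:
        return "25.0-29.9"
    return ">=30.0"

print(insomnia_score([0, 1, 2, 3]), bmi_category(27.3))
```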

Statistical analyses

Several of the proteomic markers in the Olink panels are expected to be highly correlated. Multicollinearity of predictors necessitates a model selection procedure retaining only markers that are relevant for the outcome of interest and have the largest effect. Lasso (least absolute shrinkage and selection operator) is an analysis method that produces sparse models by shrinking the coefficients of some predictors while setting the coefficients of other predictors to “0” [27]. This allows for improved prediction accuracy and improved interpretation [28]. However, lasso does not provide accurate standard errors for its estimates [27], which in turn precludes significance testing [29]. Cross-fit partialing-out lasso, also known as double machine learning [30], is a method that produces both coefficients and standard errors of predictors while controlling for relevant covariates. Cross-fit partialing-out lasso splits the sample into multiple folds, selecting covariates and estimating post-selection coefficients [31]. The final results are obtained by averaging the estimates acquired across the folds [30].

Ten-fold cross-fit partialing-out lasso logistic regression controlling for age and sex was used to estimate beta coefficients, standard errors, and p-values for each of the 78 proteomic markers’ associations with specific quintiles of sleep duration. Each ten-fold cross-fit partialing-out lasso logistic regression was performed against a binary outcome variable where sleep duration Q3 was considered the reference value and where, in four separate analyses, the remaining four sleep duration quintiles were considered the respective outcomes of interest, i.e., cross-fit lasso 1: Q1 vs. Q3; cross-fit lasso 2: Q2 vs. Q3; cross-fit lasso 3: Q4 vs. Q3; and cross-fit lasso 4: Q5 vs. Q3. Proteomic markers that were significantly associated with Q1, Q2, Q4, and Q5, respectively, were retained (p for retention: p < 0.05). Using their beta coefficients as weights, the retained proteomic markers were then used to create four separate proteomic scores, one for each sleep quintile compared to Q3.
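The construction of a proteomic score from retained markers can be sketched as follows. The text states only that beta coefficients were used as weights; the plain weighted sum of standardized marker levels shown here is an assumed form, and the marker names, coefficients, and p-values are made up.

```python
def retained_weights(lasso_results, alpha=0.05):
    """Keep markers with p < alpha and return their beta coefficients as weights."""
    return {marker: beta for marker, (beta, p) in lasso_results.items() if p < alpha}

def proteomic_score(z_levels, weights):
    """Assumed form: sum of beta-weighted standardized marker levels."""
    return sum(beta * z_levels[marker] for marker, beta in weights.items())

# Hypothetical (beta, p-value) pairs for one comparison, e.g. Q1 vs. Q3.
lasso_results = {
    "marker_a": (0.5, 0.01),
    "marker_b": (-0.25, 0.03),
    "marker_c": (0.1, 0.40),  # not retained: p >= 0.05
}
# Hypothetical standardized marker levels for one participant.
z_levels = {"marker_a": 1.0, "marker_b": -2.0, "marker_c": -1.0}

weights = retained_weights(lasso_results)
score = proteomic_score(z_levels, weights)
print(sorted(weights), score)
```

Repeating this for each of the four comparisons would yield the four scores described above.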

The four proteomic scores were included as independent variables in linear regression analyses with continuous sleep duration as the dependent variable. All possible combinations (n = 15) of the 4 proteomic scores were considered and each model was compared to the remaining models using the Akaike Information Criterion (AIC). The proteomic scores from the model with the lowest AIC were retained for the semi-parametric survival analyses.
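The count of 15 models follows from taking every non-empty subset of the four scores (2⁴ − 1 = 15). The sketch below enumerates those subsets and selects the one with the lowest AIC; the score names and AIC values are made up, and fitting the linear regressions themselves is left out.

```python
from itertools import combinations

# Hypothetical names for the four proteomic scores.
SCORES = ["score_Q1_vs_Q3", "score_Q2_vs_Q3", "score_Q4_vs_Q3", "score_Q5_vs_Q3"]

def candidate_models(scores):
    """Return every non-empty combination of scores (2**n - 1 models)."""
    return [
        combo
        for size in range(1, len(scores) + 1)
        for combo in combinations(scores, size)
    ]

def lowest_aic(aic_by_model):
    """Return the model (tuple of score names) with the smallest AIC."""
    return min(aic_by_model, key=aic_by_model.get)

models = candidate_models(SCORES)

# Made-up AIC values for three of the 15 models, for illustration only.
example_aic = {models[0]: 1012.4, models[4]: 1003.9, models[14]: 1005.2}
best = lowest_aic(example_aic)
print(len(models), best)
```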

Cox proportional hazards regression was used to determine hazard ratios (HR) and 95% confidence intervals (CI) for the associations of sleep duration and proteomic risk scores with incident DM and incident CHD, respectively. Primary analyses investigated the association of sleep duration with the respective outcomes. Secondary analyses considered identical models to the primary analyses with proteomic risk scores as additional main exposures, i.e., the models were adjusted for both sleep duration and proteomic score. All statistical models were stratified by HbA1c using the Stata option “strata()” to account for different baseline hazards across concentrations of HbA1c. The minimally adjusted model (model 1) was adjusted for age and sex; model 2 was additionally adjusted for cystatin C, education, physical activity, smoking, alcohol consumption, shift work, and symptoms of insomnia; and model 3 was additionally adjusted for BMI, waist circumference, LDL-C, HDL-C, and TG. No sex-specific stratification was done as there were no significant interactions between sex and sleep duration.

The HRs of the proteomic sleep duration scores are expressed per 1-SD increase in the score.
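As a reminder of how a per-SD hazard ratio relates to a Cox coefficient, the sketch below converts a log-hazard coefficient and its standard error into an HR with a Wald-type 95% CI. The inputs are made up, and the rescaling by the score's SD applies only if the coefficient was estimated on the unstandardized score, which is an assumption.

```python
import math

def hazard_ratio_per_sd(beta_per_unit, sd=1.0):
    """HR for a one-SD increase: exp(beta * SD). Use sd=1.0 if the score is already standardized."""
    return math.exp(beta_per_unit * sd)

def hazard_ratio_ci(beta, se, z=1.96):
    """Wald-type 95% CI for the HR: exp(beta +/- z * SE)."""
    return math.exp(beta - z * se), math.exp(beta + z * se)

# Made-up coefficient and standard error, for illustration only.
beta, se = math.log(2), 0.1
hr = hazard_ratio_per_sd(beta)
lower, upper = hazard_ratio_ci(beta, se)
print(round(hr, 2), round(lower, 2), round(upper, 2))
```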

Global tests for proportionality were conducted for the final multivariable models and did not reveal any significant deviation from the proportional hazards assumption.

A proteomic score was considered a probable mediator for the association between sleep duration and incident DM or incident CHD if the following conditions were met: (1) the inclusion of the proteomic score in the final multivariable model abrogated any significant associations between sleep duration and the outcome and (2) the proteomic score was significantly associated with the outcome. Mediating effects (survival functions for the follow-up period and the proportion mediated including 95% CI) were determined using the Stata “standsurv” post-estimation command on a fitted flexible parametric survival model that included only the main exposure (the relevant sleep duration quintile) and the probable mediator (protein score). The fitted models used three degrees of freedom for the baseline hazard and considered the sleep duration quintile as a time-varying effect with three degrees of freedom.
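The two screening conditions can be expressed as a simple check. This is an illustrative reading, not the authors' code: it assumes "abrogated" means the sleep duration association was significant before the score was added and non-significant afterwards, and the p-values are made up.

```python
def is_probable_mediator(p_sleep_without_score, p_sleep_with_score, p_score, alpha=0.05):
    """Apply the two conditions for a probable mediator.

    (1) Adding the score removes a previously significant sleep duration association.
    (2) The score itself is significantly associated with the outcome.
    """
    sleep_was_significant = p_sleep_without_score < alpha
    association_abrogated = sleep_was_significant and p_sleep_with_score >= alpha
    score_is_significant = p_score < alpha
    return association_abrogated and score_is_significant

# Made-up p-values, for illustration only.
print(is_probable_mediator(0.01, 0.20, 0.001))
print(is_probable_mediator(0.01, 0.02, 0.001))
```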

All statistical analyses were performed using Stata/MP 17.0 (StataCorp LP, College Station, TX). The significance level was set as p < 0.05.

Data sharing statement

Due to ethical and legal restrictions related to the Swedish Biobanks in Medical Care Act (2002:297) and the Personal Data Act (1998:204), data are available upon request from the data access group of Malmö Diet and Cancer study by contacting Anders Dahlin ([email protected]).

Role of the funding source

The sponsors had no role in study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the paper for publication.
