Background

Attention Deficit Hyperactivity Disorder (ADHD) is a prevalent neurodevelopmental condition affecting approximately 5% of children globally [1]. It is characterised by impulsivity, hyperactivity and inattention [2]. Children with ADHD often experience co-occurring mental health difficulties [3], have poorer educational outcomes than their non-affected peers [4] and are more likely to have sleep difficulties, including insomnia, restless legs syndrome and sleep-disordered breathing [5,6,7]. An estimated 50–85% of children diagnosed with ADHD may experience some kind of sleep difficulty [8, 9], which can persist across the lifespan [10]. These sleep difficulties are not only more prevalent but also more complex in ADHD, encompassing a broad spectrum of disorders and frequently reflecting circadian features such as delayed sleep phase [11,12,13]. Additionally, recent systematic reviews confirm that these problems are common, even though objective sleep measures often show only modest differences between children with ADHD and their neurotypical peers [14]. This complexity highlights the importance of using both subjective and objective assessments when evaluating sleep in individuals with ADHD. Furthermore, ADHD has a significant impact on individuals, families and societies, with marked effects on quality of life [15] and substantial economic costs [16, 17]. Current therapeutic interventions for ADHD often focus on pharmaceutical options, such as methylphenidate [18], or behavioural approaches such as parent-training programmes [19]. While these treatments can be effective at managing core symptoms, they do not help all children and are not without limitations. Issues such as side effects, adherence challenges, and a lack of long-term effectiveness underscore the need for additional or alternative strategies.
Dietary interventions have emerged as a promising avenue, with some studies reporting that exclusion diets [20, 21] and essential fatty acids supplementation [21,22,23] may reduce ADHD symptoms. However, findings are inconsistent [24] and further research is needed to clarify the mechanisms underlying these effects.

One of the most effective dietary interventions in children with ADHD is the Few Foods Diet, which excludes foods most likely to provoke sensitivities, requiring the child to eat just a small number of whole-food items (such as lamb, chicken, potatoes, rice, banana, apple and brassica) [25]. However, the restrictive nature of this diet limits its practical application outside clinical settings. We have proposed elsewhere [26] that one mechanism through which dietary interventions may succeed is their ability to alter gut microbiota composition. Emerging research suggests that the gut microbiome composition may be atypical in individuals with ADHD [27,28,29] and it has been reported that children with ADHD may have increased incidence of gastrointestinal symptoms [30]. Thus, we hypothesise that dietary interventions specifically designed to target the gut microbiota may help induce improvements in ADHD characteristics [31]. Burgeoning interest in the role of probiotics and prebiotics for ADHD has led to some interesting research findings. A recent systematic review of seven randomised controlled trials (RCTs) and two cohort studies revealed an association between probiotics, prebiotics and symptom severity in ADHD [31]. However, findings were inconsistent, and outcomes may be dependent on multiple factors including dosage, duration, age of participants and strain/combination of bacteria. Bacterial strains including Lactobacillus rhamnosus, Bifidobacterium bifidum, and synbiotics may warrant further investigation in this population.

One way of introducing potentially beneficial bacteria into the gut is through the consumption of fermented food and drinks. Kefir is a fermented drink containing microorganisms and is created through the ‘symbiotic fermentation of milk by lactic acid bacteria and yeasts contained within an exopolysaccharide and protein complex called a kefir grain’ [32] (p. 1). The fermented product contains a variety of bacteria (often including representatives of the genera Lactobacillus, Lactococcus, Streptococcus, and Leuconostoc) and yeast species (commonly including Saccharomyces and Kluyveromyces) [32, 33]. It has been demonstrated that the consumption of kefir can positively alter the composition of the gut microbiome, in both animal models [34, 35] and human studies [36, 37]. Preliminary evidence also suggests that kefir consumption may influence behaviour, reducing fatigue [38] and altering reward-seeking and repetitive behaviour [39] in animal models. We conducted a small feasibility study on a diet designed to impact the composition of gut microbiota in children with ADHD [26]. As part of this diet, children consumed kefir daily, with a high average compliance of 97.6% across six weeks. We present our full rationale for the current study in the protocol paper [40].

Building on these findings, this study aimed to evaluate the impact of daily kefir consumption on ADHD symptom severity, sleep, and the microbiome in children with ADHD. The primary aim of this RCT was to assess the effects of kefir on ADHD symptoms (including inattention and hyperactivity/impulsivity) as assessed by the Strengths and Weaknesses of ADHD Symptoms and Normal Behaviour (SWAN) scale [41]. Secondary aims were to assess the effects of kefir on gut symptomatology, sleep, inattention and impulsivity, and gut microbiota composition. We hypothesised that supplementation with kefir would be superior to placebo at inducing improvements in 1. ADHD symptoms; 2. gut symptomatology; 3. sleep; 4. cognitive test measures of inattention and impulsivity; and 5. gut microbiota composition. To our knowledge, this is the first RCT exploring the impact of kefir in ADHD.

Methods

Study design

This is a six-week, parallel-group, double-blind, randomised controlled trial of the supplementation of daily kefir versus a placebo dairy drink in children with ADHD. This trial was registered with ClinicalTrials.gov (Identifier: NCT05155696). It was developed and reported in accordance with the Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) guidelines [42]. Primary and secondary outcomes were measured through assessments conducted prior to commencement of the intervention (at baseline) and at completion (week six). Ethical approval for the study was granted by St Mary’s University Ethics Committee (SMU_ETHICS_2020-21_240) and parents and children were provided with full written information about the study before written consent and assent were obtained. Detailed descriptions of the methods have been published in our protocol paper [40] and are summarised below.

Study location

The study was based in the UK with participants taking part within their own home.

Randomisation

To minimise imbalance in a relatively small sample, baseline adaptive randomisation was performed by an independent statistician. Participants were sequentially assigned to treatment groups based on their covariate values (age, sex, and medication status), while accounting for all previously randomised participants. This approach aimed to balance covariates across groups more effectively than simple randomisation. The algorithm described in “A SAS® Program to Perform Adaptive Randomization” [43] was implemented in Excel.
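As an illustration only, covariate-adaptive allocation of this kind can be sketched as a Pocock–Simon style minimisation. This is not the algorithm from [43] and not the Excel implementation used in the trial: the imbalance measure (range of counts per covariate level), the random tie-break, and all field names below are assumptions made for the example.

```python
import random

# Hypothetical covariate names and arm labels, chosen for this sketch.
COVARIATES = ("age_group", "sex", "medicated")
ARMS = ("kefir", "placebo")


def imbalance_if_assigned(new, arm, allocated):
    """Sum, over covariates, of the between-arm count range if `new` joined `arm`.

    Only previously allocated participants who share the new participant's
    level of a covariate contribute to that covariate's counts.
    """
    total = 0
    for cov in COVARIATES:
        counts = {a: 0 for a in ARMS}
        for person, person_arm in allocated:
            if person[cov] == new[cov]:
                counts[person_arm] += 1
        counts[arm] += 1  # hypothetically place the new participant
        total += max(counts.values()) - min(counts.values())
    return total


def assign(new, allocated, rng):
    """Allocate `new` to the arm giving the smallest imbalance (ties at random)."""
    scores = {arm: imbalance_if_assigned(new, arm, allocated) for arm in ARMS}
    best = min(scores.values())
    arm = rng.choice([a for a in ARMS if scores[a] == best])
    allocated.append((new, arm))
    return arm
```

Each new participant is compared against everyone already randomised, which is what distinguishes this family of methods from simple or pre-generated block randomisation.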

Participants

Inclusion and exclusion criteria

Eligible participants were aged between 8 and 13 years at the onset of the study and had received a diagnosis of ADHD according to the Diagnostic and Statistical Manual of Mental Disorders, Fourth or Fifth edition (DSM-IV or DSM-5) criteria from a specialist qualified healthcare professional. Participants were excluded if they: (1) were currently undergoing a course of behavioural therapy; (2) had a milk allergy or lactose intolerance; (3) reported use of antibiotics, probiotics, antifungals or steroids in the four weeks before starting the trial; (4) had a diagnosis of a gastrointestinal disorder, e.g. Inflammatory Bowel Disease or Coeliac disease; or (5) had a diagnosis of an auto-immune disease or compromised immunity. Originally, we planned to exclude those currently taking ADHD medication. To aid recruitment, we later decided to include children on stable medication (ClinicalTrials.gov version 2, 30 Nov 2022), as recommended in recent research [44].

Sample recruitment

Participants were recruited through convenience sampling in the UK community. Study information was disseminated via email, flyers, and social media, with support from ADHD support groups, local community organisations, and schools. Parents were informed that the study was a randomised controlled trial investigating the effects of a probiotic drink in children with ADHD. Interested parents contacted the trial manager, who provided further information via email or telephone. Parents and children were given the opportunity to ask questions before completing a screening questionnaire (Jisc Online Surveys, Bristol, UK) to determine eligibility. Enrolment occurred on a rolling basis until the target sample size was reached. Participants were randomised by a statistician to treatment or placebo groups using block randomisation stratified by age, sex, and medication status. An independent researcher at St Mary’s University managed group allocation and notified the drinks distribution company. Researchers, participants, and their families remained blinded to group assignment.

Sample size calculation

A sample size of 27 per group was estimated using G*Power software (version 3.1) in consultation with a statistician. The calculation was based on a repeated-measures ANOVA (within-between interaction) with an alpha level of 0.05, a power (1 − β) of 0.95 and an effect size of 0.25 between measures. The power of 0.95 was selected in case the drop-out rate was higher than expected or the effect size was lower than expected. We aimed to recruit 35 participants per group, to allow for a 20% drop-out rate.
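The drop-out allowance can be checked with a one-line calculation. The sketch below assumes the usual inflation rule (required n divided by the expected retention rate, rounded up); the function name is illustrative. Under that assumption, 27 completers per group at 20% drop-out requires 34 recruits, so the target of 35 per group is slightly conservative.

```python
import math


def inflate_for_dropout(n_required, dropout_rate):
    """Participants to recruit per group so that n_required remain after drop-out."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    return math.ceil(n_required / (1 - dropout_rate))


print(inflate_for_dropout(27, 0.20))  # 34
```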

Patient and public involvement

Patients and families were first involved in the research at the conception, development, execution and evaluation of the pilot study, preceding this trial, which assessed a broader microbiome targeted dietary intervention in ADHD [26]. In particular, families informed the decision to simplify the dietary intervention to using kefir alone and to use online questionnaire and experimental outcome measures. Talks and webinars were delivered by the research team to ADHD support groups throughout the recruitment process. These will continue after study publication to disseminate findings.

Intervention

Participants were instructed to consume 125 ml of the study drink daily (kefir or placebo) for six weeks. Both kefir and control drinks were supplied in plain packaging and delivered chilled by the same courier. Participants were asked to maintain their usual diet and daily routines throughout the intervention. Drinks were delivered at the start and mid-point of the study to accommodate fridge storage. The drink could be consumed on its own, with food, or blended into a smoothie, with recipe suggestions provided.

Kefir group

Participants in the kefir group consumed organic cow’s milk kefir, supplied by Nourish Kefir. Each 125 ml serving was estimated to contain approximately 50 billion live microorganisms. Microbial composition varies due to fermentation but typically includes representatives of the genera Leuconostoc, Lactococcus, Lactobacillus, Bifidobacterium, and the species Saccharomyces cerevisiae, as well as the exopolysaccharide kefiran.

Placebo group

Participants in the placebo group consumed widely available UHT cow’s milk, which contains no live microorganisms and no additional ingredients likely to produce positive or negative effects.

Adverse event recording and management

Adverse events were recorded according to the Common Terminology Criteria for Adverse Events [45] and European Commission guidelines [46] throughout the duration of the study. Parents were asked to report any adverse events immediately to the research team, who were blinded to the allocated trial group.

Adherence

Adherence was monitored by parental report: parents recorded daily drink consumption on a chart, which was returned at the end of the study.

Outcome measures

Primary outcome measure

The primary outcome was ADHD symptom severity, measured using the 18-item Strengths and Weaknesses of ADHD Symptoms and Normal Behaviour (SWAN) scale [47]. The SWAN scale, developed by Swanson et al., has been widely used in research examining ADHD symptoms in children and adolescents and is validated for children aged 6–18 [47]. The SWAN was completed online by parents and teachers, who rated the child relative to same-age peers on a scale (−3 to +3; total score range −54 to 54), with higher scores indicating more severe symptoms. Nine items are averaged to compute an inattention subscale, and nine items are averaged to compute a hyperactivity/impulsivity subscale. The scale has been reported to have good internal consistency; acceptable longitudinal stability [48]; and good discriminant validity [49]. In the current study, the SWAN parent scale demonstrated an excellent level of internal consistency both pre-intervention (α = 0.92) and post-intervention (α = 0.95). The teacher scale also demonstrated an excellent level of internal consistency pre-intervention (α = 0.90) and post-intervention (α = 0.96).
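The scoring rules described above can be sketched as follows. The item order (items 1–9 as inattention, items 10–18 as hyperactivity/impulsivity) is an assumption for this example, as are the function and key names; the rating direction follows the description in this section, with higher scores indicating more severe symptoms.

```python
def score_swan(items):
    """Score one completed SWAN form.

    items: 18 ratings, each between -3 and +3. The split into two blocks of
    nine items is an assumed item order for this sketch.
    """
    if len(items) != 18:
        raise ValueError("SWAN requires 18 item ratings")
    if any(not -3 <= x <= 3 for x in items):
        raise ValueError("ratings must lie between -3 and +3")
    return {
        "inattention": sum(items[:9]) / 9,                # subscale mean
        "hyperactivity_impulsivity": sum(items[9:]) / 9,  # subscale mean
        "total": sum(items),                              # range -54 to 54
    }
```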

Secondary outcome measures

Measure of impulsivity and inattention

Impulsivity and inattention were assessed using a computerised Go/NoGo task [50], completed online at home via the GDPR-compliant platform Gorilla.sc. Participants viewed letters presented in a 2 × 2 grid and responded to target letters (P or R), shown for 500 ms with a 1500 ms inter-stimulus interval across 320 trials. In the first block (160 trials), participants responded to P; in the second, to R. The target-to-non-target ratio was 80:20 in both blocks. Outcome measures included: (1) Go errors (omissions; higher scores indicate greater inattention), (2) NoGo errors (commissions; higher scores indicate greater impulsivity), (3) Go reaction time (higher scores indicate lower impulsivity), and (4) reaction time variability (higher scores indicate greater inattention). This measure is sensitive to changes in ADHD symptoms in response to medication and exercise [51, 52].
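A minimal sketch of how the four task outcomes could be derived from trial-level data is given below. The trial record layout (`is_go`, `responded`, `rt_ms`) is hypothetical and is not the Gorilla.sc export format, and reaction time variability is computed here as the sample variance of correct Go reaction times, which is an assumption about the exact statistic used.

```python
from statistics import mean, variance


def summarise_go_nogo(trials):
    """Summarise a list of trial dicts with keys is_go, responded, rt_ms."""
    go = [t for t in trials if t["is_go"]]
    nogo = [t for t in trials if not t["is_go"]]
    # Reaction times are only defined for Go trials that received a response.
    hit_rts = [t["rt_ms"] for t in go if t["responded"]]
    return {
        "go_errors": sum(not t["responded"] for t in go),    # omissions
        "nogo_errors": sum(t["responded"] for t in nogo),    # commissions
        "go_rt_mean": mean(hit_rts) if hit_rts else None,
        "go_rt_variance": variance(hit_rts) if len(hit_rts) > 1 else None,
    }
```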

Actigraphy recordings

Actigraphy was used to objectively assess sleep quality and daytime activity. Children wore a Motionlogger Micro Watch (Ambulatory Monitoring, Inc.) on their non-dominant wrist for seven consecutive days, removing it only for swimming or contact sports. Off-wrist periods were excluded from analysis. Participants pressed an event-marker button when attempting to sleep. The device recorded movement, temperature, light, and activity during sleep and wake periods.

Data were downloaded and analysed using Ambulatory Monitoring software with the Sadeh algorithm in Action-W (version 2). Movement was recorded in 60-second epochs using zero-crossing mode. Sleep diaries, temperature, and light data were used to identify and remove artefacts. Downtime onset and offset were defined using zero-crossing activity thresholds (> 200). Wake blocks were defined as ≥ 5 minutes awake and ended after ≥ 15 minutes of continuous sleep.

Sleep outcomes included: (1) mean activity during sleep (higher scores = more disrupted sleep), (2) minutes awake during the down period (higher score reflects poorer quality sleep), (3) sleep latency (minutes to fall asleep, higher score reflects more time taken to fall asleep), (4) sleep efficiency (percentage of down period spent asleep, excluding latency, higher score reflects better quality sleep), (5) wake after sleep onset (minutes spent awake during the down period after removing sleep latency, higher score reflects poorer quality sleep), and (6) sleep fragmentation (number of awakenings per total minutes of sleep × 100, higher score reflects more fragmented sleep). Mean and median daytime activity were calculated using proportional integration mode sampling.
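The definitions above can be made concrete with a small sketch that works on epoch-by-epoch sleep/wake scores for one down period. It is not the Action-W implementation: sleep latency is taken as the time to the first epoch scored as sleep, and every sleep-to-wake transition is counted as an awakening, both of which are simplifying assumptions.

```python
def sleep_summary(epochs):
    """Summarise one down period.

    epochs: list of booleans, one per 60-second epoch; True = scored asleep.
    """
    if True not in epochs:
        raise ValueError("no sleep epochs in the down period")
    latency = epochs.index(True)        # minutes until first sleep epoch
    after_onset = epochs[latency:]      # down period with latency removed
    sleep_min = sum(after_onset)
    awakenings = sum(
        1 for prev, cur in zip(after_onset, after_onset[1:]) if prev and not cur
    )
    return {
        "minutes_awake": len(epochs) - sleep_min,
        "sleep_latency": latency,
        "waso": len(after_onset) - sleep_min,
        "sleep_efficiency": 100 * sleep_min / len(after_onset),
        "fragmentation": 100 * awakenings / sleep_min,
    }
```

For a ten-minute toy record with two minutes of latency, seven minutes asleep and one one-minute awakening, this gives a sleep efficiency of 87.5% and three minutes awake in total.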

Sleep diary

Subjective sleep was assessed using the Consensus Sleep Diary [53], completed by children with parental assistance for seven days at baseline and during the final week of the study. The diary was used to help identify and remove artefacts from the actigraphy data.

Sleep problems

Sleep habits were assessed using the Children’s Sleep Habits Questionnaire (CSHQ) [54] and the Sleep Self-Report (SSR) [54], both rated on a three-point scale reflecting sleep patterns in the previous week. Parents completed the 33-item CSHQ (score range 33–99), and participants completed the 26-item SSR (score range 26–78), with higher scores on both indicating more sleep problems. Both questionnaires have demonstrated validity [54, 55] and reasonable internal consistency [56] in children with ADHD. In the current study, the CSHQ demonstrated a good level of internal consistency both pre-intervention (α = 0.81) and post-intervention (α = 0.78). The SSR also demonstrated a good level of internal consistency pre-intervention (α = 0.73) and post-intervention (α = 0.71).

Gastrointestinal symptoms

Gastrointestinal symptoms were assessed using the parent-completed Gastrointestinal Severity Index (GSI), a validated tool for children aged 2–18 years [57]. The GSI rated six symptoms (constipation, diarrhoea, stool consistency, stool smell, flatulence, and abdominal pain) on a 3-point scale (0–2), yielding a total score from 0 to 12, with higher scores indicating greater symptom severity. It has been found to be sensitive to detecting gastrointestinal symptoms in ADHD [30]. In the current study, the GSI demonstrated poor internal consistency pre-intervention (α = 0.36) and post-intervention (α = 0.24), indicating that the scale may not be unidimensional, with the GSI measuring heterogeneous GI symptoms (constipation, diarrhoea, pain, etc.) that do not necessarily co-occur. The results should therefore be interpreted with caution. However, it is widely used within paediatric research [57, 58] and we report total GSI for comparability with prior studies.
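Assuming the reported α values are Cronbach's alpha, the coefficient can be sketched as below; the function name and data layout are illustrative. The formula is α = k/(k − 1) × (1 − Σ item variances / variance of total scores), so items that do not co-vary, as may be the case for heterogeneous GI symptoms, yield a low or even negative α.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least two items and equal-length rows")
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

When every item moves together across respondents the coefficient reaches 1; when items vary independently or in opposite directions the total-score variance shrinks relative to the item variances and α falls.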

Stool microbiome collection and analysis

Faecal samples were collected in Shield faecal collection tubes (Zymo Research), delivered to participants at baseline and week six, together with gloves, cardboard bowl, ziplock bag and instructions. Parents collected a sample of the child’s faeces in the bowl, transferred a small amount to the tube, and returned the samples in the pre-paid packaging provided.

DNA was extracted from faecal samples using the Qiagen QIAamp Fast DNA Stool Mini Kit (https://www.qiagen.com). The resultant DNA was quantified using the Qubit double-stranded DNA (dsDNA) high-sensitivity assay kit (Bio-Sciences, Dublin, Ireland). Samples were prepared for shotgun metagenomic sequencing according to Illumina DNA Prep library preparation kit guidelines, with the use of unique dual indexes for multiplexing with the Integrated DNA Technologies (IDT) for Illumina index kit (https://eu.idtdna.com/). Final clean libraries were quantified by Qubit as before and pooled at equimolar concentrations. Final sequencing pool quality check and quantification were performed by quantitative polymerase chain reaction (qPCR) using the KAPA Library Quantification Kit for Illumina (Roche KAPA). High-throughput sequencing was performed on a NextSeq 2000 platform using a P1 mid-output flow cell. The resultant data were quality checked and filtered using KneadData. The associated taxonomic profile was determined by Kraken2 and Bracken [59] and its functional potential determined using HUMAnN 3 [60].

Procedure

Outcome measures were assessed at two time points: baseline (prior to the intervention) and week 6 (final week of the intervention). A study pack—including a stool sample collection kit, Motionlogger Micro Watch actigraph, sleep diary, daily adherence chart, recipe suggestions, and instructions—was delivered to participants’ homes ahead of each assessment week. Families were provided with a pre-paid, signed-for Royal Mail envelope to return the actigraph, stool sample, and adherence chart. Parents completed a brief background questionnaire on variables such as ethnicity, birth delivery type, and antibiotic use. To support retention, families were contacted throughout the trial to check progress, answer questions, prompt outcome measure completion, and encourage compliance. All questionnaire data were collected via the GDPR-compliant JISC Online Surveys platform (https://www.onlinesurveys.ac.uk).

Statistical analysis

An independent statistician, blinded to group allocation, performed the analysis that included data from all randomised participants with valid baseline and endpoint data. Participants with missing data for a given variable were excluded from that specific analysis. For questionnaire data, if less than 20% of items were missing, multiple imputation at item score level was applied using predictive mean matching, and the mean of five imputations was used to replace missing values before computing total scale scores [61]. For all outcome variables the residuals after fitting the best fit model were checked for normality using Shapiro-Wilk Test and outliers greater than 1.5 standard deviations were identified using boxplots. Variables that significantly deviated from normality (p < 0.05) underwent log transformation prior to analysis to approximate normality. The effects of time (pre/post-intervention) and group (kefir/placebo) were analysed using a Mixed Model Analysis of Covariance (ANCOVA). The subject was the random effect in the model, and the treatment was the fixed effect. Age group, gender, medication status, and pre-treatment value of the outcome variable were used as covariates. All main effects and estimable interactions were assessed in the initial model and all covariates were included.

Non-significant interactions were removed first, starting with the highest-order interactions, and the resulting model was compared with the previous model using the McQuarrie and Tsai corrected Akaike information criterion (AICc) [62]. The AICc gives an indication of the amount of remaining unexplained variance after the model has been fitted, in which a smaller AICc value indicates a better model. If an improvement in model fit was found, another non-significant interaction or covariate was removed and the AICc was again used to evaluate the model fit. Models were chosen on the basis of “best fit”, and interaction terms or covariates that improved the fit were retained [63].
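To illustrate the comparison step, the sketch below uses the common small-sample form AICc = −2 ln L + 2k + 2k(k + 1)/(n − k − 1), where k is the number of estimated parameters and n the number of observations. Whether this matches exactly the variant in [62] or the one computed by the statistical software is an assumption, and the function names and example values are hypothetical.

```python
def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC; lower values indicate a better model."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 for the correction term")
    return -2 * log_likelihood + 2 * k + (2 * k * (k + 1)) / (n - k - 1)


def prefer_reduced(full, reduced, n):
    """True if the reduced model has the lower AICc.

    full, reduced: (log_likelihood, number_of_parameters) tuples.
    """
    return aicc(reduced[0], reduced[1], n) < aicc(full[0], full[1], n)


# Dropping one term costs little likelihood, so the reduced model is preferred.
print(prefer_reduced(full=(-100.0, 6), reduced=(-100.2, 5), n=53))  # True
```

The penalty grows quickly as k approaches n, which is why the corrected criterion is often preferred over plain AIC in samples of this size.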

Main effects of group were explored using the least squares (LS) means procedure. This employed a t-test to compare the LS mean scores of each group at the average level of the corresponding baseline score. In the event of inhomogeneity of regression slopes, indicated by a significant baseline*group interaction, LS means comparisons were used to compare the effects of each group on LS mean scores at different levels of the baseline score using the t-test. The reported ANCOVA models are the best fit (i.e. lowest AICc) models adjusted for covariates. All statistical analyses were conducted using JMP software, version 17.2 (SAS Institute Inc., Cary, NC, USA). All statistical analyses were planned and pre-registered, as set out in our protocol paper [40] and on ClinicalTrials.gov (NCT05155696). For the microbiome analysis, no significant results were obtained after multiple testing correction, and thus the reported p-values are unadjusted.

Writing

Writing assistance from GPT-3.5 (OpenAI - March 14 version) was used for language editing some portions of the manuscript, specifically for suggestions on grammar and succinctness of our written content. All scientific content, data interpretation, and conclusions were fully authored and validated by the research team. The use of GPT-3.5 did not influence study design, data analysis, or interpretation of results.

Results

Baseline characteristics

A total of 93 participants were assessed for eligibility, with 80 participants recruited into the trial between January 2022 and September 2023. Twenty-seven participants (33.75%) withdrew from the trial: nine from the control group and eighteen from the kefir group, with their data omitted from the analysis. A CONSORT flow chart [64] outlining the study schedule, number of participants and reasons for withdrawal is displayed in Fig. 1. For the primary outcome (SWAN scores), parent-report data were available for 53 participants (kefir: n = 22; placebo: n = 31) and teacher-report data for 37 participants (kefir: n = 14; placebo: n = 23), with questionnaire data missing because of non-completion despite three follow-up attempts or because more than 20% of responses were missing. For the secondary outcomes, Sleep Self-Report data were available for 51 participants, Gastrointestinal Severity Index data for 51 participants, Children’s Sleep Habits Questionnaire data for 36 participants, and Go/NoGo task data for 32 participants. Actigraphy data were available for 35 participants (kefir: n = 13; placebo: n = 22), with reasons for missing data including inability to tolerate the device (n = 4), lost device (n = 2) and device malfunction with failure to record or transfer the data (n = 12). Microbiome data were available for 52 participants (kefir: n = 22; placebo: n = 30).

Fig. 1 CONSORT 2025 flow diagram. Flow diagram of the progress through the phases of a randomised trial of kefir compared to placebo for children with ADHD

The mean age of participants was 10.19 years (SD = 1.70), with 42 (79%) identifying as male. The majority of participants were White (n = 41, 81%). Comorbidities were identified based on parent report of existing psychiatric diagnoses. In total, 8 participants (4 in each group) had a co-occurring diagnosis of autism spectrum disorder, and 3 participants (1 in kefir group; 2 in placebo group) had a co-occurring diagnosis of anxiety. No other psychiatric diagnoses were reported, and the distribution of these comorbidities was comparable across intervention groups. Groups were well-matched for baseline demographics and background characteristics, which are reported in Table 1. The only participant characteristic that differed significantly between groups was time since diagnosis. Participants in the kefir group had, on average, been diagnosed for a longer period (M = 38.72 months, SD = 20.70) compared to those in the control group (M = 18.21 months, SD = 11.84).

Table 1 Baseline demographic and clinical characteristics for each group

Adherence to drinks

Participants were contacted during the first week of the study, at midpoint, and at the end to identify any non-compliance issues. Participants reporting non-compliance were withdrawn from the study, with seven withdrawing due to disliking the drinks (six in the kefir group and one in the placebo group). All other participants reported regular drink consumption. Participants were asked to complete a daily record of drinks consumption. Of those returned (kefir group n = 11, placebo group n = 17), the mean adherence out of 42 days was 39.82 days (SD 2.36) for the kefir group and 38.59 days (SD 2.46) for the placebo group. Overall adherence was very good at 93%.
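The overall figure can be reproduced from the group means, assuming it is the chart-weighted mean of the two groups expressed as a percentage of the 42 intervention days; the function name is illustrative.

```python
def pooled_adherence(groups, days=42):
    """Pool group means weighted by the number of returned charts.

    groups: list of (n_charts, mean_days_adhered) tuples.
    Returns (pooled mean days, pooled percentage of `days`).
    """
    total_n = sum(n for n, _ in groups)
    pooled_mean = sum(n * m for n, m in groups) / total_n
    return pooled_mean, 100 * pooled_mean / days


mean_days, percent = pooled_adherence([(11, 39.82), (17, 38.59)])
print(round(mean_days, 2), round(percent))  # 39.07 93
```

Taken separately, the group means correspond to roughly 95% adherence in the kefir group and 92% in the placebo group among families who returned a chart.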

Outcome measure comparisons

Primary outcome measures – ADHD symptom severity

There were no significant main effects of group and no significant baseline-by-group interactions for SWAN parent ratings of inattention, hyperactivity or total score (all p > 0.43) (see Table 2). Similarly, no significant group differences were found for SWAN teacher ratings (all p > 0.68). Exploring the baseline-by-group interaction for SWAN teacher-rated total scores (Fig. 2) showed that the slopes of the regression lines differed between the two groups. This difference in slopes suggests a potential baseline-by-group interaction, in which the difference between the groups depends on the baseline score. At a baseline SWAN teacher total score of 2.61, the kefir group showed a non-significant trend towards lower scores at week six than the placebo group (M = 2.03, SE = 0.33 vs M = 2.86, SE = 0.34; p = 0.088). At a baseline score of −0.44, the kefir group showed a mean score of −0.05 at week six, compared with −1.14 in the placebo group.

Fig. 2 Scatterplot showing the association between teacher-rated SWAN scores at baseline (T1) and follow-up (T2), by treatment group. Solid lines represent regression lines for the kefir (blue) and placebo (red) groups; shaded areas indicate 95% confidence intervals

Table 2 Baseline and week six descriptive statistics for primary outcome variables by intervention

Secondary outcome measures

There were no significant main effects of group and no significant baseline-by-group interactions for the Gastrointestinal Severity Index, Go/NoGo mean RT, Go errors, NoGo errors or Go/NoGo RT variance (see Table 3). For the log-transformed Go/NoGo RT variance, an interaction between baseline value and post-intervention value was found. The interaction narrowly missed significance at the 5% level (p = 0.052) but was retained in the final model as indicated by AICc. As shown in Fig. 3, there was a positive association between RT variance at baseline (T1) and follow-up (T2) in the placebo group, such that participants with greater variance at T1 tended to show similar variance at T2. In contrast, the kefir group showed no clear relationship between T1 and T2 RT variance. The difference in predicted means between the kefir and placebo groups approached significance at the highest baseline RT variability (p = 0.086).

Fig. 3 Scatterplot showing the association between log-transformed reaction time (RT) variance at baseline (T1) and follow-up (T2) on the Go/NoGo task, by treatment group. Solid lines represent regression lines for the kefir (blue) and placebo (red) groups; shaded areas indicate 95% confidence intervals

Table 3 Baseline and week six descriptive statistics for secondary outcome variables by intervention

Sleep related measures

There were no significant main effects of group and no significant baseline-by-group interactions for the Children’s Sleep Habits Questionnaire, or for actigraphy measures of duration, mean activity during sleep, sleep efficiency, sleep onset latency, sleep fragmentation and daytime activity. For the actigraphy measure of minutes spent awake during the down period, the residuals were not normally distributed. A log transformation was applied, and analysis of the log-transformed data revealed a significant effect of group (t(32) = 2.11, p = 0.04). LS means indicated that following the intervention, participants in the kefir group spent fewer minutes awake during their down period (M = 4.25, SE = 0.09) than those in the placebo group (M = 4.50, SE = 0.07). Back-transformed into original units, these means correspond to 70.10 minutes awake for the kefir group and 89.72 minutes for the placebo group. There was also a significant effect of group on Sleep Self-Report total scores (t(48) = 2.42, p = 0.02). Unexpectedly, comparison of LS means indicated that those in the kefir group self-reported significantly more sleep problems than those in the placebo group following the intervention (M = 39.81, SE = 0.75 vs. M = 37.40, SE = 0.65; p = 0.02).
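The back-transformation can be illustrated as follows, assuming a natural-log transformation (the base is not stated in this section). Applying the exponential to the rounded LS means gives approximately 70.1 and 90.0 minutes; the small gap from the reported 89.72 presumably reflects rounding of the displayed LS means. A back-transformed mean of logged values is a geometric rather than arithmetic mean.

```python
import math


def back_transform(log_mean):
    """Return a log-scale mean expressed in original units (natural log assumed)."""
    return math.exp(log_mean)


for group, ls_mean in (("kefir", 4.25), ("placebo", 4.50)):
    print(group, round(back_transform(ls_mean), 1))
```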

Gut microbiome comparisons

To gain insight into one potential route by which kefir supplementation may affect neurodevelopmental outcomes in ADHD, we examined its impact on gut microbiota composition. Kefir supplementation did not lead to significant changes in overall alpha diversity compared to placebo (Supplementary Fig. 1). Likewise, species-level beta diversity was not significantly different between groups (PERMANOVA: R2 = 2.1%, p = 0.29; Fig. 4A). However, when analysing functional profiles (gut metabolic modules), significant differences were observed between kefir and placebo groups post-intervention (PERMANOVA: R2 = 3.8%, p = 0.03), suggesting that kefir may modulate microbial functional capacity even in the absence of broad taxonomic shifts (Fig. 4B).

Fig. 4

Comparative community and functional profiling between kefir and placebo groups. (A–B) Beta diversity at the species level (A) and functional level based on gut metabolic modules (B), visualized by Principal Component Analysis (PCA) of centered log-ratio (CLR) transformed abundances using the Aitchison distance. Ellipses represent 80% confidence intervals, and ridgeline plots show the projection of samples along PC1 and PC2. (C–D) Volcano plots of differential abundance at the species level (C) and GMM level (D) between kefir and placebo groups. Each point represents a taxon or module; the x-axis shows effect size differences and the horizontal line indicates an unadjusted p-value threshold of 0.05. Significant features above this threshold are color-coded and annotated.
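For readers unfamiliar with the CLR transformation and Aitchison distance named in the caption, the sketch below shows one way they can be computed. It is not the analysis pipeline used in this study. The `pseudocount` parameter and the example abundance vectors are hypothetical; how zeros were handled in the actual analysis is not described here.

```python
import math


def clr(abundances, pseudocount=0.0):
    """Centred log-ratio transform of one sample's abundance vector.

    Each value is log-transformed and the mean of the logs is subtracted.
    A zero abundance with pseudocount=0 raises ValueError (log of zero),
    so a small pseudocount is needed when zeros are present.
    """
    logs = [math.log(a + pseudocount) for a in abundances]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def aitchison_distance(sample_a, sample_b, pseudocount=0.0):
    """Euclidean distance between two CLR-transformed samples."""
    return math.dist(clr(sample_a, pseudocount), clr(sample_b, pseudocount))


# Hypothetical abundance vectors for three samples (four taxa each).
sample_a = [120, 30, 10, 50]
sample_b = [240, 60, 20, 100]   # same composition as sample_a, doubled depth
sample_c = [10, 200, 5, 5]      # a different composition

print(aitchison_distance(sample_a, sample_b))  # close to zero
print(aitchison_distance(sample_a, sample_c))  # clearly larger
```

The first comparison illustrates why this distance suits compositional sequencing data: samples that differ only in sequencing depth are treated as equivalent.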

Several specific taxonomic shifts were observed. Three Bifidobacterium species (B. adolescentis, B. infantis, and B. longum) and members of the Alistipes genus (Alistipes sp021204515 and A. timonensis) showed statistically significant increases (Wilcoxon test, p < 0.05) in relative abundance from baseline to post-intervention in the kefir group, but not in the placebo group (Fig. 4C). Conversely, the relative abundance of species such as Copromonas sp900556965, Blautia_A wexlerae_B, Coprococcus eutactus, and Intestinibacter bartlettii decreased following kefir intervention (Fig. 4C). At the gut metabolic pathway level, glutamate degradation I (MF0030) was significantly decreased in the kefir group compared to placebo, while metabolic modules involved in amino acid metabolism showed increased relative abundance following kefir supplementation (Fig. 4D). No evidence was found for long-term colonisation by kefir-derived species such as Lactococcus, Lactobacillus, and Leuconostoc, which were abundant in the kefir drink but not detected in participant stool samples post-intervention.

Correlations between microbiome changes and ADHD symptom severity outcomes

To explore the relationship between microbiome changes and changes in ADHD symptom severity or sleep, we performed Spearman correlation analyses between absolute changes in key microbial taxa and functional modules and changes in both primary and secondary outcome measures, with analyses conducted separately for the kefir and placebo groups. Data for the primary outcome measures are displayed in Supplementary Files 2 and 3. Several statistically significant correlations (R2 ≥ |0.50|, p < 0.01) were observed between changes in specific microbial taxa or functional modules and behavioural outcomes in both groups. Notably, the pattern and strength of these associations varied between groups and across different behavioural endpoints, and no single taxon or pathway consistently predicted response. These findings highlight the inherent complexity and individual variability in microbiome–behaviour interactions, suggesting that the relationship between gut microbial shifts and behavioural outcomes may depend on broader ecological and host factors.
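The change-versus-change correlation approach described above can be sketched as follows. This is an illustrative implementation only, not the code used in the study. All data values are hypothetical, and "absolute change" is assumed here to mean the post-intervention value minus the baseline value.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-participant values within one treatment group.
taxon_baseline = [0.8, 1.2, 0.5, 2.0, 1.1]
taxon_post = [1.5, 1.3, 0.9, 2.1, 2.4]
outcome_baseline = [30, 28, 35, 22, 31]
outcome_post = [26, 28, 33, 23, 24]

# Assumed definition of change: post-intervention minus baseline.
taxon_change = [p - b for b, p in zip(taxon_baseline, taxon_post)]
outcome_change = [p - b for b, p in zip(outcome_baseline, outcome_post)]

print(f"Spearman rho = {spearman(taxon_change, outcome_change):.2f}")
```

Running the same calculation separately within each group, as the text describes, avoids mixing intervention effects with between-participant associations, although it also halves the sample available for each correlation.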

Adverse events

Three participants each reported a mild (grade 1) adverse event during the intervention. One participant receiving the placebo experienced three episodes of diarrhoea, and another participant in the placebo group experienced vomiting for four days. Both participants continued with the intervention with no further adverse events. One participant in the kefir group contracted chickenpox during the final week of the study and had to withdraw as they were too ill to complete the outcome assessments. The cause of these events was unknown, but parents did not perceive them to be related to the study intervention. No other adverse events were reported during the intervention.

Discussion

This double-blind, randomised, placebo-controlled trial evaluated the impact of daily kefir consumption on symptom severity, sleep and the gut microbiome in children with ADHD. We hypothesised that supplementation with kefir would be superior to placebo at inducing improvements in ADHD symptom severity, gut microbiota composition, gut symptomatology, sleep, inattention and impulsivity. To our knowledge, this is the first RCT exploring the impact of kefir in ADHD.

No significant overall effects were observed on the primary outcome measures of parent- and teacher-rated ADHD symptoms, as assessed by the SWAN scale. Exploratory planned analyses indicated that children in the kefir group with the highest teacher-rated ADHD symptom scores at baseline showed a non-significant trend toward lower SWAN scores at week six compared to the placebo group. Given the small sample size and limited statistical power, this observation should be interpreted cautiously and considered to inform further lines of enquiry rather than confirming an association. For Go/NoGo RT variance, the relationship between baseline and post-intervention values differed by group: there was a positive association between RT variance at baseline and follow-up in the placebo group, but not in the kefir group. Given that greater variance indicates greater inattention, this may suggest a differential impact on inattention according to treatment group and baseline severity. Actigraphy data revealed that the kefir group spent fewer minutes awake during the down period at week six than the placebo group, suggesting a positive impact of kefir on sleep quality. However, the other actigraphy measures revealed no significant group differences and, paradoxically, the kefir group self-reported a greater number of sleep problems post-intervention than the placebo group, highlighting inconsistencies across measures.

It is notable that while objective actigraphy measures indicated improved sleep in the kefir group, subjective self-report measures did not reflect this change and, in fact, suggested greater perceived sleep problems post-intervention. This mismatch between objective and subjective sleep outcomes is well documented in paediatric and neurodevelopmental research, particularly in ADHD, where subjective reports of sleep disturbance are often more pronounced than what is captured by actigraphy or polysomnography [12, 65, 66]. Such discrepancies likely reflect a combination of perceptual differences, reporting bias, and the multifaceted nature of sleep difficulties in these populations. We propose to explore these relationships further in subsequent analyses. Future research would benefit from using a diverse range of sleep assessment methods in a larger sample of children.

Moreover, given that circadian disruption, including delayed sleep phase and altered melatonin secretion, is a well-documented feature of ADHD [12, 13], the observed improvement in objectively measured sleep in the kefir group is especially notable. While the significant effect was found only for a single actigraphy measure and our findings are correlational, it is plausible, based on current literature, that modulation of the gut microbiome may influence sleep via pathways involving serotonin and melatonin synthesis along the gut–brain axis. However, we did not directly measure these mechanisms in our study, and further research is needed to clarify the causal relationships involved.

To explore the clinical relevance of the microbiome shifts observed with kefir supplementation, we next investigated how changes in gut taxa and metabolic pathways related to key sleep improvements in our cohort. Our findings demonstrate that kefir supplementation resulted in the modulation of the gut microbiota, specifically, an increase in Bifidobacterium and Alistipes species, as well as enrichment of functional modules related to SCFA and neuroactive metabolite production. Bifidobacterium species are widely recognised for their health-promoting roles, including the production of acetate and GABA, both of which influence gut–brain signalling and sleep regulation [67,68,69]. While Alistipes has been linked to both positive and negative outcomes in the literature, recent research supports a context-dependent effect, i.e., increases in Alistipes may contribute positively to gut–brain modulation when occurring within a resilient and diverse microbial environment, as in our study [70]. This aligns with broader findings that associations between specific microbial genera and brain health outcomes are highly context-dependent, varying according to the composition and function of the wider microbial ecosystem [71]. Crucially, in the kefir group, the observed microbiome changes coincided with better sleep continuity (as measured by actigraphy) and a measurable trend for improvements in teacher ratings of ADHD symptoms in children with greater baseline severity. Correlation analyses revealed that while some significant associations emerged between changes in specific microbial taxa or functional pathways and behavioural or sleep outcomes, these associations were not consistent across all endpoints or participants. This highlights both the potential for microbiome modulation to influence neurodevelopmental outcomes via metabolites such as SCFAs, GABA, and serotonin/melatonin precursors [72,73,74,75], and the inherent complexity and individual variability of these gut–brain relationships.

Given increasing evidence that host neurobiology and genetics shape the gut environment, we contextualised our findings within the broader framework of bidirectional gut–brain communication. Recent multi-omics and genetic research documents how the host’s neurological phenotype – including genetic, neurochemical, immune, and behavioural features – actively shapes the gut biochemical environment, which in turn influences the structure and function of the gut microbiota. Host-derived factors such as neurotransmitters, immune mediators, and stress hormones create a biochemical context that both selects for and modulates the metabolic behaviour of gut microbes, establishing a dynamic, bidirectional dialogue along the gut–brain axis [76,77,78].

Compelling evidence from animal models, human neuropsychiatric cohorts, and systematic reviews shows that neurodevelopmental and psychiatric phenotypes, such as ADHD, are associated not only with distinct patterns of gut microbial diversity and function, but also with host genetic variants that co-regulate both brain function and gut microbiota composition [79, 80]. This host–microbiome interplay means that observed changes in microbial taxa or metabolites are best interpreted in the context of the individual’s neurobiological, immunological, and genetic background. Accordingly, the effects of dietary or probiotic interventions on the gut–brain axis are likely mediated by this context-sensitive, reciprocal relationship.

To interpret the observed associations, we evaluated the plausible mechanistic pathways, such as SCFA, GABA, and serotonin metabolism, linking kefir-induced microbiome changes to symptom severity and sleep outcomes. The observed increase in SCFA-producing and neuroactive-compound-producing bacteria offers plausible mechanisms for the sleep improvements and symptom severity trends seen with kefir. SCFAs like butyrate and acetate are known to cross the blood–brain barrier, influence neuroinflammation [81], and modulate sleep architecture across the lifespan, from infants to older adults [82, 83]. Microbially produced GABA may help reduce neuronal excitability, with potential calming and sleep-promoting effects [84]. Similarly, modulation of tryptophan metabolism by the microbiota may enhance serotonin and melatonin synthesis, supporting improved sleep onset and maintenance [85]. The concurrent increases in Bifidobacterium and Alistipes may be especially relevant, given their joint roles in producing both SCFAs and neuroactive metabolites. This functional signature appears to be associated with improved sleep and behavioural regulation, particularly in children with greater symptom severity at baseline.

The genus Alistipes has a controversial place in microbiome and neuropsychiatric research owing to its context-dependent role in health and disease. Several studies have reported higher Alistipes levels in dysbiotic states and conditions such as major depressive disorder (MDD) and other neuropsychiatric or inflammatory illnesses [86, 87]. However, recent literature suggests that Alistipes may play a dual role depending on the overall microbial environment and host state [88]. For instance, Alistipes is capable of producing neuroactive compounds, including GABA and indole derivatives, which can influence enterochromaffin cell serotonin release and vagal nerve signalling [89]. Some animal studies and research protocols further hypothesise that Alistipes-derived GABA modulates serotonin release via enterochromaffin cells and the vagus nerve, with possible implications for mood and behaviour [84, 90]. However, whether this pathway is ultimately beneficial or detrimental likely depends on both the host’s baseline physiology and the overall balance of the microbial ecosystem.

In our study, the increase in Alistipes occurred alongside a significant increase in Bifidobacterium species, which are generally associated with gut homeostasis and the production of SCFAs and GABA [91,92,93,94]. This was accompanied by shifts in microbial function towards SCFA and neuroactive metabolite synthesis and positive clinical outcomes, notably objective sleep improvement and a trend to reduced ADHD symptoms for children with higher baseline severity. Taken together, these findings suggest that Alistipes is not acting in isolation or simply as a marker of dysbiosis, but rather as part of a balanced, metabolically active microbiome that may be supporting gut–brain axis modulation. This interpretation aligns with emerging views, including recent multi-omics studies in humans, that the clinical significance of Alistipes is determined by broader ecological and functional context, not by its absolute abundance alone [70, 71, 88]. Thus, our findings provide preliminary evidence that increases in Alistipes, when occurring alongside beneficial commensals and functional pathways, may mediate or amplify the effects of dietary interventions like kefir in supporting neurodevelopmental outcomes.

Whilst we have revealed some modest changes in sleep and gut microbiota for children consuming kefir, significant improvements were not seen for all outcome variables. Replication of this study in a larger and more diverse sample will be important to establish the robustness and generalisability of these findings. We had a high number of withdrawals from the study, which is consistent with other trials with ADHD participants [95]. The reduced final sample size and statistical power increase the likelihood of Type II errors and may have introduced attrition bias if families who withdrew differed systematically from those who completed the study, for example in motivation or tolerance of the intervention. To minimise imbalance within a small sample, baseline adaptive randomisation was employed. However, despite this method, some group differences were observed at baseline, perhaps reflecting the heterogeneity of ADHD as a condition [96]. Significant differences were observed between groups in ‘time since ADHD diagnosis’ at baseline, which may have introduced bias. Children with a longer duration since diagnosis may differ in treatment history, behavioural adaptation, or symptom presentation compared with those more recently diagnosed, potentially influencing their response to the intervention. As our analysis plan was pre-specified and published in the trial protocol, we did not include time since diagnosis as a covariate post hoc to avoid analytical flexibility. Future studies could stratify randomisation by diagnostic duration or include this variable as a planned covariate to better control for such differences.

Psychiatric comorbidities are highly prevalent in ADHD, and we included children in our study with diagnoses of autism spectrum disorder and anxiety. These disorders may also influence sleep and gut–brain interactions, so future large-scale trials could include systematic assessment of these comorbidities to better understand their contribution to intervention effects. The gut microbiota is also shaped by multiple factors beyond the intervention, such as diet, home environment, and socioeconomic status. Future research should monitor dietary intake during the intervention period, as dietary variability may have influenced microbiome outcomes. Inclusion criteria were modified during the trial to allow participation of children on stable ADHD medication. This decision improved recruitment feasibility and sample representativeness and is in line with recent recommendations [44]. We stratified randomisation by medication status to reduce this potential source of variability; however, including medicated children may have introduced additional heterogeneity, as medication effects could interact with responses to the intervention. Future studies could further control for this by recruiting samples large enough to provide sufficient power to analyse medicated and unmedicated participants separately.

Conclusions

To our knowledge, this is the first RCT to assess the impact of kefir on the symptoms of ADHD in children. Our findings tentatively support the potential role of fermented foods as adjunctive therapy in ADHD; however, additional research is required to establish efficacy, mechanisms, and clinical applicability. Fermented foods such as kefir may serve as a low-cost, accessible addition to standard ADHD care, with a favourable safety profile and the potential to support sleep outcomes via microbiome modulation. Kefir is widely accessible, easy and inexpensive to make at home [97], and does not require specialist guidance for consumption. However, these conclusions should be regarded as preliminary and require replication in larger, more diverse cohorts. Future research should aim to identify the subgroups of children most likely to benefit, and to explore how targeted microbiome interventions can be integrated into broader lifestyle and dietary strategies for neurodevelopmental support.