
Report

Illegal online gambling: Consumer engagement and trends

The Gambling Commission’s report into estimated trends in consumer engagement with illegal gambling websites.

Annex B - Bootstrapping methodology for web traffic estimates

Overview

To quantify the uncertainty in web traffic estimates, we employed a bootstrapping approach. This uses the properties of a sample to estimate properties of the population that the sample came from.

Bootstrapping estimates the variability of a given statistic by randomly resampling the data with replacement. For each month, we drew 1,000 bootstrapped datasets and computed an estimated mean number of visits and mean visit duration for each. We defined the realistic range of our estimates as the 95 percent confidence interval spanning the 2.5th to the 97.5th percentiles of the bootstrap-estimated values.

This method allows us to generate confidence intervals for key metrics without relying on parametric assumptions about the underlying data distribution. It is particularly useful when working with observational data that may exhibit skewness, outliers, or non-normality.
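The percentile bootstrap described above can be sketched in a few lines. This is a minimal illustration, not the Commission's actual code: the function name, the lognormal dummy data and the random seed are all assumptions chosen to mimic skewed traffic data.

```python
import numpy as np

rng = np.random.default_rng(42)

def bootstrap_ci(values, n_resamples=1000, ci=95):
    """Percentile bootstrap: resample with replacement, collect the
    statistic of interest (here, the mean), and take the central ci%
    of the bootstrap distribution as the confidence interval."""
    values = np.asarray(values, dtype=float)
    boot_means = np.empty(n_resamples)
    for i in range(n_resamples):
        sample = rng.choice(values, size=len(values), replace=True)
        boot_means[i] = sample.mean()
    half_tail = (100 - ci) / 2
    lower, upper = np.percentile(boot_means, [half_tail, 100 - half_tail])
    return lower, upper

# Illustrative skewed data: most sites see little traffic, a few see a lot
visits = rng.lognormal(mean=8, sigma=1.5, size=200)
lo, hi = bootstrap_ci(visits)
print(f"95% CI for mean visits: [{lo:.0f}, {hi:.0f}]")
```

Because the interval comes from the empirical bootstrap distribution rather than a formula, no normality assumption is needed, which is the point made above about skewness and outliers.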

Metrics considered

The analysis focused on 2 primary metrics:

  • estimated visits per site per month
  • average visit duration, provided in seconds and converted to minutes.

These were combined to derive a third metric: total time spent on identified illegal gambling websites, expressed in millions of minutes.

Bootstrapping methodology

For each month in the dataset:

  1. a subset of the data corresponding to that month was extracted
  2. from this subset, 1,000 bootstrap samples were drawn with replacement
  3. for each sample, the following statistics were calculated:
    1. mean number of visits
    2. total number of visits
    3. mean visit duration.

The 2.5th and 97.5th percentiles of the bootstrap distributions were used to construct 95 percent confidence intervals for each metric.

This process was repeated for every month in the dataset, resulting in a time series of bootstrapped estimates and associated confidence intervals.
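The per-month procedure above can be sketched as follows. The data layout (a mapping from month to per-site visit and duration figures) and all names are illustrative assumptions, not the Commission's schema.

```python
import numpy as np

rng = np.random.default_rng(0)

def monthly_bootstrap(records, n_resamples=1000):
    """For each month: resample the sites with replacement n_resamples
    times; for each resample record the mean visits, total visits and
    mean duration; report the 2.5th/97.5th percentiles of each as a
    95% confidence interval."""
    results = {}
    for month, rows in records.items():
        visits = np.array([r[0] for r in rows], dtype=float)
        durations = np.array([r[1] for r in rows], dtype=float)
        n = len(rows)
        stats = {"mean_visits": [], "total_visits": [], "mean_duration": []}
        for _ in range(n_resamples):
            idx = rng.integers(0, n, size=n)  # resample sites with replacement
            stats["mean_visits"].append(visits[idx].mean())
            stats["total_visits"].append(visits[idx].sum())
            stats["mean_duration"].append(durations[idx].mean())
        results[month] = {
            k: tuple(np.percentile(v, [2.5, 97.5])) for k, v in stats.items()
        }
    return results

# Illustrative dataset: (visits, duration in minutes) per site, per month
records = {
    "2024-01": [(1200, 6.5), (300, 4.0), (9500, 8.1), (450, 3.2)],
    "2024-02": [(1100, 5.9), (250, 4.4), (400, 3.0)],
}
cis = monthly_bootstrap(records)
```

Running this over every month yields the time series of intervals described above.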

Combining metrics

To estimate total time spent:

  • the mean number of visits was multiplied by the number of sites reporting data and the mean visit duration
  • confidence intervals for total time spent were derived by multiplying the lower bounds of the visit and duration intervals, and likewise for the upper bounds.

This approach assumes independence between the visit and duration metrics.
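Under that independence assumption, combining the bounds is simple multiplication. The figures below are hypothetical, chosen only to show the arithmetic.

```python
def total_time_ci(visits_ci, duration_ci, n_sites):
    """Combine interval bounds as described in the text: lower x lower
    and upper x upper, scaled by the number of reporting sites, with
    the result expressed in millions of minutes. Assumes the visit and
    duration metrics vary independently."""
    lo = visits_ci[0] * duration_ci[0] * n_sites / 1e6
    hi = visits_ci[1] * duration_ci[1] * n_sites / 1e6
    return lo, hi

# Hypothetical bounds: mean visits per site and mean duration (minutes)
lo, hi = total_time_ci((800, 1400), (4.0, 6.5), n_sites=250)
print(f"Total time spent: {lo:.2f} to {hi:.2f} million minutes")
```

Note that multiplying lower bounds together and upper bounds together gives a conservative (wide) interval if the two metrics are in fact positively correlated.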

Interpretation

The resulting confidence intervals provide a range within which the true values of each metric are likely to fall given the observed data. This helps to communicate the inherent uncertainty in web traffic estimates.

The confidence intervals calculated through this analysis are not uniform across the time series – they vary in magnitude between months. These variations are driven by the level of variation within the population of websites in a given month. The confidence interval will generally be narrower in months where the gap between the websites with the most and least traffic is smaller. We see wider confidence intervals in months where overall traffic is dominated by a small number of websites with high volumes of traffic.
