The methodology used to detect the seasonal onset of epidemics, can be divided into two essential criteria:
Here, k denotes the window size employed to obtain the local estimate of the exponential growth rate and SoC. When both of these criteria are met, an alarm is triggered and the onset of the seasonal epidemic is detected.
The model is implemented in the seasonal_onset()
function of the aedseo
package. Criteria one is fulfilled
if the growth_warning
in the output is TRUE
.
Criteria two is fulfilled if the sum_of_cases_warning
in
the output is TRUE
.
The exponential growth rate, denoted as r, represents the per capita change in the number of new cases per unit of time. Given that incidence data are integer-valued, the proposed method relies on generalized linear models (GLM). For count data, the Poisson distribution is a suitable choice as a model. Hence, the count observations denoted as Y are assumed to follow a Poisson distribution
Here, the link function, log (), connects the linear predictor to the expected value of the data point, expressed as log (λ) = μ. Given a single continuous covariate t, the mean μ can be expressed as
This is equivalent to a multiplicative model for λ, i.e.
Intuitively, negative values of r result in a decline in the number of observed cases, while r = 0 represents stability, and positive values of r indicate an increase.
It is important to note that the Poisson distribution assumes that the mean and variance are equal. In reality, real data often deviate from this assumption, with the variance (v) being significantly larger than the mean. This biological phenomenon, known as overdispersion, can be addressed within a model in various ways. One approach is to employ quasi-Poisson regression, which assumes v = σλ, or to use negative binomial regression (not implemented yet), which assumes v = λ + λ2/θ, where both σ and θ are overdispersion parameters.
First we generate some data as an tsd
object, with the
generate_seasonal_data()
function.
# Construct an 'tsd' object with time series data
set.seed(222)
tsd_data <- generate_seasonal_data(
years = 3,
start_date = as.Date("2021-05-26"),
amplitude = 300,
mean = 300,
phase = 0,
trend_rate = 1.002,
noise_overdispersion = 3,
time_interval = "week"
)
Next, the tsd
object is passed to the
seasonal_onset()
function. Here, a window size of
k=5
is specified, meaning that a total of 5 weeks is used
in the local estimate of the exponential growth rate.
na_fraction_allowed = 0.4
defines how large a fraction of
observations in the k
window that are allowed to be
NA
, here 0.4*5 = 2
observations. Additionally,
a 95% confidence interval is specified. Finally, the exponential growth
rate is estimated using quasi-Poisson regression to account for
overdispersion in the data. A disease-specific threshold can
additionally be passed to the function, but is not necessary if only the
growth rate estimations are wanted. season_start
and
season_end
can be used to specify the season to stratify
the observations by. This algorithm runs across seasons, such that the
first observation in a new season will use the last k-1
observations from the previous season. The seasonal_onset()
function provides a tsd_onset
object with a comprehensive
seasonal onset analysis.
In the first figure, observations over time are shown with a legend
for the seasonal onset alarm. In the second figure, the local estimates
of the growth rates are presented along with their corresponding 95%
confidence interval with a legend for the growth warning. This
visualisation can be generated by utilizing the plot()
S3
method with objects of the tsd_onset
class.
The predict()
S3 method for tsd_onset
objects allows you to make predictions for future time steps based on
the estimated growth rates. Following is an example of predict
observations for the next 5 weekly time steps.
In the example above, we use the predict method to predict growth
rates for the next 5 time steps, according to the
time_interval = "week"
in the tsd_onset
object. The n_step
argument specifies the number of steps
into the future you want to forecast. The resulting
tsd_predict
object contains observation estimates, lower
bounds, and upper bounds for each time step.
The summary()
S3 method for tsd_onset
objects provides a concise summary of your automated early detection of
seasonal_onset
analysis. You can use it to retrieve
important information about your analysis, including the first seasonal
onset alarm (reference time point), SoC at reference time point (here
over a 5 week window), growth rate estimates at reference time point,
total number of growth warnings in the series and latest warnings
(growth and SoC). It helps you quickly assess the key findings of your
analysis.
summary(seasonal_onset_results)
#> Summary of tsd_onset object with disease_threshold
#>
#> Model output:
#> Reference time point (first seasonal onset alarm in season): 2023-05-24
#> Observations at reference time point: 444
#> Sum of observations at reference time point: 1570
#> Growth rate estimate at reference time point:
#> Estimate Lower (2.5%) Upper (97.5%)
#> 0.194 0.217 0.170
#> Total number of growth warnings in the series: 61
#> Latest growth warning: 2024-05-15
#> Latest sum of cases warning: 2024-05-15
#> Latest seasonal onset alarm: 2024-05-15
#>
#> The season for reference time point:
#> 2023/2024
#>
#> Model settings:
#> Called using distributional family: quasipoisson
#> Window size for growth rate estimation and calculation of sum of cases: 5
#> The time interval for the observations: week
#> Disease specific threshold: 100