Bootstrapping and finding peaks

We provide functions to return the peak of the incidence data (grouped or ungrouped), bootstrap from the incidence data, and estimate confidence intervals around a peak.

bootstrap()

dat <- fluH7N9_china_2013
x <- incidence(dat, date_index = date_of_onset, groups = gender)
#> 10 missing observations were removed.
bootstrap(x)
#> An incidence object: 67 x 3
#> date range: [2013-02-19] to [2013-07-27]
#> cases: 126
#> interval: 1 day
#> 
#>    date_index gender count
#>    <date>     <fct>  <int>
#>  1 2013-02-19 m          4
#>  2 2013-02-27 m          0
#>  3 2013-03-07 m          1
#>  4 2013-03-08 m          3
#>  5 2013-03-09 f          0
#>  6 2013-03-13 f          1
#>  7 2013-03-17 m          0
#>  8 2013-03-19 f          2
#>  9 2013-03-20 f          1
#> 10 2013-03-20 m          2
#> # … with 57 more rows

find_peak()

dat <- fluH7N9_china_2013
x <- incidence(dat, date_index = date_of_onset, groups = gender)
#> 10 missing observations were removed.

# peaks across each group
find_peak(x)
#> # A tibble: 2 x 3
#>   gender date_index count
#>   <fct>  <date>     <int>
#> 1 f      2013-04-11     3
#> 2 m      2013-04-03     6

# peak without groupings
find_peak(regroup(x))
#> # A tibble: 1 x 2
#>   date_index count
#>   <date>     <int>
#> 1 2013-04-03     7

estimate_peak()

Note that the bootstrapping approach used for estimating the peak time makes the following assumptions:

  • the total number of event is known (no uncertainty on total incidence);
  • dates with no events (zero incidence) will never be in bootstrapped datasets; and
  • the reporting is assumed to be constant over time, i.e. every case is equally likely to be reported.
dat <- fluH7N9_china_2013
x <- incidence(dat, date_index = date_of_onset, groups = province)
#> 10 missing observations were removed.

# regrouping for overall peak (we suspend progress bar for markdown)
estimate_peak(regroup(x), progress = FALSE)
#> # A tibble: 1 x 6
#>   observed_peak observed_count bootstrap_peaks  lower_ci   median     upper_ci  
#>   <date>                 <int> <list>           <date>     <date>     <date>    
#> 1 2013-04-03                 7 <tibble [100 × … 2013-03-29 2013-04-06 2013-04-14

# across provinces
estimate_peak(x, progress = FALSE)
#> # A tibble: 13 x 7
#>    province  observed_peak observed_count bootstrap_peaks  lower_ci   median    
#>    <fct>     <date>                 <int> <list>           <date>     <date>    
#>  1 Anhui     2013-03-09                 1 <tibble [100 × … 2013-03-09 2013-03-28
#>  2 Beijing   2013-04-11                 1 <tibble [100 × … 2013-04-11 2013-04-11
#>  3 Fujian    2013-04-17                 1 <tibble [100 × … 2013-04-17 2013-04-18
#>  4 Guangdong 2013-07-27                 1 <tibble [100 × … 2013-07-27 2013-07-27
#>  5 Hebei     2013-07-10                 1 <tibble [100 × … 2013-07-10 2013-07-10
#>  6 Henan     2013-04-06                 1 <tibble [100 × … 2013-04-06 2013-04-08
#>  7 Hunan     2013-04-14                 1 <tibble [100 × … 2013-04-14 2013-04-14
#>  8 Jiangsu   2013-03-19                 2 <tibble [100 × … 2013-03-08 2013-03-21
#>  9 Jiangxi   2013-04-15                 1 <tibble [100 × … 2013-04-15 2013-04-17
#> 10 Shandong  2013-04-16                 1 <tibble [100 × … 2013-04-16 2013-04-16
#> 11 Shanghai  2013-04-01                 4 <tibble [100 × … 2013-02-19 2013-04-01
#> 12 Taiwan    2013-04-12                 1 <tibble [100 × … 2013-04-12 2013-04-12
#> 13 Zhejiang  2013-04-06                 5 <tibble [100 × … 2013-03-29 2013-04-08
#> # … with 1 more variable: upper_ci <date>