sample_approx_dist.Rd
Approximate Sampling a Distribution using Counts
sample_approx_dist( cases = NULL, dist_fn = NULL, max_value = 120, earliest_allowed_mapped = NULL, direction = "backwards", type = "sample", truncate_future = TRUE )
cases | A dataframe of cases (in date order) with the following variables: `date` and `cases`. |
---|---|
dist_fn | Function that takes two arguments, the first numeric and the second logical (defined as `dist`). In the example below it returns probabilities when `dist` is `TRUE` and integer samples otherwise. |
max_value | Numeric, maximum value to allow. Defaults to 120 days |
earliest_allowed_mapped | A character string representing a date ("2020-01-01"). Indicates the earliest allowed mapped value. |
direction | Character string, defaults to "backwards". Direction in which to map cases. Supports either "backwards" or "forwards". |
type | Character string indicating the method used to transform counts. Supports either "sample", which approximates sampling, or "median", which shifts by the median of the distribution. |
truncate_future | Logical, should cases be truncated if they occur after the first date reported in the data. Defaults to `TRUE`. |
A `data.table` of cases by date of onset.
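To make the `dist_fn` contract concrete, here is a minimal sketch of a compatible function built on a log-normal delay. The name `lognormal_delay_fn` and the parameter values (`meanlog = 1.5`, `sdlog = 0.5`) are illustrative assumptions, not part of the package. The three-argument signature mirrors the `delay_fn` used in the examples, where `cum` is accepted but unused.

```r
# Hypothetical delay function: the name and parameters are assumptions.
lognormal_delay_fn <- function(n, dist = FALSE, cum = FALSE) {
  if (dist) {
    # Probability that the delay falls in the one-day bin [n, n + 1)
    plnorm(n + 1, meanlog = 1.5, sdlog = 0.5) -
      plnorm(n, meanlog = 1.5, sdlog = 0.5)
  } else {
    # n sampled delays, truncated to whole days
    as.integer(rlnorm(n, meanlog = 1.5, sdlog = 0.5))
  }
}

# Probability mass for delays of 0 to 120 days (120 mirrors the default max_value)
pmf <- lognormal_delay_fn(0:120, dist = TRUE)
sum(pmf)

# Five sampled delays in whole days
set.seed(1)
lognormal_delay_fn(5)
```

A function of this shape should be usable wherever the examples pass `delay_fn`, provided the bin probabilities over `0:max_value` sum to approximately one.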
cases <- data.table::as.data.table(EpiSoon::example_obs_cases)
cases <- cases[, cases := as.integer(cases)]

## Reported case distribution
print(cases)
#>     cases       date
#>  1:     1 2020-01-20
#>  2:     0 2020-01-21
#>  3:     1 2020-01-22
#>  4:     0 2020-01-23
#>  5:     0 2020-01-24
#>  6:     0 2020-01-25
#>  7:     1 2020-01-26
#>  8:     0 2020-01-27
#>  9:     0 2020-01-28
#> 10:     0 2020-01-29
#> 11:     0 2020-01-30
#> 12:     1 2020-01-31
#> 13:     1 2020-02-01
#> 14:     1 2020-02-02
#> 15:     1 2020-02-03
#> 16:     1 2020-02-04
#> 17:     1 2020-02-05
#> 18:     1 2020-02-06
#> 19:     1 2020-02-07
#> 20:     1 2020-02-08
#> 21:     1 2020-02-09
#> 22:     1 2020-02-10
#> 23:     1 2020-02-11
#> 24:     1 2020-02-12
#> 25:     1 2020-02-13
#> 26:     1 2020-02-14
#> 27:     1 2020-02-15
#> 28:     2 2020-02-16
#> 29:     2 2020-02-17
#> 30:     2 2020-02-18
#> 31:     3 2020-02-19
#> 32:     3 2020-02-20
#> 33:     4 2020-02-21
#> 34:     6 2020-02-22
#> 35:     7 2020-02-23
#> 36:     9 2020-02-24
#> 37:    11 2020-02-25
#> 38:    14 2020-02-26
#> 39:    18 2020-02-27
#> 40:    21 2020-02-28
#> 41:    26 2020-02-29
#> 42:    31 2020-03-01
#> 43:    37 2020-03-02
#> 44:    45 2020-03-03
#> 45:    54 2020-03-04
#> 46:    63 2020-03-05
#> 47:    73 2020-03-06
#> 48:    88 2020-03-07
#> 49:   102 2020-03-08
#> 50:   116 2020-03-09
#> 51:   141 2020-03-10
#> 52:   167 2020-03-11
#> 53:   194 2020-03-12
#> 54:   208 2020-03-13
#> 55:   251 2020-03-14
#> 56:   273 2020-03-15
#> 57:   266 2020-03-16
#> 58:   296 2020-03-17
#> 59:   343 2020-03-18
#> 60:   399 2020-03-19
#> 61:   454 2020-03-20
#> 62:   605 2020-03-21
#> 63:   367 2020-03-22
#>     cases       date
#> [1] 4720

delay_fn <- function(n, dist, cum) {
  if (dist) {
    # Probability mass assigned to each whole-day delay n
    pgamma(n + 0.9999, 2, 1) - pgamma(n - 1e-5, 2, 1)
  } else {
    # n sampled delays, truncated to whole days
    as.integer(rgamma(n, 2, 1))
  }
}

onsets <- sample_approx_dist(cases = cases, dist_fn = delay_fn)

## Estimated onset distribution
print(onsets)
#>           date cases
#>  1: 2020-01-19     1
#>  2: 2020-01-20     1
#>  3: 2020-01-21     0
#>  4: 2020-01-22     0
#>  5: 2020-01-23     0
#>  6: 2020-01-24     0
#>  7: 2020-01-25     0
#>  8: 2020-01-26     0
#>  9: 2020-01-27     1
#> 10: 2020-01-28     0
#> 11: 2020-01-29     1
#> 12: 2020-01-30     1
#> 13: 2020-01-31     2
#> 14: 2020-02-01     1
#> 15: 2020-02-02     1
#> 16: 2020-02-03     0
#> 17: 2020-02-04     1
#> 18: 2020-02-05     0
#> 19: 2020-02-06     1
#> 20: 2020-02-07     0
#> 21: 2020-02-08     2
#> 22: 2020-02-09     1
#> 23: 2020-02-10     0
#> 24: 2020-02-11     1
#> 25: 2020-02-12     1
#> 26: 2020-02-13     1
#> 27: 2020-02-14     1
#> 28: 2020-02-15     2
#> 29: 2020-02-16     1
#> 30: 2020-02-17     4
#> 31: 2020-02-18     4
#> 32: 2020-02-19     3
#> 33: 2020-02-20     3
#> 34: 2020-02-21     6
#> 35: 2020-02-22     7
#> 36: 2020-02-23     8
#> 37: 2020-02-24    10
#> 38: 2020-02-25    18
#> 39: 2020-02-26    23
#> 40: 2020-02-27    28
#> 41: 2020-02-28    28
#> 42: 2020-02-29    38
#> 43: 2020-03-01    41
#> 44: 2020-03-02    50
#> 45: 2020-03-03    55
#> 46: 2020-03-04    70
#> 47: 2020-03-05    80
#> 48: 2020-03-06   108
#> 49: 2020-03-07   101
#> 50: 2020-03-08   138
#> 51: 2020-03-09   132
#> 52: 2020-03-10   162
#> 53: 2020-03-11   210
#> 54: 2020-03-12   213
#> 55: 2020-03-13   249
#> 56: 2020-03-14   288
#> 57: 2020-03-15   277
#> 58: 2020-03-16   330
#> 59: 2020-03-17   352
#> 60: 2020-03-18   401
#> 61: 2020-03-19   402
#> 62: 2020-03-20   380
#> 63: 2020-03-21   286
#> 64: 2020-03-22   104
#>           date cases

## Check that sum is equal to reported cases
total_onsets <- median(
  purrr::map_dbl(1:1000, ~ sum(sample_approx_dist(cases = cases, dist_fn = delay_fn)$cases)))
total_onsets
#> [1] 4716

## Map from onset cases to reported
reports <- sample_approx_dist(cases = cases, dist_fn = delay_fn, direction = "forwards")

## Map from onset cases to reported using a median shift
reports <- sample_approx_dist(cases = cases, dist_fn = delay_fn,
                              direction = "forwards", type = "median")
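The difference between `type = "sample"` and `type = "median"` can be illustrated with a small base R sketch. This is not the package's implementation: `map_backwards_sketch` is a hypothetical helper that assumes each count is spread over earlier days by independent binomial draws (for "sample") or moved as a block by the median delay (for "median"). It works on vector positions rather than dates, and the Poisson delay distribution is likewise an assumption chosen for brevity.

```r
# Hypothetical helper, not part of the package: maps daily counts backwards in time.
map_backwards_sketch <- function(counts, pmf, type = "sample") {
  n_days <- length(counts)
  max_delay <- length(pmf) - 1
  # Pad the start so counts mapped to before the first day are kept
  mapped <- integer(n_days + max_delay)
  if (type == "median") {
    # Smallest delay whose cumulative probability reaches 0.5
    shift <- which(cumsum(pmf) >= 0.5)[1] - 1
    idx <- seq_len(n_days) + max_delay - shift
    mapped[idx] <- as.integer(counts)
  } else {
    for (i in seq_len(n_days)) {
      # Independent binomial draw for each delay of 0..max_delay days
      draws <- rbinom(length(pmf), size = counts[i], prob = pmf)
      idx <- i + max_delay - (0:max_delay)
      mapped[idx] <- mapped[idx] + draws
    }
  }
  mapped
}

pmf <- dpois(0:10, lambda = 2)       # assumed discrete delay distribution
counts <- c(0L, 5L, 20L, 50L, 100L)  # made-up daily counts

set.seed(42)
sampled <- map_backwards_sketch(counts, pmf, type = "sample")
shifted <- map_backwards_sketch(counts, pmf, type = "median")

sum(counts)   # 175
sum(shifted)  # 175: a median shift moves counts without changing the total
sum(sampled)  # close to, but not necessarily equal to, 175
```

In this sketch the median shift preserves the total exactly, while the sampled version preserves it only on average because each delay bin is drawn independently.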