Skip to contents

Installing and loading the package

The package can either be installed from CRAN, from our r-universe repository, or from GitHub. See the README for details. Once installed load the package using the following,

Worldwide data

Accessing national data

Both the World Health Organisation (WHO) and European Centre for Disease Control (ECDC) provide worldwide national data. Access national level data for any country using:

This returns daily new and cumulative (total) cases, and where available, deaths, hospitalisations, and tests. For a complete list of variables returned, see section 5, “Data glossary” below. See the documentation (?get_national_data) for details of optional arguments.

Data is returned with no gaps in the structure of the data by country over time, and NAs fill in where data are not available.

Sub-national time-series data

Accessing sub-national data

Access sub-national level data for a specific country over time using get_regional_data(). Use get_available_datasets() to explore the currently supported sub-national datasets and select the data set of interest using the country (selects the country of interest), and level (selects the spatial scale of the data) arguments of get_regional_data.

This function returns daily new and cumulative (total) cases, and where available, deaths, hospitalisations, and tests. For a complete list of variables returned, see section 5, “Data glossary” below. See the documentation (?get_regional_data) for details of optional arguments.

As for national level data any gaps in reported data are filled with NAs.

For example, data for Lithuania Level 1 regions over time can be accessed using:

get_regional_data(country = "lithuania")

This data then has the following format:

date county iso_3166_2 cases_new cases_total deaths_new deaths_total recovered_new recovered_total hosp_new hosp_total tested_new tested_total
2022-06-19 Tauragės apskritis LT-TA 1 23325 0 380 0 17792 NA NA 7 220108
2022-06-19 Telšių apskritis LT-TE 0 37515 0 481 0 29865 NA NA 2 406951
2022-06-19 Unknown NA 1 1854 0 16 0 641 NA NA 0 0
2022-06-19 Utenos apskritis LT-UT 1 37865 0 288 0 29063 NA NA 4 328907
2022-06-19 Vilniaus apskritis LT-VL 10 370071 1 2244 7 270287 NA NA 40 3520780

Alternatively, the same data can be accessed using the underlying class as follows (the Lithuania object now contains data at each processing step and the methods used at each step),

lithuania <- Lithuania$new(get = TRUE)
lithuania$return()

Level 1 and Level 2 regions

All countries included in the package (see below,“Coverage”) have data for regions at the admin-1 level, the largest administrative unit of the country (e.g. state in the USA). Some countries also have data for smaller areas at the admin-2 level (e.g. county in the USA).

Data for Level 2 units can be returned by using the level = "2" argument. The dataset will still show the corresponding Level 1 region.

An example of a country with Level 2 units is Lithuania, where Level 2 units are Lithuanian municipalities:

get_regional_data(country = "lithuania", level = "2")

This data again has the following format:

date county iso_3166_2 municipality iso_3166_2_municipality cases_new cases_total deaths_new deaths_total recovered_new recovered_total hosp_new hosp_total tested_new tested_total
2022-06-19 Vilniaus apskritis LT-VL Švenčionių r. sav. LT-49 0 7189 0 61 0 5361 NA NA 0 63041
2022-06-19 Vilniaus apskritis LT-VL Trakų r. sav. LT-52 0 11719 0 95 2 8342 NA NA 1 89880
2022-06-19 Vilniaus apskritis LT-VL Ukmergės r. sav. LT-53 0 12578 0 169 0 8718 NA NA 4 104655
2022-06-19 Vilniaus apskritis LT-VL Vilniaus m. sav. LT-57 10 288436 0 1434 6 211964 NA NA 32 2745763
2022-06-19 Vilniaus apskritis LT-VL Vilniaus r. sav. LT-58 0 28659 0 259 0 21255 NA NA 3 328978

Totals

For totalled data up to the most recent date available, use the totals argument.

get_regional_data("lithuania", totals = TRUE)

This data now has no date variable and reflects the latest total:

county iso_3166_2 cases_total deaths_total recovered_total hosp_total tested_total
Marijampolės apskritis LT-MR 43048 485 32712 0 372407
Utenos apskritis LT-UT 37865 288 29100 0 328907
Telšių apskritis LT-TE 37515 481 29895 0 406951
Tauragės apskritis LT-TA 23325 380 17814 0 220108
Unknown NA 1854 16 642 0 0

Data glossary

Subnational data

The data columns that will be returned by get_regional_data() are listed below.

To standardise across countries and regions, the columns returned for each country will always be the same. If the corresponding data was missing from the original source then that data field is filled with NA values (or 0 if accessing totals data).

Note that Date is not included if the totals argument is set to TRUE. Level 2 region/level 2 region code are not included if the level = "1".

  • date: the date that the counts were reported (YYYY-MM-DD).

  • level_1_region: the level 1 region name. This column will be named differently for different countries (e.g. state, province).

  • level_1_region_code: a standard code for the level 1 region. The column name reflects the specific administrative code used. Typically data returns the iso_3166_2 standard, although where not available the column will be named differently to reflect its source.

  • level_2_region: the level 2 region name. This column will be named differently for different countries (e.g. city, county).

  • level_2_region_code: a standard code for the level 2 region. The column will be named differently for different countries (e.g. fips in the USA).

  • cases_new: new reported cases for that day.

  • cases_total: total reported cases up to and including that day.

  • deaths_new: new reported deaths for that day.

  • deaths_total: total reported deaths up to and including that day.

  • recovered_new: new reported recoveries for that day.

  • recovered_total: total reported recoveries up to and including that day.

  • hosp_new: new reported hospitalisations for that day.

  • hosp_total: total reported hospitalisations up to and including that day (note this is cumulative total of new reported, not total currently in hospital).

  • tested_new: tests for that day.

  • tested_total: total tests completed up to and including that day.

National data

In addition to the above, the following columns are included when using get_national_data().

  • un_region: country geographical region defined by the United Nations.

  • who_region: only included when source = "WHO". Country geographical region defined by WHO.

  • population_2019: only included when source = "ECDC". Total country population estimate in 2019.