Legislative statistics often require a minimum data coverage. This function takes short-term data, pads it with missing values to create a complete time series, and calculates a data capture percentage per year, pollutant and site. It may be prudent to run this function after validate_mod_obs_pairs(). pad_data(), which performs the data padding without the data coverage validation step, is also exported.
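The padding and coverage steps described above can be sketched in base R. This is only an illustration of the idea, not the {mqor} implementation: the column names (site, pollutant, date, value) are hypothetical stand-ins for whatever dict defines, and the sketch pads between the first and last timestamp, whereas the package may pad over a different range (for example, whole calendar years).

```r
# Hypothetical hourly observations with gaps (hours 3, 4, 7 and 8 are missing).
# Column names are assumptions; in {mqor} they are defined by `dict`.
start <- as.POSIXct("2023-06-01 00:00", tz = "UTC")
obs <- data.frame(
  site = "A",
  pollutant = "no2",
  date = start + 3600 * c(0, 1, 2, 5, 6, 9),
  value = c(10, 12, 11, 15, 14, 13)
)

# Pad: build the complete hourly sequence, then left-join the observations
# onto it so that missing hours appear as rows with NA values.
full <- data.frame(
  site = "A",
  pollutant = "no2",
  date = seq(min(obs$date), max(obs$date), by = "hour")
)
padded <- merge(full, obs, by = c("site", "pollutant", "date"), all.x = TRUE)

# Coverage: the proportion of non-missing values per site, pollutant and year.
padded$year <- format(padded$date, "%Y", tz = "UTC")
coverage <- aggregate(
  value ~ site + pollutant + year,
  data = padded,
  FUN = function(x) mean(!is.na(x)),
  na.action = na.pass  # keep NA rows so they count towards the denominator
)
names(coverage)[names(coverage) == "value"] <- "data_coverage"
```

Here 6 of the 10 padded hours carry a value, so data_coverage is 0.6 for the single site-pollutant-year combination, which would fall below the default min_coverage of 0.75.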

Usage

validate_coverage(
  data,
  resolution = c("guess", "hourly", "daily"),
  min_coverage = 0.75,
  mode = c("fix", "error", "warn"),
  dict = mqor::mqo_dict()
)

pad_data(
  data,
  resolution = c("guess", "hourly", "daily"),
  dict = mqor::mqo_dict()
)

Arguments

data

A short-term data.frame with columns defined by dict.

resolution

One of "guess", "hourly" or "daily". If "guess", the function will attempt to guess the resolution of the data. Note that only hourly and daily data are supported.
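How the resolution is guessed is not documented here. One plausible heuristic, shown purely as an assumption and not as the method {mqor} uses, is to look at the most common spacing between consecutive timestamps. The helper name guess_resolution_sketch() is hypothetical.

```r
# Hypothetical helper, not part of {mqor}: infer the resolution from the
# median gap (in seconds) between sorted, unique timestamps.
guess_resolution_sketch <- function(dates) {
  secs <- sort(unique(as.numeric(dates)))
  step <- median(diff(secs))
  if (step == 3600) {
    "hourly"
  } else if (step == 86400) {
    "daily"
  } else {
    stop("Only hourly and daily data are supported.", call. = FALSE)
  }
}

start <- as.POSIXct("2023-06-01 00:00", tz = "UTC")
hourly_dates <- start + 3600 * c(0, 1, 2, 5, 6, 9)  # hourly data with gaps
daily_dates <- seq(start, by = "day", length.out = 5)

res_hourly <- guess_resolution_sketch(hourly_dates)
res_daily <- guess_resolution_sketch(daily_dates)
```

Using the median rather than the minimum gap keeps the guess stable when a few timestamps are missing, although it would still fail on data that are mostly gaps.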

min_coverage

The minimum data coverage, expressed as a proportion between 0 and 1 (representing 0% and 100%). Defaults to 0.75 (75%).

mode

One of "fix" (the default), "error" or "warn". If "fix", the function returns a data frame from which site-year-pollutant combinations with low data capture have been removed. If "error" or "warn", the function instead raises an error or a warning when combinations with low data capture are detected. Regardless, a data capture column, data_coverage, will be appended to data.
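The three modes can be illustrated with a small base R sketch. The helper check_coverage_sketch() is hypothetical and is not the {mqor} implementation; it assumes a data frame that already holds a data_coverage column, and it assumes that "warn" returns the data unchanged after warning.

```r
# Hypothetical helper, not part of {mqor}: act on low-coverage rows
# according to `mode`.
check_coverage_sketch <- function(data, min_coverage = 0.75,
                                  mode = c("fix", "error", "warn")) {
  mode <- match.arg(mode)  # the first choice, "fix", is the default
  low <- data$data_coverage < min_coverage
  if (!any(low)) {
    return(data)
  }
  msg <- sprintf(
    "%d combination(s) have data coverage below %s.",
    sum(low), min_coverage
  )
  switch(mode,
    fix = data[!low, , drop = FALSE],  # drop the low-coverage rows
    error = stop(msg, call. = FALSE),
    warn = {
      warning(msg, call. = FALSE)
      data  # assumption: data are returned unchanged after the warning
    }
  )
}

# One row per site-year-pollutant combination, with made-up coverage values.
cov <- data.frame(
  site = c("A", "B"),
  year = c(2023, 2023),
  pollutant = c("no2", "no2"),
  data_coverage = c(0.9, 0.6)
)

fixed <- check_coverage_sketch(cov, mode = "fix")
```

With these values, "fix" keeps only site A, "warn" returns both rows with a warning, and "error" stops execution.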

dict

A data dictionary that specifies which columns of data {mqor} should use. See mqo_dict() for more information.

Author

Jack Davison